Year One of AI Applications: Just Say “Yes” and Ignore the Risks? A Full Logbook of Software Development Is Made Public

Peking University has developed the open-source project “Narwhal AI Code Risks”, which systematically catalogs the risks of AI-assisted coding.

The risks of AI-written code hide in code that looks correct, and they can lead to data leaks or asset losses. The open-source project Narwhal AI Code Risks compiles real cases, early warning signals, and typical risk paths to help developers spot potential hazards early and avoid repeating them.

In 2026, code is being generated ever faster but deployed with ever less review.

More and more often, users simply type their requirements into a chat box. The AI reads the context, completes the function, pulls in dependencies, adjusts the configuration, and even generates tests on the fly.

By the time the user looks up, a piece of code is already sitting in the repository, waiting to be merged.

Users have developed a new habit: let the AI write the code, run it first, and only look for what needs changing if problems appear.

However, in the software world, the most dangerous things are often seemingly ordinary code: correct syntax, legal interfaces, passing tests, and perfect comments.

Yet such code can still introduce non-existent package names, grant overly broad permissions, or expose databases. It can even allow an Agent that is able to call system tools directly to carry sensitive data out of an internal system under prompt injection.

What's truly dangerous is not the red light of an error. It's when all risk meters show normal.

In the past, the risks of AI-written code were scattered everywhere: a case buried in a security blog, a clue recorded in an Issue. When the next team ran into a similar problem, it had to piece together the source of the risk from scratch and spend considerable time and effort on large-scale empirical measurements of code.

The Narwhal AI Code Risks project, recently open-sourced by Narwhal-Lab at Peking University, gathers these fragments and sorts them into three types for researchers to consult: real events, early signals, and typical risk paths.

Project link: https://github.com/Narwhal-Lab/Narwhal-aicode-risks

When all 28 checks pass, the system still goes off course

The first clue was a merged Pull Request. The PR signature section clearly showed Claude Opus 4.6, Copilot, and four human developers. All 28 checks passed: no one noticed the problem.

Then, within a few minutes, liquidation bots seized collateral worth $1,778,044.83.

In the configuration file, the price of cbETH was set to its conversion ratio against ETH, which made it roughly $1.12 instead of the actual price of close to $2,200.

A semantic error in the price passed through the development, review, and merging processes and finally resulted in real losses in the financial system. This is the most glaring aspect of the Moonwell cbETH oracle configuration incident.
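A semantic error of this kind is invisible to syntax checks but can be caught by a plausibility guard that compares a configured price against an independent reference. The following is a minimal sketch under stated assumptions, not the protocol's actual code: the function names and the 10% tolerance are hypothetical, and only the two figures ($1.12 and $2,200) come from the incident description above.

```python
def relative_deviation(configured_usd: float, reference_usd: float) -> float:
    """Return |configured - reference| / reference."""
    if reference_usd <= 0:
        raise ValueError("reference price must be positive")
    return abs(configured_usd - reference_usd) / reference_usd


def is_price_plausible(
    configured_usd: float,
    reference_usd: float,
    max_deviation: float = 0.10,  # hypothetical tolerance, chosen for illustration
) -> bool:
    """True if the configured price is within max_deviation of the reference."""
    return relative_deviation(configured_usd, reference_usd) <= max_deviation


# Figures from the incident description: a conversion ratio treated as a USD price.
print(is_price_plausible(1.12, 2200.0))  # False
```

A guard like this only helps if the reference price comes from a source that does not share the misconfigured value. Otherwise both numbers are wrong together and the check still shows green.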

The code contained no syntax errors, and none of the human developers stopped the abnormal change in time. On the contrary, everything looked complete and smooth, just like a normal engineering delivery.

It is precisely this apparent normality, with the danger hidden underneath, that makes the incident a textbook security case.

The risk of AI Coding is that it doesn't always manifest as an error.

Often, it sneaks into the engineering process under the guise of the correct answer. The code can run, the checks can pass, and the PR can be merged, but the business semantics have deviated from the real world.

In low-risk projects, this semantic deviation might just result in rework; but in sensitive scenarios such as finance and enterprise data systems, it can directly lead to data leakage, exposure of permissions, and asset losses.

When AI takes part in writing code, adjusting configurations, conducting reviews, and even co-signing PRs, can we confidently trace how each deviation arose?

The green light doesn't reach all corners

In the early days, AI-assisted code writing mostly focused on local completion. If there were syntax errors, the compiler would report an error, unit tests would fail, and the CI process would reject it.

Today, AI Coding has advanced further, but the supervision has not kept up.

It can read files, adjust configurations, install dependencies, generate infrastructure scripts, and plan tasks autonomously through Agents.

AI is no longer just sitting on the side passing tools; it has started to enter a longer chain of software engineering.

The originally clear boundaries in software engineering have been reconnected by AI Agents into a longer and more difficult-to-trace path.

Scattered records need a public logbook

Security incidents rarely arrive with a complete conclusion. Some have sufficient evidence and can be cataloged as real cases. Some are still at the stage of community screenshots, researcher discussions, or preliminary disclosures and are only suitable for continued observation. Others are not tied to a single real event but follow a clear pattern and are suitable for analysis in advance.

Narwhal AI Code Risks divides the materials into three layers: `cases/`, `inferred/`, and `scenarios/`.

`cases/` records real events with public sources and evidence chains; `inferred/` stores early signals that have not been fully confirmed but are worth continuous tracking; `scenarios/` organizes typical scenarios that are not currently tied to a single event but have clear risk paths.
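The three-layer split can be read as a simple decision rule on how well a record is evidenced. The sketch below is an illustration only: the `Record` fields and the `layer_for` function are hypothetical names, and the repository's actual inclusion criteria may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    title: str
    tied_to_real_event: bool         # hypothetical field: refers to one concrete incident
    has_public_evidence_chain: bool  # hypothetical field: public sources back it up


def layer_for(record: Record) -> str:
    """Map a record to one of the three layers described in the text."""
    if record.tied_to_real_event and record.has_public_evidence_chain:
        return "cases/"      # real events with public sources and evidence chains
    if record.tied_to_real_event:
        return "inferred/"   # early signals, not yet fully confirmed
    return "scenarios/"      # clear risk path, not tied to a single event


print(layer_for(Record("confirmed incident", True, True)))
print(layer_for(Record("unverified screenshot", True, False)))
print(layer_for(Record("generic risk pattern", False, False)))
```

Keeping the rule this explicit makes it easy to move a record from `inferred/` to `cases/` once public evidence appears.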

Without such a public record, the risks of AI Coding can easily become short-term memories on the Internet.

Today people remember a certain package name, tomorrow they discuss a data exposure incident, and a few months later both are buried under the next tool craze. When a similar problem reappears, the team still charges blindly into unknown risk territory.

What Narwhal AI Code Risks does is pin down these scattered risk fragments so that those who come later can work from the same page.

Follow the seven-category index to see the origin of risks

The problems that AI-written code brings are not confined to the code itself. They also lie in dependencies, permissions, the tool calls made by Agents, and above all in the way humans trust AI output.

Narwhal AI Code Risks currently divides the risks into 7 categories: supply chain, code-level vulnerabilities, cloud and infrastructure configuration, Agent risks, vertical domain risks, intellectual property and compliance risks, and human factors.

In supply chain risks, AI may recommend non-existent dependencies. In code-level vulnerabilities, AI may reintroduce path traversal, missing input validation, and authentication flaws into business code. In cloud and infrastructure configuration, AI may grant overly broad permissions, make storage buckets public, or expose ports just to get the code running. Agent risks are more complex still: Agents not only generate text but also take actions. In all of these cases, AI-generated artifacts pose hidden dangers to real systems.
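For the supply chain category, one simple guard is to compare the package names in a requirements list against a set of names the team has already vetted, and to flag everything else for human review before installation. The sketch below assumes a pip-style requirements format. The function names and the package name `fastjson-utils` are hypothetical and used only for illustration, and the parser handles only simple `name<specifier>` lines.

```python
import re
from typing import List, Optional, Set


def normalize(name: str) -> str:
    """Normalize a package name: runs of '-', '_' and '.' become '-', lowercased."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(line: str) -> Optional[str]:
    """Extract the package name from one requirements line, or None if absent."""
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line:
        return None
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return normalize(match.group(0)) if match else None


def unknown_dependencies(requirements: List[str], allowlist: Set[str]) -> List[str]:
    """Return normalized names that are not in the vetted allowlist."""
    known = {normalize(name) for name in allowlist}
    names = (requirement_name(line) for line in requirements)
    return [name for name in names if name is not None and name not in known]


requirements = [
    "requests==2.31.0",
    "fastjson-utils>=1.0  # hypothetical name suggested by an assistant",
    "# a comment line",
    "",
]
print(unknown_dependencies(requirements, {"requests", "numpy"}))  # ['fastjson-utils']
```

An allowlist check does not prove that a flagged package is malicious or missing. It only forces a person to look before an unfamiliar name is installed.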

The AI engine is starting, and the logbook is just being opened

As AI gradually enters the real world, the prevention and control of related risks should not be limited to post-incident reviews or scattered discussions.

What matters most about Narwhal AI Code Risks is that it turns risk cases into reusable knowledge.

Developers can use it to identify similar problems; security researchers can use it as a sample library; tool vendors can extract detection rules and evaluation benchmarks from it; and the open-source community can continue to add new cases, new evidence, and new risk types.

The AI engine is roaring, and every deviation should leave a coordinate. Risks never disappear because they are ignored, but experience can be recorded and passed on. What is truly valuable is not just discovering a single vulnerability but making sure that those who come later do not fall into the same trap.

What Narwhal AI Code Risks is doing is leaving an open-source logbook for the software world in the era of AI applications.

This article is from the WeChat official account “New Intelligence Yuan”, author: LRST, published by 36Kr with permission.