HomeArticle

Our understanding of AI is far from sufficient, so transparency is crucial.

腾讯研究院2025-11-06 17:38
The value of transparency goes far beyond this.

Introduction: When We Can't See AI Clearly, We Can't Truly Govern It

We are entering an era where AI is ubiquitous yet almost imperceptible. It quietly participates in our social interactions, content consumption, services, and consumption, and even influences our emotions, preferences, and behaviors. But do we really know where it is, what it does, and who controls it? When we can't see clearly, we can't trust; without trust, governance is out of the question.

The discussion about AI transparency is pointing to a most fundamental yet crucial question - in the AI era, what does the ability to "see" mean? And how can we truly "see" AI?

This article is the first in the Tencent Research Institute AI & Society Overseas Experts Dialogue series. The interviewee is Cao Jianfeng (Senior Researcher at the Tencent Research Institute).

Why Is It So Important to "See" AI?

When we receive information and interact on the Internet, are we facing real humans or "realistic" AI? As generative AI penetrates more widely into various scenarios such as social media, creation, and services, risks such as false information, identity fraud, and deepfake have emerged. As a result, "AI Activity Labeling" has gradually become a global consensus, and the obligation of AI transparency has been written into laws by multiple regulatory agencies such as China and the EU. Service providers are required to clearly indicate which content is generated by AI and which interactions come from AI systems, to help users identify forged information, enhance vigilance, and reduce the risk of misbelief and deception. This is the most direct and primary function of current AI transparency policies.

However, these are just the tip of the iceberg of AI risks, and the value of transparency goes far beyond that. AI systems are evolving from tools that execute commands to intelligent agents (AI Agents) with a certain degree of autonomy, capable of browsing the web, conducting transactions, writing code, and controlling devices. This new ability blurs the line between AI and the real world. However, government regulators, industry practitioners, academia, and the public still know very little about the operating logic, risk chains, and social impacts of AI, and are even in a state of "cognitive vacuum."

For example, a controversial issue is the social impact of "AI persuasion." When AI can accurately imitate human language, understand psychological needs, and even influence emotions, can it quietly shape our viewpoints, value judgments, and even behavior patterns in daily life scenarios such as social media and short - video platforms? How deep is its influence? How wide is its spread? How long does it last? Currently, there is a lack of evidence to support answers to these questions. The bigger problem is that we don't even know where to find the answers.

To govern AI, we must first see it clearly. To truly answer the above questions, imagination, speculation, and pure theoretical deduction are far from enough. We must collect a large amount of "known evidence" from the real world about how AI operates and how it affects people. And this is where the long - term value of the transparency system lies: providing a real observation perspective and first - hand data for researching, evaluating, and dealing with AI risks. For example, regarding the issue of "AI persuasion," to judge how AI affects human cognition, emotions, behaviors, and the broader social order, the premise is to accurately distinguish which interactions come from AI and which come from real humans. Here, the AI labeling system, as a transparency mechanism, not only helps individual users enhance their recognition ability but also provides technical support for platforms to track, analyze, and manage AI activities, and provides researchers with the practical possibility of collecting evidence, evaluating risks, and formulating more scientific policies.

Furthermore, transparency also plays an important role in alleviating anxiety and building trust. With the rapid development of technology, our understanding of its operating logic and potential impacts lags far behind. The cognitive lag has brought widespread governance anxiety: we don't know which risks are the most important and urgent, and we can't confirm whether we have ignored deeper hidden dangers. To some extent, this is also hindering the popularization and application of AI in society.

In the stage where risks have not been fully clarified and AI capabilities are still evolving rapidly, the transparency mechanism can alleviate the uneasiness of all parties, enabling us to return from risk anxiety to governance rationality and use "known evidence" to alleviate "unknown fears." It's not about blind trust but making rational judgments on the basis of "seeing clearly." Not limited to AI labeling, transparency mechanisms such as model specifications and interpretability technologies are all trying to balance the information gap in the AI era: alleviating our cognitive "black box" of AI technology and the information asymmetry between the government, industry, academia, and the public. The more we know about AI, the more we can talk about using it with confidence and even innovating boldly.

In today's world where we know very little about the boundaries of AI capabilities, risk characteristics, and social impacts, the ability to "see" is an indispensable force. And the transparency mechanism endows us with this ability: to see how AI operates, how it interacts with humans, and how it produces impacts. It can be said that in the current context of continuous evolution and expansion of AI technology, transparency is becoming the key to understanding, trusting, and governing AI.

How to Make AI Labeling Effective?

In the current AI governance landscape, "AI labeling" is one of the earliest and fastest - advancing transparency mechanisms. China's "Measures for the Labeling of AI - Generated Synthetic Content" and national mandatory standards have been officially implemented, achieving phased results. In the EU, Article 50 of the "Artificial Intelligence Act" (EU AIA) also specifically stipulates the labeling obligations of AI system providers. As the implementation of Article 50 in the EU accelerates, industry discussions have shifted from "whether labeling is needed" to "how to label effectively." Discussions about "what to label," "who should embed the watermark label," and "who should detect the watermark label" can also provide some references for further refining implementation standards and supplementing system details in practice.

First, should we only label content or also label "behaviors"? With the improvement of AI autonomy, intelligent agents can not only generate content such as text, images, audio, and video but also actively "do things": browse the web, send emails, place orders for shopping, automatically like, comment, and forward, etc. Such operations go beyond the scope of traditional content generation and belong to the "behavior" itself. However, most existing legal provisions focus on the labeling of content and do not clearly cover the autonomous behaviors of AI, leaving a certain "blind spot." For example, if a large number of AI accounts like and forward a certain piece of information simultaneously, it is easy to create "false popularity," manipulate algorithmic recommendations, disrupt the information ecosystem, and mislead public opinion and public judgment. How to include such behaviors in the labeling scope deserves further attention. Therefore, although current AI labeling focuses more on AI content, with the continuous innovation and wide application of intelligent agents, the transparency and labeling of "AI activities" will become even more important.

Second, who should embed the watermark label, and how should the hierarchical obligations be set? Not all AI service providers have the same capabilities. Upstream developers (such as OpenAI, DeepSeek, Anthropic, etc.) have the ability to control at the model level and can embed watermark mechanisms; while downstream application developers often only fine - tune or call existing models and lack sufficient resources and permissions to achieve independent embedding. If the same obligations are imposed on all entities, it may dampen the enthusiasm of small and medium - sized innovators to participate. For example, the EU is also discussing whether to set "hierarchical obligations": upstream model developers are responsible for embedding watermarks; downstream application developers are responsible for cooperating in detection and not removing or circumventing existing watermarks. In addition, different types of AI systems have differences in application scenarios and risk characteristics. Should different transparency requirements also be formulated? This is still an open question.

Third, who should verify the watermark label, and to whom should the detection tool be authorized? Embedding a watermark is one thing, and being able to verify it is another. If the watermark is only visible to the generator and not enough other entities have the ability to detect or verify it, the watermark becomes "self - justifying" and loses the value of the transparency mechanism. However, the problem is that once the watermark detection tool is widely made public, attackers may also have the opportunity to bypass or tamper with the watermark label, thereby weakening its security. Therefore, a balance needs to be struck between transparency and robustness. At present, a possible compromise is to authorize the watermark detection tool to key nodes with platform responsibilities, such as social media platforms and news distribution platforms. These platforms can identify the content source and complete label verification during the user interaction process, while keeping the technical details of the detection mechanism confidential to prevent abuse and reverse cracking.

Currently, the EU is launching the compilation of practical guidelines for Article 50 of the EU AIA, which is expected to be completed by next May. Its status is similar to that of the "General - Purpose AI Practice Guidelines," but the focus has shifted from "security" to "transparency" to specifically address the above issues.

How to Set and Follow Rules for AI through Model Specifications?

In addition to AI labeling, another possible exploration of transparency is "Model Specifications." In simple terms, model specifications are a document written and made public by AI companies to explain their expectations of what their models "should do" and "should not do." In other words, model specifications are used to define the behavior boundaries, value criteria, and design principles of models. Taking OpenAI as an example, one of the criteria set in its model specifications is to pursue truth together with users. This means that the model should remain neutral when answering questions and not actively guide the user's stance. The rapidly developing intelligent agents should also have specifications for "what can be executed" and "what cannot be executed," clearly defining their behavior boundaries in terms of interaction objects, operation permissions, etc. For example, can an intelligent agent execute transactions on a financial platform on behalf of the user?

The significance of model specifications lies not only in being an "operation manual" within the technology but also in being a transparency mechanism open to the public, allowing users to know what an AI system is designed to be and how it will interact with people. This guarantees users' right to know and the right to choose. For example, a parent wants their child to use an AI assistant but is worried that it may generate inappropriate content. If the model specifications clearly state this, the parent can use it more safely or choose another AI assistant. On the contrary, if the specifications of an AI model are vague or not made public, users can only guess the model's behavior. At the same time, model specifications are also an important basis for feedback from regulatory agencies and the public. A media once exposed an internal policy document of Meta about its chatbot, showing that the example allowed the AI chatbot to have "romantic" interactions with minors, which triggered an uproar in public opinion and attracted regulatory attention. After the exposure, Meta quickly revised the rules. Model specifications are equivalent to a public behavior commitment by enterprises, providing a handle for external supervision and correction.

However, the biggest problem with model specifications is that: enterprises can easily make commitments, but it is difficult for the public to verify whether these commitments are fulfilled. Even if the specifications are written comprehensively, if there is a lack of an execution mechanism, they may become "empty promises." Therefore, "Model Specifications Adherence" has become the core of the discussion in the model specification transparency mechanism.

Currently, judging the compliance of model specifications mainly relies on three types of information: user test feedback, system cards or model cards, and accident report disclosures. However, these methods still have deficiencies. For example, system cards do not cover all model behaviors; it is difficult to judge whether a single accident is an accidental event or a real system defect. Therefore, some people believe that enterprises should not only disclose the content of model specifications but also make public the technology, process, evaluation results of the compliance degree, accidents, or violations of model specifications. For example, xAI embeds model specifications into system prompts, Anthropic uses the method of "Constitutional AI," and OpenAI promotes the method of "Deliberative Alignment." In addition, this information should not only be disclosed before deployment but also continuously tracked and updated after deployment. In other words, not only should "setting rules" be transparent, but "following rules" should also be transparent.

However, the mechanism of model specification compliance is still in the exploration stage, and there is no unified standard yet. There are still many open questions to be discussed around this mechanism.

First, should model specification compliance be mandatory? Currently, the enterprises that publicly release model specifications are mainly concentrated in a few leading companies such as OpenAI, Anthropic, and xAI. If an enterprise has not even formulated model specifications, there is naturally no question of model specification compliance. However, if "model specifications" and "model specification compliance" are made legal obligations too early, it may suppress enterprises' exploration and innovation in governance mechanisms. Many advanced governance methods are still in the experimental stage. If they are regulated too early, enterprises may abandon exploration due to compliance concerns. At the same time, there are also a series of implementation difficulties at the regulatory level: who should verify? How to verify? How to set different verification standards for different AI systems?

Second, which information on "model specification compliance" should be made public? The requirement for transparency does not mean "complete transparency." Details such as model alignment technology and training data processing involved in the specification compliance process may fall into the category of corporate trade secrets. Which key links, data indicators, and technical methods can be disclosed? Which should be protected? In addition, it is very difficult to verify the authenticity and interpretability of the model specification compliance process. Even if an enterprise releases relevant documents, they may be difficult to understand. For example, is there a difference between 95% compliance and 99% compliance, and what is the difference? Therefore, a balance needs to be found between government regulatory requirements, the public's need to know, and the legitimate business interests of enterprises.

Third, if a model fails to fully comply with the specifications, should the enterprise be held responsible? Although the model is a public behavior commitment, at the current stage, technology development is not mature, and AI models still have a high degree of uncertainty and unpredictability. Even if developers have done their best, the model may accidentally violate the specifications. If an enterprise has to be held responsible once the model "crosses the line," it is too harsh for technological development. Generally speaking, model specifications mainly play a transparency function of making the society "see clearly" and are not directly linked to responsibility. A more cautious attitude should be adopted, and the focus should be on whether the enterprise complies with model specifications, whether it discloses accident situations, and whether it corrects problems in a timely manner.

Conclusion: Establish a Verifiable, Feedback - Based, and Improvable AI Governance Path through Transparency

Precisely because our understanding of AI is still far from sufficient, transparency is particularly crucial. Transparency allows us to better "see" the real operation of AI, thus bridging the gap between technological development and social understanding. It not only helps users identify interaction objects and avoid risks but more importantly, it provides the most basic cognitive guarantee for society in the face of technological uncertainty and is the basic premise for governance research and policy - making. Whether it is AI labeling, model specifications, or other broader transparency mechanisms and methods, in essence, they are all trying to establish a verifiable, feedback - based, and improvable AI governance path.

Only when we truly see what AI does, how it does it, and why it does it, can we make rational judgments about what it should do. Furthermore, making AI "visible" is not only the task of regulators but also the starting point for building trust between society and technology. In this sense, transparency is the core of the AI social contract. When we can see the trajectory of AI, understand its logic, and verify its commitments, AI may become a trustworthy partner for humans rather than an unpredictable force .

Editor's Note: This article is compiled based on the sharing content of Alan Chan, a researcher at the Center for the Governance of AI, in the Tencent Research Institute AI & Society Overseas Experts Face - to - Face Dialogue series. However, Alan Chan did not participate in the writing of this article, and the views in the article only represent the personal stance of the compiler and do not represent the views of Alan Chan or his affiliated institution.

This article is from the WeChat official account “Tencent Research Institute” (ID: cyberlawrc), author: Tencent Research Institute, published by 36Kr with authorization.