Behind Haizhi's Participation in Drafting Two National Standards for Graph Intelligence: Laying a Solid Foundation for the Agent Industry via Engineering Practice

How can the performance of graph computing be measured authentically? How can graph neural networks be compressed without losing accuracy? The industry pain points behind the two national standards and the practices of Baize.

Recently, the Cyberspace Administration of China, the National Development and Reform Commission, and the Ministry of Industry and Information Technology jointly issued the "Implementation Opinions on the Standardized Application and Innovative Development of Agents" (hereinafter the "Implementation Opinions"). For the first time at the national level, the document delineates three core directions for agents: "behavior control, inherent security, and engineering governance". This milestone policy marks a shift in China's AI governance from the single dimension of "content compliance of large models" to a new stage of full-lifecycle security governance for agents.

Just as the policy framework is being clarified, standardization of the underlying technologies for agents is accelerating as well: two national standards, "Intelligent Computing - Graph Computing Performance Testing Methods" and "Information Technology - Neural Network Representation and Model Compression - Part 3: Graph Neural Networks", have been successively implemented. Both were compiled jointly by parties from industry, academia, and research. They address the rigid requirements that agent security governance places on the underlying data infrastructure and help align technology applications with compliance requirements.

As a core drafting unit of the two standards, Haizhi has brought its front-line engineering experience in graph intelligence into the standard discussions, becoming a link between policy, standards, and industrial implementation. This article starts from the core issues the two standards aim to solve and shares our technical thinking and practical approaches during the implementation of the national standards for graph intelligence.

I. Urgent Need to Break Through Industry Pain Points: The Dilemma of Graph Intelligence Implementation without Standards

Before the official implementation of the two standards, graph computing and graph neural networks had long been in a state of "self-definition and self-interpretation". Products on the market appear to offer complete capabilities and diverse technologies, but customers face a common problem during procurement and implementation: in real-world deployment, the products often expose incomparable performance indicators, non-reproducible testing processes, and compression effects that are hard to evaluate, which in turn complicates customer selection, business migration, and project acceptance.

The cause is not that the systems' functions fail, but that there is no unified benchmark for metric definitions, model formats, and testing specifications, which leads directly to misjudged selections and obstacles to business launch.

Taking financial risk control as an example, when a state-owned bank was selecting a graph computing product, it encountered "inflated performance claims" from multiple manufacturers. All of them claimed to support billions of nodes and edges and millisecond-level query responses, but each had its own testing logic:

Some tested only the best results obtained after cache pre-heating. Some published only the average latency while hiding key indicators such as P95 and P99 (Editor's note: these describe the distribution of system response times; 95% and 99% of requests, respectively, complete in a time less than or equal to the stated value). Some throughput figures excluded core steps such as data import and index construction.
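To make the editor's note concrete, here is a minimal sketch of how an average can hide tail latency. The latency values are invented for illustration, and the nearest-rank percentile used here is only one common definition; the article does not say which definition the standard prescribes.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds: 97 fast requests, 3 very slow ones.
latencies_ms = [10] * 97 + [2000] * 3

mean_ms = sum(latencies_ms) / len(latencies_ms)
p95_ms = percentile(latencies_ms, 95)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.1f} ms, P95={p95_ms} ms, P99={p99_ms} ms")
```

In this made-up sample the mean stays under 70 ms and even P95 looks healthy, while P99 sits at two seconds, which is why a report that lists only the average says little about the worst requests.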

Once the bank deployed the product in a real anti-fraud scenario and faced multi-hop relationship queries, high-concurrency access, and super-node expansion across tens of millions of accounts, the system frequently ran into query timeouts, missing node configurations, and broken relationships. The so-called "millisecond-level response" was far removed from actual production behavior, which not only raised the cost of business launch but also introduced hidden risks into risk control.

The field of graph neural networks also suffers from a lack of specifications. The training and inference quality of graph neural network models depends strongly on how well the upstream graph data is modeled. Platforms in the industry lack consistent conventions for node types, attribute levels, relationship directions, and feature fields, which leads directly to inconsistent and fragmented input semantics for graph neural networks.

For example, when enterprise addresses are stored only as raw strings without structured hierarchical splitting, subsequent feature aggregation and correlation analysis by city, region, and administrative division are restricted. When core business relationships such as "natural person controls enterprise" suffer from reversed directions, inconsistent definitions, and chaotic coding, graph neural networks aggregate neighborhood information along the wrong path during message passing, cannot accurately learn the semantics of the real business structure, and ultimately produce distorted training results.
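The effect of a reversed relationship can be seen in a toy example. The sketch below is a deliberately simplified single round of mean aggregation over incoming edges, not a real graph neural network layer; the node names and the scalar "risk" feature are hypothetical.

```python
def aggregate_in_neighbors(edges, features):
    """One round of mean aggregation: each node averages the features
    of the source nodes of its incoming edges (0.0 if it has none)."""
    incoming = {node: [] for node in features}
    for src, dst in edges:
        incoming[dst].append(features[src])
    return {
        node: (sum(values) / len(values) if values else 0.0)
        for node, values in incoming.items()
    }

# Hypothetical toy graph: one risky person controls two companies.
features = {"person_A": 1.0, "company_X": 0.0, "company_Y": 0.0}

# Agreed direction: controller -> controlled enterprise.
correct_edges = [("person_A", "company_X"), ("person_A", "company_Y")]
# The same relationships stored with the direction reversed.
reversed_edges = [(dst, src) for src, dst in correct_edges]

agg_correct = aggregate_in_neighbors(correct_edges, features)
agg_reversed = aggregate_in_neighbors(reversed_edges, features)

print(agg_correct)   # the companies receive the controller's signal
print(agg_reversed)  # the signal never reaches the companies
```

With the agreed direction, the controller's signal flows to both companies; with the direction reversed, the companies aggregate nothing and the person instead averages the companies' features, so the model sees a different business structure from the same underlying facts.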

More troublesome still, model compression also lacks unified guidance. Some manufacturers adopt aggressive pruning and quantization strategies in pursuit of inference speed. Although this reduces computing power consumption, it weakens the ability to identify key relationships and lowers the recall rate for financial risks. The root cause of these problems is essentially industry infighting caused by the lack of unified standards.

It is worth noting that the two graph intelligence standards define the "factual framework" of agents: expressing and storing knowledge in a structured and computable way so that large models can reason rigorously from definite facts. This promotes the deep integration of agents and large language models into industrial scenarios and supports large-scale implementation and application.

The graph computing performance testing standard addresses the core issue of "stable support for the graph foundation". By ensuring the stability and reliability of multi-hop retrieval, neighborhood expansion, and relationship inference in knowledge graphs, it avoids passing incorrect context to large models because of system performance fluctuations or data loss.

The graph neural network standard provides a unified expression basis for model migration, compression evaluation, and deployment reproduction by standardizing model representation and compression information description, which helps reduce information loss during model exchange and deployment.

Together, they form a closed loop in which large models are responsible for natural language understanding and content generation, graph databases hold structured facts and relationship constraints, and graph neural networks handle prediction and generalization over graph structures.

This makes the decision-making process of agents more traceable and verifiable, strengthening the foundation for reliable operation and helping agents take root in real-world industries and reach large-scale implementation and application. It is the concrete technical path for the "inherent security" and "engineering governance" emphasized in the "Implementation Opinions".

The core value of the two national standards lies in pushing graph intelligence from "experience-driven" to "standard-driven" and in building an engineering system that is "testable, reproducible, migratable, and acceptable". This helps alleviate long-standing industry problems such as inflated performance claims, model isolation, black-box inference, and non-reusable engineering, and fills a gap in the underlying infrastructure for the large-scale implementation of agents.

II. Analysis of Technical Core: Haizhi's Technical Practice and Standard Implementation

The compilation and implementation of the two standards are the results of collaborative construction by industry, academia, and research.

As an enterprise participating in the drafting of both standards, Haizhi has a special role. We are neither a research institution oriented towards theory nor a cloud service provider oriented towards pure platforms. Instead, we are an enterprise that has long been deeply involved in complex relationship scenarios such as government affairs, finance, and urban governance.

This identity determines our way of participating in the standards: fully combining cutting-edge technology research with front-line industrial engineering applications.

1. Graph Computing Performance Testing Standard: From "Single-Point Benchmarking" to "Full-Process Engineering Evaluation"

Before the introduction of the national standard for graph computing performance testing, the biggest pain point in the industry was the lack of unified definitions for key indicators. For throughput, response latency, operational stability, and system compatibility, each manufacturer used its own incompatible statistical logic, which made customer selection and implementation difficult.

The statistical definitions of throughput varied. Some counted queries per second, while others counted edges processed per second.

Latency was reported only as an average, ignoring fluctuations in real-world scenarios such as cold start and cache invalidation.

Stability testing was limited to short-term stress tests and could not reveal hidden problems that appear during long-term system operation.
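The throughput discrepancy described above is easy to reproduce on paper. In the sketch below, the two "vendors" and all their numbers are hypothetical; the point is only that the same pair of systems ranks in opposite order depending on which unit is reported.

```python
from dataclasses import dataclass

@dataclass
class QueryRun:
    """Raw counters from one hypothetical benchmark run."""
    queries: int
    edges_traversed: int
    seconds: float

    def queries_per_second(self) -> float:
        return self.queries / self.seconds

    def edges_per_second(self) -> float:
        return self.edges_traversed / self.seconds

# Vendor A runs many cheap one-hop lookups.
vendor_a = QueryRun(queries=10_000, edges_traversed=50_000, seconds=10.0)
# Vendor B runs few but heavy multi-hop traversals.
vendor_b = QueryRun(queries=500, edges_traversed=5_000_000, seconds=10.0)

for name, run in (("A", vendor_a), ("B", vendor_b)):
    print(name, run.queries_per_second(), run.edges_per_second())
```

Vendor A "wins" on queries per second and vendor B "wins" on edges per second, so neither number is comparable unless the workload and the unit are fixed together.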

In real-time scenarios such as financial anti-fraud, the contradiction was particularly prominent. The business required the P99 latency of transaction queries to stay within a few hundred milliseconds, but some manufacturers provided only average latency data. The average appeared to meet the requirement, yet the actual tail latency could soar to several seconds, preventing the business from going live normally.

Our technical thinking is that performance testing should not only test the "optimal moment" but also the "real moment".

In the course of long-term service for complex relationship networks, we have developed an engineering testing system. Its core logic is not to produce a score for a single algorithm but to chain data generation, data import, index construction, query sets, concurrent stress testing, resource monitoring, exception recovery, and result verification into one complete process.
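As a rough illustration of what chaining these steps into one process can look like, the sketch below runs named stages in order and records the time of each one, so that steps such as data import and index construction cannot silently drop out of the report. The stage names come from the paragraph above; the `run_pipeline` helper and the placeholder stage functions are assumptions made for this example and do not describe Haizhi's actual tooling. Resource monitoring is shown as a sequential step only for simplicity; in practice it would run alongside the stress test.

```python
import time

def run_pipeline(stages):
    """Run (name, callable) stages in order. Each callable returns True on
    success. Every stage is timed, and the run stops at the first failure."""
    report = []
    for name, stage in stages:
        start = time.perf_counter()
        ok = bool(stage())
        elapsed = time.perf_counter() - start
        report.append({"stage": name, "ok": ok, "seconds": elapsed})
        if not ok:
            break
    return report

STAGE_NAMES = [
    "data generation",
    "data import",
    "index construction",
    "query set execution",
    "concurrent stress test",
    "resource monitoring",
    "exception recovery",
    "result verification",
]

# Placeholder stages that always succeed; real ones would drive the system under test.
stages = [(name, lambda: True) for name in STAGE_NAMES]

report = run_pipeline(stages)
total_seconds = sum(entry["seconds"] for entry in report)
for entry in report:
    print(f'{entry["stage"]}: ok={entry["ok"]}, {entry["seconds"]:.6f}s')
```

Because the report is built per stage, an end-to-end figure such as `total_seconds` always includes import and indexing time, and a failed verification step is visible rather than averaged away.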

This system has led us to form a judgment: The standard should not only define "how fast it can run" but also "how fast and stable it can run under what conditions".

Based on this "full-process testing" engineering practice, we supported the establishment, during the standard discussions, of an evaluation framework covering four core indicators: throughput, latency, stability, and compatibility. This framework covers not only the basic requirement of "whether it can run" but also the practical question of "whether it can be put into production".

Throughput reflects whether the system can handle large-scale nodes and edges and high-concurrency tasks.

Latency reflects whether scenarios such as online risk control, real-time recommendation, and intelligent operation and maintenance can afford to wait.

Stability reflects whether the system can operate continuously and reliably under long-term operation, hot-spot access, complex queries, and resource fluctuations.

Compatibility is related to whether the customer's existing graph data, query statements, business rules, and upstream and downstream systems can be smoothly migrated.

2. Graph Neural Network Standard: From "Model Isolation" to "Interoperable, Compressible, and Deployable"

The introduction of the graph neural network standard aims to solve three core problems: inability to interoperate models, lack of compression specifications, and lack of deployment standards.

The hard part is agreeing on how to standardize the expression of nodes, edges, features, adjacency relationships, message-passing processes, and compression information without restricting algorithm innovation.

Academically, nodes, edges, features, adjacency matrices, and message-passing functions can be abstracted very simply. In real-world business systems, however, a node may simultaneously carry multiple labels, multiple attributes, dynamic features, permission boundaries, time versions, and business primary keys. An edge may likewise carry direction, type, weight, validity period, confidence, and multiple semantics.

The standard must avoid restricting algorithm innovation while still enabling industrial systems to truly interoperate, which makes defining the boundary extremely difficult.

Our technical thinking is that standardization is not about "unifying everything" but finding an engineerable balance between "flexibility" and "interoperability".

Based on our practical experience in knowledge graph construction, we have realized that:

In node representation, it is necessary to clearly distinguish the levels of ID, type, attribute, and feature vector to avoid semantic deviations caused by different granularities.

In relationship mapping, it is necessary to standardize the directions and coding rules of core relationships such as "enterprise investment" and "natural person control" and clarify the division criteria between relationships and attributes. For example, modeling a relationship such as "transfer" that requires tracing details as an intermediate node rather than a simple edge relationship can avoid semantic loss. These are all feasible practices summarized by Haizhi in actual business.
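The two modeling points above can be sketched in a few lines. The classes, field names, and edge type labels below are hypothetical and chosen only for illustration; they are not taken from the standard's text or from Haizhi's products.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Keeps the four levels separate: ID, type, attributes, feature vector."""
    node_id: str
    node_type: str
    attributes: dict = field(default_factory=dict)
    feature_vector: list = field(default_factory=list)

@dataclass(frozen=True)
class Edge:
    """A directed, typed edge from src to dst."""
    src: str
    dst: str
    edge_type: str

def reify_transfer(transfer_id, payer, payee, amount, timestamp):
    """Model a transfer as an intermediate node with two directed edges,
    so that its details remain traceable as node attributes."""
    transfer = Node(
        node_id=transfer_id,
        node_type="Transfer",
        attributes={"amount": amount, "timestamp": timestamp},
    )
    edges = [
        Edge(payer, transfer_id, "INITIATES"),
        Edge(transfer_id, payee, "CREDITS"),
    ]
    return transfer, edges

transfer_node, transfer_edges = reify_transfer(
    "t-001", "account_A", "account_B", 5000, "2024-01-01T10:00:00"
)
print(transfer_node)
print(transfer_edges)
```

A plain `account_A -> account_B` edge would have to squeeze amount and time into edge properties or drop them; the intermediate node keeps each transfer individually addressable while the edge directions stay explicit.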

In terms of model compression, we focus on more practical issues: the parameter definitions of techniques such as pruning, quantization, and distillation are not unified, and enterprises find it hard to rationally evaluate the real trade-off between "acceleration" and "accuracy loss". The value of the standard lies in pushing manufacturers to state key information such as compression ratio, accuracy loss, and error bounds clearly, moving technology selection from a "black box" to "transparency" and helping balance computing power consumption against model accuracy.
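One way to picture such a disclosure is a small record that carries the numbers a buyer needs side by side. This is a sketch under stated assumptions: the `CompressionReport` fields and the sample figures are invented for illustration and are not the schema defined by the standard.

```python
from dataclasses import dataclass

@dataclass
class CompressionReport:
    """Hypothetical disclosure record for one compressed model."""
    method: str
    original_bytes: int
    compressed_bytes: int
    baseline_recall: float
    compressed_recall: float

    @property
    def compression_ratio(self) -> float:
        return self.original_bytes / self.compressed_bytes

    @property
    def recall_loss(self) -> float:
        return self.baseline_recall - self.compressed_recall

    def within_budget(self, max_recall_loss: float) -> bool:
        """True if the recall lost to compression stays within the budget."""
        return self.recall_loss <= max_recall_loss

# Invented figures: a 4x smaller model that loses two points of recall.
report = CompressionReport(
    method="int8 quantization",
    original_bytes=400_000_000,
    compressed_bytes=100_000_000,
    baseline_recall=0.92,
    compressed_recall=0.90,
)

print(f"ratio={report.compression_ratio:.1f}x, recall loss={report.recall_loss:.3f}")
print("acceptable at 0.03 budget:", report.within_budget(0.03))
print("acceptable at 0.01 budget:", report.within_budget(0.01))
```

With both numbers stated, the same compressed model is acceptable for a scenario that tolerates a three-point recall drop and unacceptable for one that tolerates only one point, which is the kind of decision a speed figure alone cannot support.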

Previously, some manufacturers blindly compressed models in pursuit of inference speed, resulting in a decrease in the ability to identify key relationships. After the implementation of the standard, enterprises can rationally evaluate the actual value of model optimization and avoid the black-box operation of "accelerating for the sake of acceleration", which is crucial for scenarios with extremely high accuracy requirements such as financial risk control and telecommunications anti-fraud.

In the actual business of knowledge graph construction and model deployment, we have formed a set of practical ideas regarding node representation, relationship mapping, and model compression. These ideas provide a reference perspective from the front line of the industry for the discussion of relevant specifications.

3. Our Differentiated Capability: Scenario-Based Engineering Practice

Compared with the theoretical advantages of universities and the platform advantages of cloud service providers, Haizhi's difference lies in a relatively "unglamorous" ability: making graph intelligence run stably in a real, unclean, and highly constrained production environment.

As a representative enterprise of the graph-model integration route, Haizhi has long been deeply involved in complex relationship network scenarios such as finance, government and enterprise, and urban governance, accumulating experience in real large-scale graphs, complex queries, long-term stable operation, and production acceptance. We are well aware of the real constraints of industrial implementation:

The data is unclean and the relationships are incomplete.

The queries are complex and diverse, and the business rules change dynamically.

The permission and compliance requirements are strict.

The system needs to maintain stability under high concurrency and long-term operation.

These cannot be simulated in a laboratory environment. We have always believed that the value of graph intelligence lies not in a single technological breakthrough but in the synergy of "standardization + engineering", which enables technology to be truly implemented across industries.

During the standard-setting process, what we brought was not "which clause we wanted to write" but the pitfalls we encountered, the problems we solved, and the patterns we summarized in scenarios such as complex financial network adaptation, ultra-large-scale graph computing engineering verification, and long-term stable operation and maintenance.

These experiences were ultimately transformed into the design logic of key clauses in the standard, such as the design of graph computing performance testing methods and the definition of professional terms, and the Agent Harness engineering system was fed back into the standard-setting process. Instead of copying a single solution, we provided a verifiable and reusable thinking framework for the industry.

This makes the standard more than a simple "indicator definition": it sits closer to the testing, migration, stress testing, acceptance, and continuous optimization processes of real projects and can genuinely guide implementation and application.

For example, in response to the "super-node" problem in graph computing, Haizhi transformed the practical solutions of node splitting, relationship bucketing, and hot-cold separation used in actual projects into key points of compatibility and stability testing in the standard, helping the industry avoid performance bottlenecks in large-scale graph queries.
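The article does not describe how these techniques are implemented, so the following is only a generic sketch of the relationship-bucketing idea: the neighbors of one very high-degree node are spread across a fixed number of virtual sub-nodes by a stable hash, so that no single adjacency list has to be scanned in full. The `bucket_super_node` function, the `#bucket` naming scheme, and the sample data are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict

def bucket_super_node(node_id, neighbors, num_buckets):
    """Distribute a super-node's neighbors over virtual sub-nodes
    using a deterministic CRC32 hash of each neighbor ID."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    buckets = defaultdict(list)
    for neighbor in neighbors:
        index = zlib.crc32(neighbor.encode("utf-8")) % num_buckets
        buckets[f"{node_id}#bucket{index}"].append(neighbor)
    return dict(buckets)

# Hypothetical super-node: a payment platform account with 10,000 counterparties.
neighbors = [f"account_{i}" for i in range(10_000)]
buckets = bucket_super_node("platform_account", neighbors, num_buckets=16)

largest = max(len(members) for members in buckets.values())
print(f"{len(buckets)} buckets, largest holds {largest} of {len(neighbors)} neighbors")
```

Because the hash is deterministic, a lookup for a specific counterparty only needs to visit one bucket, while a full expansion can process the buckets in parallel or in batches.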

This approach of "from practice to standard" has enabled the two national standards to get rid of the dilemma of "armchair theorizing" and truly become the "yardstick" for industrial implementation.

It should be noted that the parties from industry, academia, and research had no fundamental disagreement on the general direction during the standard-setting process. The main problem in the field of graph intelligence is the communication cost caused by inconsistent terms and definitions, rather than opposing technical routes.

The core work of standard-setting is essentially to re-express scattered practical experience in a unified language. This is the real value of standards as industrial infrastructure.

III. Future Direction: From Standards to Engineering, from Graphs to Agents

The release of the two national standards will reshape the development ecosystem of the graph intelligence industry, forming a chain reaction at three levels: technology standardization, application scale-up, and the formation of closed business loops.

At the technical level, the industry will shift from "capability-promotion-driven" to "standard-verification-driven", entering a new stage of standard-based quantitative verification. Product performance and model capabilities must be evaluated under unified definitions, forcing manufacturers to focus on underlying technologies and engineering capabilities.

At the application level, enterprises can build a standard process of "data import, performance stress testing, model deployment, acceptance and optimization", significantly reducing the cost of selection trial-and-error and system migration.

At the industrial level, the standards will break down the interface barriers between graph databases, knowledge graph platforms, graph neural network frameworks, and agent applications, pushing the industry to evolve from single-product competition to an ecosystem in which graph data, graph computing, graph models, and agents evolve together.

As the national standard system is gradually improved and Harness engineering becomes a new paradigm in the AI industry, industry competition is shifting from comparing model parameters to comparing controllable, governable, and implementable system engineering capabilities.

Based on this trend, Haizhi's future technical layout will focus on four directions:

First, upgrade the distributed graph computing foundation. Continuously improve core capabilities such as large-scale graph storage, multi-hop query, and vector retrieval to adapt to ultra-large-scale complex relationship network scenarios such as urban governance and financial whole-network relationship analysis. The engineering difficulty lies in keeping P99 latency stable and controllable under data skew (power-law distribution).

The views in this article are solely those of the author; the 36Kr platform only provides information storage services.
