Leading a new track in AI music generation: the fully self-developed, full-link music model "Yinchao Music" passes regulatory filing
The domestic AI music field has seen a breakthrough. Recently, "Yinchao Music", a large music model fully self-developed from scratch by Ziyouliangji, passed the generative artificial intelligence service filing of the Cyberspace Administration of China (filing number: Shanghai-YinChaoYinYue-202507160059), indicating that the model meets national requirements for compliance, security, and reliability. The launch of Yinchao Music ends the previous absence of consumer-oriented, commercial-grade music models in China, and lays a solid foundation for Ziyouliangji to cultivate an ecosystem in the vertical music field.
Full-link self-development: overcoming the challenges of "ultra-long context" and "non-linear structure"
Unlike the common industry practice of fine-tuning open-source models, the Yinchao Music model was independently developed from the most basic architecture up. Music generation differs from general text or image generation and faces two core challenges. The first is ultra-long context: a song of a few minutes contains hundreds of thousands of data points. The second is non-linear structure: melody, harmony, rhythm, timbre, and other elements are intricately intertwined and interact with one another, so simple linear prediction models struggle to produce truly coherent, musical output.
To address this, Ziyouliangji's R&D team abandoned conventional approaches and pioneered an AR + NAR hybrid architecture. This design gives the model both strong long-range structural coherence and fine-grained local detail generation, allowing it to capture the global dynamics of a piece while performing high-fidelity reconstruction. Justin, head of algorithms, admitted this was not easy, and the team hit many obstacles during development. In early model tuning the team tried some aggressive ideas, hoping to reach the target in one step, but found that training a music model involves a large number of factors to balance, like a seesaw. "So we went back to doing ablation experiments honestly, adding variables one by one, and in the end removed the drawbacks as much as possible while keeping the advantages," he said.
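The division of labor in an AR + NAR hybrid can be sketched in a few lines. This is purely an illustrative toy under my own assumptions, not the Yinchao implementation (which is not public): a stand-in autoregressive (AR) stage builds a coarse token sequence step by step, which is where long-range structure comes from, and a stand-in non-autoregressive (NAR) stage then refines every position in parallel, which is where local detail comes from.

```python
# Hypothetical sketch of an AR + NAR hybrid generation pipeline.
# The "models" below are trivial stand-ins for real neural stages.

def ar_coarse_stage(prompt, length):
    """Toy AR stage: each coarse token depends on the previous one,
    generated left to right (good at long-range structure)."""
    tokens = list(prompt)
    while len(tokens) < length:
        tokens.append((tokens[-1] + 1) % 8)   # stand-in for next-token sampling
    return tokens

def nar_refine_stage(coarse):
    """Toy NAR stage: refine all positions in parallel, with no
    left-to-right loop (good at fast, fine local detail)."""
    return [(t, t * 10 + 5) for t in coarse]  # stand-in for detail tokens

def generate(prompt, length=6):
    """Coarse structure first, parallel refinement second."""
    return nar_refine_stage(ar_coarse_stage(prompt, length))

if __name__ == "__main__":
    print(generate([3], length=5))
```

The point of the split is that the expensive sequential loop only runs over the short coarse sequence, while the bulk of the detail is produced in one parallel pass.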
From fusion to reconstruction: achieving the leap from "imitation" to "creation"
The power of the Yinchao Music model stems from its core multi-modal representation technology. The model can accept and understand input in multiple modalities, including audio, text, images, and even video, representing and aligning them in a unified high-dimensional space, which greatly broadens both the imaginative range and the triggering methods of music creation.
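The general pattern behind alignment in a shared space can be illustrated with a toy example. Everything here is hypothetical (Yinchao's encoders are not public): I assume each modality has its own encoder producing vectors on the same unit sphere, so that similarity between a text prompt and an audio clip is just a dot product, the pattern popularized by CLIP-style contrastive alignment.

```python
import math

def embed(features):
    """Toy encoder output: unit-normalise a feature vector so that
    all modalities live on the same unit sphere."""
    norm = math.sqrt(sum(x * x for x in features)) or 1.0
    return [x / norm for x in features]

def cosine(a, b):
    """Similarity in the shared space (dot product of unit vectors)."""
    return sum(x * y for x, y in zip(a, b))

# Pretend these feature vectors came from separate text / audio / image
# encoders trained so that matching pairs land close together.
text_vec  = embed([0.9, 0.1, 0.0])   # hypothetical: "upbeat jazz piano"
audio_vec = embed([0.8, 0.2, 0.1])   # hypothetical: a jazz piano clip
image_vec = embed([0.0, 0.1, 0.9])   # hypothetical: an unrelated photo

# A matching text/audio pair scores higher than a mismatched pair:
assert cosine(text_vec, audio_vec) > cosine(text_vec, image_vec)
```

Once all modalities land in one space, any of them can serve as the "trigger" for generation, which is what makes prompts like a picture or a video clip possible at all.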
More importantly, the R&D team worked closely with professional musicians and composers, deeply integrating the training of the underlying language model with professional music production logic and music theory. As a result, the model's "creation" is no longer simple data imitation or style replication: it internalizes the essential rules of music and can generate in a genuinely creative sense, ensuring that its output is both musically correct and artistic.
In the reconstruction stage, the model innovatively performs independent, in-depth modeling of the structural differences between music signals and other types of information, and establishes a composite, multi-dimensional evaluation system. This breakthrough overcomes common weaknesses of traditional solutions, such as blurred detail and harsh texture, giving the generated works layered arrangements and delicate mixing that reach industrial-grade production standards, and completing the transformation from "understanding user intent" to "high-quality musical expression".
In addition, to pursue full immersion in the generated music, the team independently developed a diffusion transformer (DiT) that directly models stereo signals jointly. With its attention mechanism, the model can accurately capture and synchronize the subtle phase differences, level differences, and time delays between the left and right channels. What it generates is therefore not a flat extension of a mono channel, but stereo audio with a real, natural sense of space that creates a credible sound field with width, depth, and localization, bringing listeners a truly immersive experience.
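The inter-channel relationships such a joint model has to preserve can be shown with a toy panning example. This is a hypothetical illustration, not the DiT itself: a mono source is turned into stereo by delaying and attenuating one channel, the classic time-difference and level-difference cues. Generating the two channels independently would destroy exactly these relationships, which is why joint modeling matters.

```python
import math

SR = 8000  # assumed toy sample rate in Hz, for illustration only

def mono_tone(freq=440.0, n=64):
    """A short sine tone as a stand-in mono source."""
    return [math.sin(2 * math.pi * freq * i / SR) for i in range(n)]

def to_stereo(mono, delay_samples=2, right_gain=0.8):
    """Pan a mono source by delaying and attenuating the right channel.
    The delay encodes an inter-channel time difference, the gain an
    inter-channel level difference."""
    left = list(mono)
    right = [0.0] * delay_samples + [right_gain * s
                                     for s in mono[:len(mono) - delay_samples]]
    return left, right

left, right = to_stereo(mono_tone())
# The right channel is a delayed, quieter copy of the left:
assert right[3] == 0.8 * left[1]
```

A stereo-aware generator must keep these cross-channel offsets consistent over the whole signal; any drift between channels is immediately audible as a collapsing or wandering sound image.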
Highly praised by professional musicians, with overall performance leading the domestic field
In preliminary small-scale blind tests, works generated by the Yinchao Music model won high recognition from many senior musicians and producers, who generally agreed that in mixing, melodic texture, and sectional structure the works reach a professional arrangement level, with overall results leading the domestic field. Chen Shizhe, associate professor in the Department of Music Engineering at the Shanghai Conservatory of Music and director of its Music Technology and Art Teaching and Research Section, said after using "Yinchao" several times that the maturity of the arrangements and the naturalness of the vocals and timbres in the generated songs exceed the production level of most professional musicians.
The successful filing of the Yinchao Music model with the Cyberspace Administration of China means the model has received national-level recognition for compliance and security, a "passport" for large-scale commercial use. Going forward, the "Yinchao" integrated AI music generation and consumption platform built on this model will give users an unprecedented experience, reshape the underlying logic of the original-music industry, and open a new blue-ocean market. "We firmly believe that only by truly mastering the full-link core technologies, from underlying algorithms to application innovation, can we bring substantive change to the music industry in this AI wave. Going forward, we will keep exploring the infinite possibilities of combining AI and music, so that everyone can enjoy the creative freedom that technological empowerment brings," said Jiang Tao, CTO and Executive CEO of Ziyouliangji.