
Who is persistently pursuing in-memory computing?

半导体产业纵横 2026-04-07 19:56
In-depth Observation of In-Memory Computing in China

In 2026, a long-awaited technological singularity is approaching.

In a rare move, CCTV's News Broadcast turned its cameras on a cutting-edge chip technology. Feng Dan, a deputy to the National People's Congress and vice president of Huazhong University of Science and Technology, made an appeal during the Two Sessions: support Hubei in building a world-class integrated storage and computing industrial base to help the country gain strategic initiative in the new era of "Artificial Intelligence +".

Technical breakthroughs are arriving at the same time. At ISSCC 2026, a joint team from Tsinghua University, Huawei, and ByteDance published a paper on in-memory computing chips that attracted industry attention. The paper was the first to propose a hybrid compute-in-memory (CiM) chip based on a 28nm process. Through innovative architecture design, the chip improves the throughput and energy efficiency of the recommendation system's core operations by one to two orders of magnitude (QPS increases 66-fold and QPS/W increases 181-fold).
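The two headline figures also imply how the chip's power draw compares with the baseline. As a rough check, assuming both ratios are measured against the same baseline system (the article does not say so explicitly), the relative power follows from dividing the throughput gain by the efficiency gain. The function name below is illustrative.

```python
def implied_power_ratio(throughput_gain: float, efficiency_gain: float) -> float:
    """Return new_power / baseline_power.

    Efficiency is throughput per watt, so
    efficiency_gain = throughput_gain / power_ratio.
    """
    return throughput_gain / efficiency_gain


# Figures quoted in the article: QPS up 66-fold, QPS/W up 181-fold.
ratio = implied_power_ratio(66, 181)
print(f"power relative to baseline: {ratio:.2f}x")
```

Under that assumption the chip would draw a little over a third of the baseline power while delivering 66 times the throughput.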

01 Integrated Storage and Computing: The Way to Break the Deadlock in the Post-Moore Era

To understand why integrated storage and computing is important, we first need to understand a basic contradiction: data transfer is "eating up" computing efficiency. Since John von Neumann proposed the stored-program computer architecture in 1945, the global computing industry has developed under this framework for more than eighty years. The core feature of this architecture is the separation of the computing unit from the storage unit, so data is constantly transferred between the processor and the memory. This is like a factory whose raw material warehouse is far from the production line. Every time a part is produced, someone has to move the raw materials from the warehouse to the production line and then move the finished product back to the warehouse. When only a few parts are produced, the drawbacks of this model are not obvious; but when production scales up rapidly, the energy and time consumed by the transfer become a bottleneck.

In the chip world, these bottlenecks have vivid names: the "memory wall" and the "power consumption wall". Jensen Huang, the CEO of NVIDIA, once admitted: "GPUs spend 70% of their time waiting for data."
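A simple roofline-style estimate shows how such waiting arises. The sketch below assumes that computation and data transfer overlap perfectly, so runtime is set by whichever takes longer; the workload and the accelerator figures are hypothetical and are not meant to reproduce the 70% quoted above.

```python
def stall_fraction(ops: float, bytes_moved: float,
                   peak_ops_per_s: float, bandwidth_bytes_per_s: float) -> float:
    """Fraction of runtime the compute units sit idle waiting for data.

    Assumes compute and data transfer overlap perfectly, so runtime is
    the longer of the two.
    """
    compute_time = ops / peak_ops_per_s
    transfer_time = bytes_moved / bandwidth_bytes_per_s
    runtime = max(compute_time, transfer_time)
    return max(0.0, transfer_time - compute_time) / runtime


# Hypothetical workload: one matrix-vector product with an N x N FP16 weight matrix.
n = 8192
ops = 2 * n * n            # one multiply and one add per weight
bytes_moved = 2 * n * n    # each 2-byte weight is fetched from memory once

# Hypothetical accelerator: 100 TOPS peak compute, 1 TB/s memory bandwidth.
idle = stall_fraction(ops, bytes_moved, 100e12, 1e12)
print(f"compute units idle for {idle:.0%} of the runtime")
```

With one operation per byte fetched, the hypothetical chip can compute a hundred times faster than memory can feed it, which is why workloads dominated by weight fetches leave most of the arithmetic hardware idle.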

To make matters worse, as semiconductor processes approach the physical limit, the performance improvement dividend brought by Moore's Law is gradually fading. The cost-effectiveness of traditional chip process scaling is decreasing, further exacerbating the dilemma of computing power supply. The rapid development of large-model technology further amplifies this contradiction. The parameter scale of large language models represented by GPT has increased from billions to hundreds of billions, and the demand for storage capacity and bandwidth has increased exponentially.

It is against this background that integrated storage and computing technology has come into the spotlight.

The core logic of integrated storage and computing is straightforward: embed the computing unit directly into the storage array so that data can be computed where it is stored. The concept may seem simple, but it is a paradigm-level innovation in chip architecture.

Put simply, if a traditional chip is like a company where employees (data) have to commute between the computing unit and the storage unit located in different places every day, then an integrated storage and computing chip is like a company that builds its office right in the warehouse. The raw materials are at hand and can be used at any time, so the efficiency is naturally very different.

Currently, there are three major schools of integrated storage and computing technology:

First, Near-Memory Computing (NMC). The computing unit is located in the logic layer of the storage chip or is closely integrated with the memory through advanced packaging technology. This is similar to building the warehouse and the factory in the same industrial park. Although they are still in two places, the distance is greatly shortened. The logic layer integration or 3D stacking technology in High-Bandwidth Memory (HBM) belongs to this category.

Second, Processing-in-Memory (PIM). Computing functions are added to the peripheral circuits of the storage chip, enabling some computing tasks to be completed directly inside the memory. This is equivalent to adding a preliminary processing workshop in the warehouse, so that the raw materials do not have to be transported out of the factory area for partial processing.

Third, Computing-in-Memory (CIM). This is the most integrated solution, which directly uses the physical characteristics of the storage medium (such as resistance, charge, or magnetism) to perform computing operations inside the storage array. Integrated storage and computing based on SRAM, RRAM (Resistive Random-Access Memory), or MRAM (Magnetic Random-Access Memory) can achieve highly parallel and ultra-low-power computing. This is equivalent to moving the entire production line into the warehouse. The chip in the ISSCC 2026 paper mentioned at the opening belongs to this category.
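For the resistive case, computing inside the array is commonly described as an analog matrix-vector product: each cell's conductance stores a weight, input voltages drive the rows, and the current collected on each column is the weighted sum (Ohm's law per cell, Kirchhoff's current law per column). The sketch below simulates that idealized behavior in plain Python. It ignores wire resistance, device noise, and data converters, and the function name and values are illustrative rather than taken from any chip in this article.

```python
def crossbar_mvm(conductances, voltages):
    """Idealized resistive crossbar.

    conductances[i][j]: conductance in siemens of the cell at row i, column j.
    voltages[i]: voltage in volts applied to row i.
    Returns the current in amperes on each column: I_j = sum_i G[i][j] * V[i].
    """
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for row, v in zip(conductances, voltages):
        for j, g in enumerate(row):
            # Ohm's law gives the cell current; the shared column wire sums them.
            currents[j] += g * v
    return currents


# Hypothetical 2 x 2 array: conductances encode the weights, voltages the inputs.
G = [[1e-6, 2e-6],
     [3e-6, 4e-6]]
V = [0.1, 0.2]
print(crossbar_mvm(G, V))
```

Every cell contributes its product at the same time, which is where the high parallelism comes from: the stored weights never leave the array.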

Each of the three paths has its own advantages and disadvantages. Near-memory computing is the easiest to implement, but the improvement is relatively limited. In-memory computing has the greatest potential, but its technical challenges are also the most severe.

02 A Hundred Schools of Thought Contend: Technical Schools and Core Players in China's Integrated Storage and Computing Field

It is predicted that the global market size of integrated storage and computing chips will exceed $12 billion in 2025, with China accounting for 30%. Chinese integrated storage and computing companies show rich diversity in their technical routes. This diversity comes from both the exploration of different technical paths and the focus on different application scenarios.

In terms of computing paradigms, they are mainly divided into digital integrated storage and computing and analog integrated storage and computing. Digital in-memory computing has high precision and good compatibility with CMOS processes, and it is currently the mainstream direction of industrialization. Analog in-memory computing has higher energy efficiency but limited precision. The digital-analog hybrid solution attempts to find a balance between precision and energy efficiency.
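One commonly cited reason for the limited precision of analog schemes is that the analog result must be digitized by an analog-to-digital converter (ADC) of finite resolution. The sketch below models only that effect with an ideal ADC; real designs also face device variation and noise, which are not modeled here. All names and values are hypothetical.

```python
def adc_quantize(value: float, full_scale: float, bits: int) -> float:
    """Digitize an analog value with an ideal ADC of the given resolution."""
    levels = 2 ** bits - 1
    clipped = min(max(value, 0.0), full_scale)   # ADC input range is [0, full_scale]
    code = round(clipped / full_scale * levels)  # nearest output code
    return code * full_scale / levels


# Hypothetical weights and inputs, scaled so the sum fits within full scale.
weights = [0.2, 0.5, 0.1]
inputs = [1.0, 0.5, 1.0]
exact = sum(w * x for w, x in zip(weights, inputs))  # reference result

for bits in (4, 6, 8):
    analog = adc_quantize(exact, full_scale=1.0, bits=bits)
    print(f"{bits}-bit ADC: result={analog:.4f}, error={abs(analog - exact):.4f}")
```

Each extra bit of resolution halves the worst-case error but makes the converter more expensive, which is the trade-off between precision and energy efficiency that hybrid designs try to balance.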

In terms of storage media, the mainstream technical routes include four major directions: SRAM, DRAM, Flash, and new memristors (ReRAM, MRAM, PCM, etc.). Each medium corresponds to different technical characteristics and applicable scenarios.

The SRAM integrated storage and computing solution is based on the CMOS process and can use advanced process nodes. It has a fast read-write speed, but the storage density is relatively low and the static leakage current is relatively high. The DRAM solution has a higher storage density than SRAM and is suitable for scenarios involving large-capacity models, but its compatibility with the CMOS process is poor. The Flash solution has the advantages of non-volatility and low power consumption, but the read-write speed is relatively slow.

The new memristor solution is the most closely watched direction of exploration in recent years. New storage media such as ReRAM (Resistive Random-Access Memory), MRAM (Magnetic Random-Access Memory), and PCM (Phase-Change Memory) have good process scalability and ultra-low-power characteristics and are considered the "future land" of integrated storage and computing technology. However, the process maturity and yield of these new media are still the main bottlenecks restricting industrialization.

It is worth mentioning that advanced packaging technology is key to achieving high performance in integrated storage and computing. 2.5D packaging integrates storage and computing units by placing them side by side and interconnecting them horizontally, while 3D packaging goes further with vertical stacking and extreme integration. Currently, the most advanced packaging in the industry is the 3.5D packaging proposed by TSMC.

By application scenario, Chinese integrated storage and computing companies can be roughly divided into two main camps: the "high-computing-power" camp, represented by data centers, intelligent driving, and large models on end devices and at the edge, and the "edge-side AI" camp, represented by smart wearables, smart homes, and the Internet of Things. A third, less visible thread is the exploration of underlying technologies, represented by Xinyuan Semiconductor, the "explorer of new storage media".

High-Computing-Power and Large-Model Direction

These companies mainly target scenarios that require strong computing power support, such as data centers, high-performance computing, and intelligent driving. They are committed to solving the "memory wall" and "power consumption wall" problems in the training and inference of large models.

Horizon Semiconductor is a representative company in the field of high-computing-power integrated storage and computing chips. Its technical route is based on SRAM integrated storage and computing, and it has developed in-house the second-generation IPU architecture, Tianxuan. The Tianxuan architecture uses bit-serial computing to integrate the computing unit and the storage unit for near-data processing. Its core technological innovations include Elastic Acceleration technology, which can achieve a maximum acceleration effect of 160%. In addition, Horizon Semiconductor is the first company in the industry to mass-produce integrated storage and computing chips capable of floating-point operations. Open-source or FP16 floating-point models can run directly without parameter quantization and tuning, which significantly reduces the migration cost for developers. In terms of product progress, Horizon Semiconductor released the first domestic high-computing-power integrated storage and computing intelligent driving chip, Hongtu H30, with a computing power of 256 TOPS and a power consumption of 35W. In July 2025, the company released its second-generation chip, Manjie M50, which officially entered mass production in the fourth quarter of 2025.
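Bit-serial computing, mentioned above, can be illustrated generically: instead of multiplying full-width numbers, the array processes one bit of every input per cycle, sums the weights selected by that bit, and shifts the partial sum into place. The sketch below shows this for unsigned integer activations. It is a textbook-style illustration under that assumption, not a description of the Tianxuan architecture, and the function name is hypothetical.

```python
def bit_serial_dot(weights, activations, act_bits):
    """Dot product computed one activation bit per cycle.

    weights: integers stored in the array (may be negative).
    activations: unsigned integers, each smaller than 2 ** act_bits.
    """
    acc = 0
    for b in range(act_bits):                 # one cycle per bit position
        # Bit b of each activation gates its weight; an adder tree sums them.
        partial = sum(w for w, a in zip(weights, activations) if (a >> b) & 1)
        acc += partial << b                   # weight the partial sum by 2 ** b
    return acc


weights = [3, -2, 5]
activations = [6, 1, 7]                       # each fits in 3 bits
assert bit_serial_dot(weights, activations, 3) == sum(
    w * a for w, a in zip(weights, activations))
print(bit_serial_dot(weights, activations, 3))
```

Because each cycle needs only AND-style gating and addition, the logic placed next to the memory cells can stay small, at the cost of taking as many cycles as there are activation bits.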

Yizhu Technology is an AI high-computing-power chip company based on the integrated storage and computing architecture, targeting scenarios such as data centers, cloud computing, and central servers. It follows the ReRAM media route. According to its official website, it independently designs and mass-produces high-computing-power chips with a fully digital integrated storage and computing architecture based on new storage media. In addition, Yizhu Technology actively embraces the RISC-V ecosystem. In the field of AI high-computing-power chips, it is one of the first to introduce RISC-V cores for task scheduling, vector operations, and other operations in large-model services.

Edge-Side, Edge AI, and Low-Power Direction

These companies mainly target scenarios with strict requirements for power consumption, volume, and cost, such as smart wearables, smart homes, and Internet of Things devices. They use integrated storage and computing technology to achieve efficient edge AI computing.

MicroNanoCore is a company worthy of attention. It aims to provide high-performance, low-power, and extremely cost-effective chip solutions for large-model inference applications such as AI mobile phones, AI PCs, IoT, all-in-one machines, servers, and robots. It was incubated at the Zhejiang-Peking University Institute of Advanced Technology and follows the CIM technology route. On the basis of CIM, it integrates "3D near-memory computing" and a "RISC-V and integrated storage and computing heterogeneous architecture", pioneering the three-dimensional integrated storage and computing (3D-CIM) architecture. Results from multiple tape-outs and tests show that, compared with the traditional von Neumann architecture, MicroNanoCore's CIM technology has achieved a more than four-fold increase in computing power density (with a matching improvement in cost) and a more than ten-fold reduction in power consumption. In March this year, GigaDevice Semiconductor invested in MicroNanoCore.

Actions Semi is a representative listed company investing in integrated storage and computing technology. The company has built a three-core architecture of CPU + DSP + NPU and innovatively uses SRAM in-memory computing technology, together with the ANDT toolchain to accelerate algorithm deployment. In terms of technological evolution, Actions Semi is advancing the R&D of its second-generation in-memory computing IP, aiming to multiply the computing power of the single-core NPU, optimize the energy efficiency ratio, and fully support the Transformer architecture.

Zhicun Technology is a representative company in NOR Flash integrated storage and computing technology. Its core products include the WTM2101 and the WTM-8 series. The WTM2101 is the world's first integrated storage and computing voice chip based on NOR Flash, which was officially mass-produced in January 2022. This chip focuses on low-power voice interaction scenarios at the edge side, with a power consumption of only 5mW. Compared with NPU, DSP, and MCU computing platforms, it can increase the computing power by 10 to 200 times at the same power consumption level. The WTM-8 series is Zhicun Technology's new-generation computational vision chip, suitable for low-power and high-computing-power scenarios. It supports the Linux operating system and can implement functions such as AI super-resolution, frame interpolation, HDR, detection, and recognition. This series of chips can provide at least 24 TOPS of computing power, while the power consumption is only 5% of similar market solutions.

New Storage Media Direction

Xinyuan Semiconductor is a leader in the industrialization of memristor (ReRAM) integrated storage and computing technology in China. It focuses on the R&D and industrialization of new ReRAM storage technology. Its core product is a 28nm-process ReRAM storage chip, which has been mass-produced. The company's ATOM product series uses ReRAM's compatibility with advanced processes to integrate the storage and computing units. Xinyuan Semiconductor is the only domestic company to achieve mass production of ReRAM.

ReRAM is a new non-volatile storage technology with advantages such as high storage density, compatibility with CMOS processes, and high cost-effectiveness. Compared with DRAM, ReRAM can offer significantly higher storage density. Compared with Flash, ReRAM has better read-write performance. Xinyuan Semiconductor's technical route represents an important direction of combining integrated storage and computing with new storage media. ByteDance once invested in Xinyuan Semiconductor, indicating the potential of its ReRAM technology in terminal devices such as VR/AR.

03 How is the Commercialization Going?

Leading in technology is one thing; turning that technology into products is another.

The annual report of Actions Semi shows that it was the first in the industry to commercially apply in-memory computing technology and has officially launched AI audio chips for edge-side scenarios. Among them, the ATS323X chip has been quickly adopted in the flagship wireless microphones of brand customers, which are already on the market. It has also been mass-produced and launched in the wireless gaming headsets of leading domestic brands. The ATS362X chip has also successfully entered the supply chains of several leading professional audio brands.

The WTM2101 chip of Zhicun Technology has achieved a shipment volume of over 10 million units and is used in the smart wearables of brands such as Huawei and Xiaomi. This is currently the most successful case of commercializing integrated storage and computing chips in China, proving the commercial value of integrated storage and computing technology in low-power scenarios at the edge side. As of now, the WTM-2 series of Zhicun has been delivered to more than 30 customers. In particular, there was a large-order shipment from a leading smart-wearable terminal customer in the first half of this year.

The Hongtu H30 chip of Horizon Semiconductor was released in 2024. It is the first domestic intelligent driving chip with integrated storage and computing, which has passed the AEC-Q100 automotive-grade certification and was mass-produced in 202