
3D NAND: How will it evolve?

半导体行业观察 (Semiconductor Industry Observation), 2025-11-10 09:36
NAND flash revolutionizes data storage, and 3D NAND drives the development of AI. Future challenges include thousand-layer stacking and charge trap technology.

Since its introduction to the memory market in the late 1980s, NAND flash memory has fundamentally transformed the way large amounts of data are stored and retrieved.

This non-volatile memory, specifically designed for high-density data storage, is found in almost every corner of the electronics market, from smartphones to data centers. It is used in most removable and portable storage devices, such as SD cards and USB flash drives. In recent years, 3D NAND has also played a crucial role in the rapid development of artificial intelligence, providing an efficient storage solution for the vast amounts of data required to train AI models.

With the explosive growth of data storage requirements, chip companies are racing to increase the storage cell density of NAND flash memory (measured in gigabits per square millimeter, Gb/mm²) while reducing the cost per bit. More than a decade ago, the semiconductor industry transitioned from two-dimensional NAND to three-dimensional NAND to overcome the limitations of traditional memory size reduction. In recent years, companies have been increasing the storage density by adding more layers of storage cells per chip and increasing the number of bits stored per cell (up to four bits in commercial NAND flash memory).

One of the most significant advancements has been the shift from floating-gate transistors to charge-trap cells. Floating-gate technology stores charge in a conductor, whereas charge-trap cells store charge in an insulator. This reduces the electrostatic coupling between storage cells, thereby improving read and write performance. Because charge-trap cells can also be made smaller than floating-gate transistors, the shift paves the way for higher storage densities.

As 3D NAND technology continues to push physical limits, the semiconductor industry is turning to a variety of new technologies to pack storage cells more closely together, not just horizontally but also vertically. Two innovative technologies developed by imec are intended to enable this vertical scaling without sacrificing memory performance and reliability: air-gap integration and charge-trap layer separation.

Inside the Charge-Trap Cell: The Building Block of 3D NAND

The semiconductor industry plans to apply gate-all-around (GAA) or nanosheet transistors to logic chips in the coming years. However, the GAA architecture is already widely used in the field of 3D NAND flash memory and is the workhorse for high-density data storage. In this 3D architecture, storage cells are stacked in vertical chains and addressed via horizontal word lines.

In most cases, charge-trap cells serve as the storage devices in 3D NAND. This storage cell is similar to a MOSFET, but it has a thin layer of silicon nitride (SiN) embedded within the gate oxide of the transistor. This turns the gate dielectric into a three-layer stack known as an oxide-nitride-oxide (ONO) stack, whose layers serve as the blocking oxide, the trapping nitride, and the tunneling oxide, respectively (Figure 1).

1. This figure shows a 3D NAND GAA architecture with a string of vertical charge-trap cells having an oxide-nitride-oxide (ONO) gate dielectric and a limited number of word lines (WL).

When a positive bias voltage is applied to the gate, electrons in the channel region tunnel through the silicon oxide layer and are trapped in the silicon nitride layer. This raises the threshold voltage of the transistor. The state of the storage cell can be measured by applying a voltage between the source and the drain. If current flows, it indicates that no electrons are trapped, and the storage cell is in the “1” state. If no current is measured, the storage cell is in the so-called “electron-trapped” state, corresponding to “0.”
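The read behavior described above can be sketched as a toy model. This is only an illustration under stated assumptions: the class name `ChargeTrapCell`, the voltage values, and the idea of comparing a fixed read voltage against the threshold voltage are hypothetical simplifications, not figures from the article.

```python
from dataclasses import dataclass


@dataclass
class ChargeTrapCell:
    """Toy single-bit charge-trap cell. All voltages are assumed values."""

    vth_erased: float = 0.0   # threshold voltage with no trapped electrons (V)
    vth_shift: float = 2.0    # assumed threshold increase once electrons are trapped (V)
    programmed: bool = False  # True when electrons sit in the SiN layer

    def program(self) -> None:
        """Positive gate bias: electrons tunnel from the channel into the nitride."""
        self.programmed = True

    def erase(self) -> None:
        """Remove the trapped electrons again."""
        self.programmed = False

    @property
    def vth(self) -> float:
        """Trapped electrons raise the threshold voltage."""
        return self.vth_erased + (self.vth_shift if self.programmed else 0.0)

    def read(self, v_read: float = 1.0) -> int:
        """Current flows only if the read voltage exceeds the threshold.

        Current flowing -> no trapped electrons -> "1"; no current -> "0".
        """
        conducts = v_read > self.vth
        return 1 if conducts else 0


cell = ChargeTrapCell()
print(cell.read())  # erased cell conducts
cell.program()
print(cell.read())  # programmed cell blocks current
```

A multi-bit cell would use several threshold levels instead of two, but the principle of sensing the threshold shift stays the same.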

The charge-trap cell is implemented in the 3D NAND structure using the GAA vertical channel approach. Imagine rotating a planar transistor by 90 degrees so that the vertical conductive channel is surrounded by the gate stack structure.

The manufacturing process of the GAA channel begins with the alternating stacking of conductors (silicon, used as word lines) and insulating layers (silicon oxide, used to separate word lines). Next, advanced dry etching tools are used to drill downward to form cylindrical holes. Finally, silicon oxide and silicon nitride layers are alternately deposited on the sidewalls of the holes, with the polysilicon transistor channel located at the center of all the layers. This structure is commonly referred to as the “macaroni channel.”

Next-Generation 3D NAND: Cell Stacking and Cell Scaling

In the coming years, the memory industry will push the GAA-based 3D NAND flash memory roadmap to its ultimate limit.

Today, leading manufacturers are introducing 3D NAND flash memory chips with more than 300 oxide/word-line layers (Figure 2). This number is expected to reach 1000 layers by 2030, equivalent to a storage density of approximately 100 Gbit/mm². The challenge lies in maintaining a relatively consistent word-line diameter within a 30-micron-thick stack. Keeping all components uniform in such a confined space steadily increases the complexity and cost of the process, placing higher demands on high-stack deposition and high-aspect-ratio etching.

2. This 3D NAND flash memory diagram highlights the z-spacing between adjacent word lines.

To accommodate more stacked layers, semiconductor companies are investing in the development of various supporting tools to increase the storage density of 3D NAND. These “scaling accelerators” include increasing the number of bits per cell and reducing the xy-spacing of GAA cells (lateral scaling). In addition to improving bit density and cell density, companies are also taking measures to enhance the area efficiency of the storage array.

Another way to increase storage capacity is through tier stacking, which involves stacking flash devices on top of each other to increase the total number of layers. In 3D NAND flash memory, storage cells are connected in series to form a chain, which is achieved by alternately stacking insulating layers and conductor layers and drilling holes in them. The cell stacking process can be repeated two to three times, and possibly four times in the future, to create longer chains on each chip. Each cell stack is sometimes referred to as a “tier.”

By stacking a large number of storage cells and layering these stacks to create taller 3D NAND chips, companies can increase the total number of layers without having to manufacture all the layers at once. For example, a company can assemble 250 layers of storage cells and then stack four of these stacks to form a 3D NAND chip with 1000 layers. The main challenge lies in etching sufficiently deep holes in these multi-layer storage chips and filling these holes uniformly.
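The arithmetic behind this example can be sketched in a few lines. The 250-layer, four-stack case is the article's; the 30 nm layer pitch follows from the 1000 layers in a 30-micron stack mentioned earlier, while the 100 nm hole diameter and the function name `stacked_build` are purely illustrative assumptions.

```python
def stacked_build(layers_per_stack: int, stacks: int,
                  pitch_nm: float, hole_diameter_nm: float) -> dict:
    """Compare etching each cell stack separately with one single deep etch.

    pitch_nm is the combined thickness of one word line plus one oxide layer.
    hole_diameter_nm is an assumed value used only to estimate aspect ratios.
    """
    total_layers = layers_per_stack * stacks
    stack_height_nm = layers_per_stack * pitch_nm
    full_height_nm = total_layers * pitch_nm
    return {
        "total_layers": total_layers,
        "etch_depth_per_stack_um": stack_height_nm / 1000,
        "full_height_um": full_height_nm / 1000,
        "aspect_ratio_per_stack": stack_height_nm / hole_diameter_nm,
        "aspect_ratio_single_etch": full_height_nm / hole_diameter_nm,
    }


result = stacked_build(layers_per_stack=250, stacks=4,
                       pitch_nm=30.0, hole_diameter_nm=100.0)
for key, value in result.items():
    print(f"{key}: {value}")
```

Under these assumptions, each etch step only has to reach a quarter of the final depth, which is why building the chip in several stacks eases the high-aspect-ratio etch even though the holes of successive stacks must still line up.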

Additionally, some companies are separating the underlying logic from the NAND array and reintegrating it onto the NAND array in a configuration called the CMOS-bonded array (CbA). In this configuration, the CMOS chip is manufactured on a separate silicon wafer and then connected to the NAND array using advanced packaging technologies, particularly hybrid bonding. CbA is the next stage of development from the CMOS-under-array (CuA), in which the NAND chip is directly manufactured on top of the CMOS chip in the same monolithic process.

Looking ahead, companies are considering bonding multiple storage arrays to a single CMOS wafer as an alternative to stacking multiple cell stacks within one array, or even bonding multiple array wafers to multiple CMOS wafers.

To control rising manufacturing costs, imec and semiconductor companies are also actively exploring vertical, or “z-spacing,” scaling technologies that reduce the thickness of the oxide and word-line layers. This allows more storage layers to be stacked at a controllable cost.

Pros and Cons of 3D NAND Flash Z-Spacing Scaling

Reducing the spacing between storage layers is crucial for continuously reducing the cost of next-generation 3D NAND. The spacing between adjacent word lines is currently approximately 40 nanometers, and the goal of z-spacing scaling is to further reduce the thickness of the word-line and silicon oxide layers in the stack. In this way, more storage layers fit into every micron of stack height, which increases the number of storage cells and ultimately reduces the cost per bit.
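To make the benefit concrete, the short sketch below counts how many word-line layers fit into a fixed stack height at different spacings. The 40 nm spacing and the 30-micron stack height come from the article; the 25 nm value and the function name `layers_in_height` are hypothetical and only extend the trend.

```python
def layers_in_height(stack_height_um: float, pitch_nm: int) -> int:
    """Number of complete word-line/oxide layer pairs that fit in the stack."""
    stack_height_nm = round(stack_height_um * 1000)
    return stack_height_nm // pitch_nm


# 40 nm is roughly today's spacing; 25 nm is an assumed future value.
for pitch_nm in (40, 30, 25):
    count = layers_in_height(30.0, pitch_nm)
    print(f"{pitch_nm} nm spacing -> {count} layers in a 30 um stack")
```

In this simple picture, a 1000-layer chip at 40 nm spacing would need a stack about 40 microns tall, whereas 30 nm spacing keeps it at 30 microns.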

However, without optimization, z-spacing scaling can have a negative impact on the electrical performance of storage cells. It may lead to a decrease in threshold voltage, an increase in subthreshold swing, and a decline in data retention. It also increases the voltage required to program and erase the data stored in the cells, which raises power consumption, slows the cells down (through higher RC delay), and may cause breakdown of the gate dielectric between adjacent cells.

These effects can be traced back to two physical phenomena that become more pronounced when memory cells are packed closer together: inter-cell interference and lateral charge migration.

When the thickness of the word-line layer decreases, the gate length of the charge-trap transistor also shortens accordingly. As a result, the gate's control over the channel gradually weakens, thereby promoting electrostatic coupling between different cells.

In addition to inter-cell interference, the vertical scaling of storage cells also leads to lateral charge migration (or vertical charge loss): the charge trapped inside the storage cell tends to migrate out of the vertical SiN layer, thus affecting data retention.

The charge-trap cell has two geometric directions: z and xy (since the cell has cylindrical symmetry, the x and y dimensions are the same). Charge can leak from the storage cell in both of these directions. Charge can escape from the cell through the tunneling and/or blocking oxide in the gate along the xy direction, and it can also escape along the z direction, eventually entering or getting too close to adjacent cells. This is due to lateral charge migration, which becomes more significant as the vertical size of the cells decreases and the distance between them becomes smaller.

Next, we will discuss the technological drivers that can address these drawbacks, enabling researchers to unlock z-spacing scaling for future generations of 3D NAND flash memory.

Between Word Lines: Using Air Gaps to Reduce Cell Interference

Integrating air gaps between adjacent word lines is a potential solution to the inter-cell interference problem. Air has a lower dielectric constant than the inter-gate dielectric, so the gaps reduce the electrostatic coupling between storage cells. This technology has been widely used in planar two-dimensional NAND flash memory architectures. However, integrating air gaps into tall silicon oxide/word-line stacks is more challenging.
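A rough feel for the effect comes from a parallel-plate estimate, sketched below. This is an idealized assumption rather than the device model used by imec: the relative permittivities (about 3.9 for silicon oxide, about 1.0 for air) are textbook values, and the series-layer formula, the geometry, and the function names are illustrative only.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m


def effective_eps_r(air_fraction: float,
                    eps_oxide: float = 3.9, eps_air: float = 1.0) -> float:
    """Effective relative permittivity of oxide and air layers in series.

    air_fraction is the share of the inter-gate thickness replaced by air.
    """
    if not 0.0 <= air_fraction <= 1.0:
        raise ValueError("air_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - air_fraction) / eps_oxide + air_fraction / eps_air)


def coupling_capacitance(eps_r: float, area_m2: float, gap_m: float) -> float:
    """Parallel-plate capacitance C = eps0 * eps_r * A / d."""
    return EPS0 * eps_r * area_m2 / gap_m


# Hypothetical geometry: 15 nm dielectric between plates of 50 nm x 50 nm.
area, gap = 50e-9 * 50e-9, 15e-9
c_oxide = coupling_capacitance(effective_eps_r(0.0), area, gap)
for fraction in (0.0, 0.5, 1.0):
    c = coupling_capacitance(effective_eps_r(fraction), area, gap)
    print(f"air fraction {fraction:.1f}: {c / c_oxide:.2f} x oxide-only coupling")
```

Because the ratio does not depend on the assumed area or gap, the estimate suggests that a full air gap could cut the capacitive coupling by up to a factor of about 3.9 in this idealized picture; real structures with fringing fields and partial gaps will gain less.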

To overcome these complexities, imec presented a unique integration scheme at the 2025 IEEE International Memory Workshop (IMW) that can precisely control the position of air gaps between word lines.

In 3D NAND memory, thin layers of silicon oxide are placed inside the gates of storage cells – as the “gate dielectric” to separate the word line from the transistor channel – and between the word lines of different storage cells – as the “inter-gate dielectric” to separate adjacent cells from each other (Figure 3). The gate dielectric forms the tunneling and blocking layers of the ONO stack and surrounds the charge-trap SiN layer.

3. This figure shows the 3D integration process flow for air gaps (a-d) and transmission electron microscopy (TEM) and energy-dispersive X-ray spectroscopy (EDS) images of air gaps (e-f).

Therefore, silicon oxide exists not only inside each storage cell but also between cells. Due to the manufacturing process of 3D NAND storage cells, the gate dielectric extends continuously from one cell to another and intersects with the inter-gate dielectric in the space between adjacent storage cells. imec believes this is an ideal location to place air gaps. However, with current process technologies, removing (or cutting) the charge-trap SiN layer between cells remains a significant challenge.

At imec, we have found a new way to integrate air gaps without cutting the SiN from the storage cells. This innovation introduces air gaps from inside the storage hole area by recessing the inter-gate silicon oxide before depositing the ONO stack. The air gaps are self-aligned with the word lines, enabling very precise placement. The method is also potentially scalable, whereas scalability is a major issue with other proposed solutions.

Results show that devices with air gaps are less sensitive to interference from adjacent cells than those without air gaps. This conclusion is drawn from the smaller threshold voltage shift of devices with air gaps when a so-called “pass voltage” is applied to the unselected gates (Figure 4). The result was obtained on a test device with a limited number of word-line layers, a word-line pitch of 30 nm (gate length of 15 nm, inter-gate silicon oxide thickness of 15 nm), and a storage hole diameter of 80 nm.

4. Threshold voltage changes of charge-trap devices with (left) and without (right) air gaps at different pass voltages.

imec researchers also studied the impact of air gaps on memory performance and reliability. Results show that air gaps do not affect memory operation, and that endurance reaches 1000 program/erase cycles, comparable to devices without air gaps.

Based on these results, air-gap integration on the hole side is considered a key step towards future z-axis spacing scaling.

Charge-Trap Cutting: Its Place in the Future of Flash Memory

imec has demonstrated that it is feasible to introduce air gaps in the inter-gate dielectric layer. However, currently, these cavities in the storage cells only extend up to the blocking oxide layer. What if we could drill deeper into the storage cell and introduce air gaps into the blocking oxide and charge-trap layer regions?

We tested this method in simulations, and the results show that this charge-trap layer separation (or charge-trap cutting) can increase the storage window of the storage cell (Figure 5). Additionally, charge-trap cutting can prevent charge trapped in the storage cell from migrating along the SiN layer in the vertical direction of the oxide/word-line stack.

5. The difference between a continuous gate stack (left) and a gate stack with charge-trap layer cutting and air-gap integration (right).

Data is stored in flash cells by programming the threshold voltage to different levels. To store one bit of data, the cell needs two levels: for example, 0 V and 1 V. To store two bits of data, the cell needs four levels: for example, 0 V, 0.5 V, 1 V, and 1.5 V. As the number of bits increases, the number of required voltage levels also increases.

To fit more levels into a cell, it is necessary to either increase the total range of the threshold voltage (the storage window) or reduce the interval between adjacent levels (1 V when storing 1 bit, 0.5 V when storing 2 bits in the example above). However, when these voltage levels are too close together, it becomes more difficult to distinguish between them. By increasing the storage window, charge-trap cutting can help each storage cell accommodate more levels and thus store more bits.
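The relationship between bits, levels, and level spacing can be written out as a small sketch. It assumes evenly spaced levels starting at 0 V, as in the example above; the function names and the fixed 1.5 V window used for three and four bits are illustrative assumptions.

```python
def levels_for_bits(bits: int) -> int:
    """A cell storing n bits must distinguish 2**n threshold-voltage levels."""
    return 2 ** bits


def level_spacing(window_v: float, bits: int) -> float:
    """Interval between adjacent levels for evenly spaced levels in the window."""
    return window_v / (levels_for_bits(bits) - 1)


def level_voltages(window_v: float, bits: int) -> list[float]:
    """Evenly spaced threshold levels from 0 V up to the top of the window."""
    step = level_spacing(window_v, bits)
    return [i * step for i in range(levels_for_bits(bits))]


print(level_voltages(1.0, 1))  # the 1-bit example: two levels
print(level_voltages(1.5, 2))  # the 2-bit example: four levels

# With the window held at an assumed 1.5 V, each extra bit shrinks the spacing.
for bits in (2, 3, 4):
    spacing_mv = level_spacing(1.5, bits) * 1000
    print(f"{bits} bits: {levels_for_bits(bits)} levels, {spacing_mv:.0f} mV apart")
```

Since the spacing shrinks with every added bit when the window is fixed, a wider storage window is the more comfortable route to more bits per cell.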

However, integrating charge-trap cutting in 3D NAND flash memory is not easy, as it requires directional etching and deposition on the walls of extremely deep and narrow holes. For this structure, the technology toolkit used for 2D NAND flash memory is no longer applicable. Currently, imec is working with its suppliers to develop new technologies to achieve controllable charge-trap cutting.

Once the charge-trap layer can be interrupted, imec plans to combine it with the air-gap integration scheme to provide a complete and scalable solution to the z-spacing scaling challenge.