The AI era drives innovation and development in memory

Publisher: EEWorld News | Last updated: 2019-08-09 | Source: EEWORLD | Author: Mahendra Pakala, Applied Materials | Keywords: Memory

Driven by the Internet of Things, big data, and artificial intelligence (AI), industries ranging from transportation and healthcare to retail and entertainment will be transformed in what Applied Materials calls the AI computing era.

 

In previous computing eras, mainframes/minicomputers, PCs/servers, and smartphones/tablets all benefited from the advances of Moore's Law, whose 2D scaling enabled simultaneous improvements in performance, power, and area/cost (known as "PPAC").

 

Although applications in the AI era are booming, Moore's Law has slowed; the industry therefore needs breakthroughs beyond 2D scaling to advance PPAC in new ways. Specifically, we need new computing architectures, new materials, new structures (especially area-saving 3D structures), and advanced packaging for chip stacking and heterogeneous designs.

 

Architectural changes in the AI era affect both logic and memory. Machine learning algorithms rely heavily on very large matrix multiplication operations, which is driving a shift from general-purpose logic toward accelerators and their associated memories. AI computing involves two distinct memory tasks: the first is storing the intermediate results of the computation; the second is storing the weights of the trained model.
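The two memory tasks can be seen in even a minimal inference sketch. In the toy NumPy example below (layer sizes are made up for illustration), the weight matrices persist unchanged across inputs, while the intermediate activations are short-lived and rewritten on every call:

```python
import numpy as np

rng = np.random.default_rng(0)

# Memory task 2: trained weights. Written once after training,
# then read on every inference; ideally held in non-volatile memory.
W1 = rng.standard_normal((256, 128))
W2 = rng.standard_normal((128, 10))

def infer(x):
    # Memory task 1: intermediate results. Short-lived working
    # memory, overwritten for each new input.
    h = np.maximum(x @ W1, 0.0)  # matrix multiply + ReLU
    return h @ W2                # second matrix multiply

y = infer(rng.standard_normal((1, 256)))
print(y.shape)  # (1, 10)
```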

 

Performance and power consumption both matter for cloud and edge computing, and memory innovation can help with each. One approach using existing memory technologies is "near memory," in which large amounts of dense working memory are placed physically close to the logic and connected over high-speed interfaces; 3D stacking and through-silicon via (TSV) technologies, for example, are becoming increasingly popular here. A major drawback of SRAM and DRAM as "working memory" in these applications is that they are volatile and require continuous power to retain data (such as weights).

 

To reduce power consumption in the cloud and at the edge, designers are evaluating new memories that combine high performance with non-volatility because they only use power when actively reading or writing. Three approaches are leading the new class of memories: magnetic RAM (MRAM), phase-change RAM (PCRAM), and resistive RAM (ReRAM).

 

Instead of storing electrical charge, these three memories use new materials to create different resistance states, with high and low resistance representing 1 and 0, respectively. MRAM switches resistance by changing the direction of a magnetic layer's magnetization. PCRAM switches a material between amorphous and crystalline arrangements. ReRAM forms and breaks a conductive path through the material. PCRAM and ReRAM both also offer intermediate resistance states, which allows each cell to store multiple bits.
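The idea of encoding bits in resistance states can be illustrated with a toy decoder. This is a sketch, not a device model: the resistance thresholds below are invented values and do not correspond to any real MRAM, PCRAM, or ReRAM part.

```python
def decode_slc(r_ohms, threshold=10_000):
    """Single-level cell: high resistance reads as 1, low as 0.
    MRAM works this way, with two magnetically set resistance states."""
    return 1 if r_ohms >= threshold else 0

def decode_mlc(r_ohms, boundaries=(5_000, 10_000, 20_000)):
    """Multi-level cell: four resistance bands encode two bits,
    as PCRAM/ReRAM intermediate states make possible."""
    level = sum(r_ohms >= b for b in boundaries)  # band index 0..3
    return format(level, "02b")                   # two-bit string

print(decode_slc(25_000))  # 1
print(decode_mlc(12_000))  # '10'
```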

 

Let’s take a look at computing applications in the AI era and how they are driving innovation in our future landscape.

Figure 1. The AI era drives a renaissance of semiconductor innovation

 

IoT edge applications can be divided into low-performance/low-power applications and high-performance/high-power applications.

 

For example, a security camera running AI algorithms is a low-performance/low-power application; such algorithms are well suited to tasks like facial and voice recognition. The design goal is to process as much data as possible at the edge and transmit only the important information to the cloud. Because the sampling frequency is low, the performance requirements are also low. Power consumption, including standby power, is critical, especially for battery-powered devices.
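The edge-processing pattern described above, processing locally and uploading only what matters, can be sketched as follows. Note that `run_local_model`, the label, and the confidence threshold are all hypothetical placeholders, not a real camera API.

```python
def run_local_model(frame):
    # Placeholder for an on-device recognition model (hypothetical).
    return {"label": "person", "confidence": 0.93}

def process_frame(frame, upload, threshold=0.9):
    """Run inference at the edge; transmit only high-confidence results."""
    result = run_local_model(frame)
    if result["confidence"] >= threshold:
        upload(result)   # send only the important information to the cloud
        return True
    return False         # discard locally; nothing is transmitted

sent = []
process_frame(frame=None, upload=sent.append)
print(len(sent))  # 1
```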

 

Currently, the industry uses SRAM in edge devices. SRAM is not an ideal choice because each memory cell requires up to six transistors, and leakage power can be high. For storing weights, SRAM is not energy efficient, especially in low-frequency designs. As an alternative, MRAM promises a several-fold increase in density, enabling higher storage capacity or a smaller chip. Another key feature of MRAM is that it is designed to be built into the back-end interconnect layers of embedded system-on-chip (SoC) products. MRAM can store the SoC's operating system and applications without requiring embedded flash chips, thereby reducing the system's total chip count and cost.

 

High-performance "near-edge" applications, such as defect detection and medical screening, demand still more performance. A variant of MRAM, spin-orbit torque MRAM (SOT-MRAM), has been shown to potentially outperform spin-transfer torque MRAM (STT-MRAM) in both speed and power consumption.

 

Cloud computing requires the highest possible computing performance, and training requires moving large amounts of data close to the machine learning accelerator. Accordingly, the accelerator needs a large on-chip SRAM cache supplemented by large off-chip DRAM arrays, both of which require continuous power. Power consumption matters greatly to cloud service providers, because data in the AI era will grow exponentially while grid power is limited and costly. PCRAM offers lower power consumption and cost than DRAM, and higher performance than solid-state and mechanical hard drives, making it a preferred solution for cloud computing architectures.

 

In addition to the broad prospects for the "binary" edge, near-edge, and cloud applications above, research on in-memory computing is also deepening. One can envision performing machine learning's frequent matrix multiplications directly in memory arrays. Designers are exploring pseudo-crosspoint architectures in which weights are stored on individual memory nodes. PCRAM, ReRAM, and even ferroelectric field-effect transistors (FeFETs) are strong candidates because they all have the potential to store multiple bits per cell; at present, ReRAM looks to be the most suitable memory for such applications. Matrix multiplications can be completed within the array using Ohm's law and Kirchhoff's law, without moving weights on and off the chip. Multi-level cell architectures enable a new level of memory density, supporting the design and use of larger models. Comprehensive development and engineering of new materials are required to make these new analog memories a reality, and Applied Materials is actively exploring some of the most representative solutions.
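The crosspoint idea can be simulated numerically. In the sketch below (all values illustrative, not a device model), weights are stored as cell conductances G; driving the rows with input voltages V yields column currents I = V @ G, because each cell contributes current V * G (Ohm's law) and the per-column currents sum on the bit line (Kirchhoff's current law). Quantizing to a few conductance levels mimics multi-level cells.

```python
import numpy as np

rng = np.random.default_rng(1)

W = rng.uniform(0.0, 1.0, size=(4, 3))         # ideal trained weights
levels = 4                                     # multi-level cells: 4 conductance states
G = np.round(W * (levels - 1)) / (levels - 1)  # weights quantized onto the array

V = rng.uniform(0.0, 0.5, size=(2, 4))         # input voltages applied to the rows
I = V @ G                                      # dot products read out as column currents

# Quantization error relative to the ideal (unquantized) weights
err = np.abs(I - V @ W).max()
print(I.shape)
```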

 

As the exponential growth of Moore's Law slows, the AI era will bring exponential growth in data. This pressure is already driving innovation in architectures, materials, 3D structures, and advanced packaging for chip stacking and heterogeneous integration. Memory is becoming ever more closely tied to the AI computing engine, and ultimately memory may become the AI computing engine itself. As these innovations emerge, we will see significant improvements in performance, power consumption, and density (area/cost); as the new types of memory are progressively optimized, the needs of edge, near-edge, and cloud applications will be met. We need a complete renaissance in hardware to unleash the full potential of the AI era.

