The Internet of Things, big data, and artificial intelligence (AI) are set to transform a wide range of industries, from transportation and healthcare to retail and entertainment. Applied Materials calls this the AI computing era.
In previous computing eras, mainframes/minicomputers, PCs/servers, and smartphones/tablets all benefited from the advances of Moore's Law, in which 2D scaling delivered simultaneous improvements in performance, power, and area/cost (also known as "PPAC").
Applications in the AI era are booming, but Moore's Law has slowed down, so the industry needs breakthroughs beyond 2D scaling to keep improving PPAC. Specifically, we need new computing architectures, new materials, new structures (especially area-saving 3D structures), and advanced packaging for chip stacking and heterogeneous designs.
Architectural changes in the AI era are affecting both logic and memory. Machine learning algorithms make heavy use of matrix multiplication, which is cumbersome to execute on general-purpose logic, and this is driving a shift to accelerators and their memories. AI computing involves two distinct memory tasks: the first is to store the intermediate results of the computation; the second is to store the weights associated with the trained model.
Performance and power consumption are both important for cloud and edge computing, and innovations in memory can help. One approach that uses existing memory technologies is "near memory," in which large amounts of working memory are densely packed, placed physically close to the logic, and connected by high-speed interfaces. For example, 3D stacking and through-silicon vias are becoming increasingly popular. A major drawback of SRAM and DRAM as "working memory" in these applications is that they are volatile and require continuous power to retain data (such as weights).
To reduce power consumption in the cloud and at the edge, designers are evaluating new memories that combine high performance with non-volatility, so that power is needed only during active reads and writes. Three technologies lead this new class of memories: magnetic RAM (MRAM), phase-change RAM (PCRAM), and resistive RAM (ReRAM).
Instead of storing electrical charge, all three memories use new materials to create different states of resistivity, with high and low resistance representing 1 and 0, respectively. MRAM uses changes in magnetic orientation to control resistivity. PCRAM uses a change in the material's atomic arrangement between an amorphous state and a crystalline state. ReRAM creates a conductive path for current in the material. Both PCRAM and ReRAM offer intermediate levels of resistivity, which allows multiple bits to be stored in each cell.
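To make the idea of intermediate resistance levels concrete, here is a minimal Python sketch of how a cell with several distinguishable resistance states could encode more than one bit. The four nominal resistance values, the function names, and the nearest-level read scheme are all assumptions made for illustration; they are not taken from any real PCRAM or ReRAM device.

```python
import math

# Hypothetical nominal resistance levels in ohms, lowest to highest.
# These are illustrative values, not measured device data.
LEVELS_OHMS = [5_000, 20_000, 80_000, 320_000]


def bits_per_cell(num_levels: int) -> int:
    """Whole bits that num_levels distinguishable states can encode (floor of log2)."""
    if num_levels < 2:
        raise ValueError("a cell needs at least two levels to store a bit")
    return num_levels.bit_length() - 1


def read_cell(resistance_ohms: float) -> int:
    """Index of the nominal level closest to the measured resistance.

    Distance is compared on a log scale because the assumed levels are
    spaced by a constant ratio rather than a constant difference.
    """
    return min(
        range(len(LEVELS_OHMS)),
        key=lambda i: abs(math.log(resistance_ohms) - math.log(LEVELS_OHMS[i])),
    )


def decode(resistance_ohms: float) -> str:
    """Bit pattern stored in a cell, given its measured resistance."""
    width = bits_per_cell(len(LEVELS_OHMS))
    return format(read_cell(resistance_ohms), f"0{width}b")


# A reading near the second level decodes to the two-bit pattern "01".
print(bits_per_cell(len(LEVELS_OHMS)), decode(18_500))
```

With four levels each cell holds two bits, and doubling the number of reliably distinguishable levels adds one more bit. In practice the usable number of levels depends on how well the material separates neighboring states.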
Let’s take a look at computing applications in the AI era and learn how they are driving innovation in our future landscape.
Figure 1. The AI era drives the renaissance of semiconductor innovation
IoT edge applications can be divided into low-performance/low-power applications and high-performance/high-power applications.
An example of a low-performance/low-power application is a security camera running AI algorithms for tasks such as facial and voice recognition. The design goal is to process as much data as possible at the edge and transmit only the important information to the cloud. Since the sampling frequency is low, the performance requirements are also low. Power consumption, including standby power consumption, is critical, especially for battery-powered devices.
Currently, the industry uses SRAM in edge devices. SRAM is not an ideal choice because each memory cell requires up to six transistors and leakage power can be high. SRAM is not an energy-efficient way to store weights, especially in low-frequency designs. As an alternative, MRAM promises several times the density, enabling higher storage capacity or a smaller die. Another key feature of MRAM is that it can be built into the back-end interconnect layers of embedded system-on-chip (SoC) products. MRAM can store the SoC's operating system and applications, which removes the need for a separate embedded flash chip and thereby reduces the total chip count and cost of the system.
High-performance "near-edge" applications, such as defect detection and medical screening, require higher performance. A variant of MRAM called spin-orbit torque MRAM (SOT-MRAM) may prove faster and lower in power than spin-transfer torque MRAM (STT-MRAM).
Cloud computing requires the highest possible computing performance, and training requires a large amount of data to be moved close to the machine learning accelerator. The accelerator therefore needs a large on-chip SRAM cache supplemented by a large off-chip DRAM array, both of which require continuous power. Power consumption matters greatly to cloud service providers because data in the AI era will grow exponentially while grid power is limited and costly. PCRAM offers lower power consumption and cost than DRAM and higher performance than solid-state drives and hard disk drives, making it a leading candidate for cloud computing architectures.
Beyond the "binary" edge, near-edge, and cloud applications described above, research on in-memory computing is also deepening. The idea is to perform the frequent matrix multiplication operations of machine learning inside the memory array itself. Designers are exploring pseudo-crosspoint architectures, in which weights are stored at individual memory nodes. PCRAM, ReRAM, and even ferroelectric field-effect transistors (FeFETs) are strong candidates because they all have the potential for multi-level storage per cell. At present, ReRAM looks to be the most suitable memory for such applications. Matrix multiplication can be completed within the array using Ohm's law and Kirchhoff's law, without moving weights on and off the chip. Multi-level cell architectures enable a new level of memory density, supporting the design and use of larger models. Comprehensive development and engineering of new materials is required to make these new analog memories a reality, and Applied Materials is actively exploring some of the most promising candidates.
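The in-array multiplication described above can be sketched in a few lines of Python. This is an idealized model under stated assumptions: each weight is stored as a cell conductance, inputs are applied as row voltages, and wire resistance, device noise, and non-linearity are ignored. The function name `crossbar_mvm` and the example values are hypothetical and chosen only for illustration.

```python
def crossbar_mvm(conductances, voltages):
    """Idealized crossbar matrix-vector multiply.

    conductances[i][j] is the conductance (siemens) of the cell joining
    input row i to output column j; voltages[i] is the voltage on row i.
    Ohm's law gives each cell current as V_i * G_ij, and Kirchhoff's
    current law sums the cell currents that share a column wire.
    """
    if len(conductances) != len(voltages):
        raise ValueError("need one input voltage per crossbar row")
    num_cols = len(conductances[0])
    column_currents = [0.0] * num_cols
    for row, v in zip(conductances, voltages):
        if len(row) != num_cols:
            raise ValueError("all crossbar rows must have the same length")
        for j, g in enumerate(row):
            column_currents[j] += v * g  # Ohm's law, accumulated per column
    return column_currents


# Hypothetical 2x2 array: weights as conductances, inputs as voltages.
G = [[1e-6, 2e-6],
     [3e-6, 4e-6]]
V = [0.1, 0.2]
# Column currents are about 0.7 microamps and 1.0 microamps.
print(crossbar_mvm(G, V))
```

In hardware every column current forms at the same time, so the multiply-accumulate happens where the weights are stored rather than in a separate logic block. The loops here only simulate that summation.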
As the exponential progress of Moore's Law slows, the AI era will usher in exponential growth in data. This pressure is already driving innovation in architecture, materials, 3D structures, and advanced packaging for chip stacking and heterogeneous integration. Memory is moving ever closer to the AI computing engine, and ultimately memory may become the AI computing engine itself. As these innovations emerge, we will see significant improvements in performance, power consumption, and density (area/cost). As new types of memory are gradually optimized, the needs of edge, near-edge, and cloud applications will eventually be met. We need a complete renaissance in hardware to unleash the full potential of the AI era.