Impact of transients on AI accelerator card power supply

Publisher: EE小广播 | Latest update: 2023-10-26 | Source: EEWORLD | Author: Hamed Sanogo, End Market Specialist | Keywords: AI

Summary


Graphics processing units (GPUs), tensor processing units (TPUs), and other types of application-specific integrated circuits (ASICs) enable high-performance computing by providing parallel processing capabilities to meet the needs of accelerating artificial intelligence (AI) training and inference workloads.


AI requires a great deal of computing power, especially for training and inference. This demand keeps pushing power distribution networks to unprecedented levels. As these high-density workloads grow more complex, their higher transient demands require every part of the power distribution network to operate efficiently. The stringent power requirements of AI accelerator cards also affect system performance. This article discusses the power distribution network requirements of AI accelerator cards, analyzes the impact of transients, and introduces ADI's multiphase power supply solutions for these needs.


Introduction


AI technology has completely changed the computing architecture to replicate neural networks that mimic the human brain. AI may seem to be widespread, but in fact, the technology that drives AI is still evolving. Processor accelerator ICs specifically used for AI computing include GPUs, field programmable gate arrays (FPGAs), TPUs, and other types of ASICs. This article refers to them collectively as xPUs.


As AI deployment advances rapidly, data centers will continue to purchase AI accelerator cards in bulk. According to a Gartner report, AI chip revenue totaled more than $34 billion in 2021 and is expected to grow to $86 billion by 2026 [1]. xPUs use a massively parallel computing scheme, which delivers a large leap in AI performance over ordinary CPUs. With a large number of small cores, an xPU is well suited to AI workloads such as neural network training and AI inference. However, xPUs usually consume a great deal of power for AI calculations and data movement. In short, xPUs are very power-hungry ICs. Their strict power requirements pose new challenges for AI accelerator cards and also affect system performance. This article analyzes the power delivery network requirements of AI accelerator cards and introduces the multiphase power supply solution proposed by Analog Devices to meet these stringent requirements.


Power supply challenges brought by AI


AI is many things, but energy efficient is not one of them. AI demands extremely high computing power, especially when processing workloads such as deep learning and inference. At the system level, AI accelerators play a key role in delivering near-instant results, which is what makes them valuable. All xPUs have multiple high-end cores that are made up of billions of transistors and draw hundreds of amps of current. The core voltage (VCORE) of these xPUs has dropped below 1.0 V. Figure 1 shows a general block diagram of an AI accelerator card. This article focuses on the multiphase controller and the corresponding power stage IC proposed for such systems.


Figure 1. General AI accelerator card block diagram


The peak current density required by AI accelerator cards is too much for any motherboard to handle. The highly dynamic workload and the extremely high current transients produce very high di/dt and voltage spikes lasting several microseconds, which can damage the xPU. AI workloads also run for a long time on average, so the decoupling capacitors cannot always supply the energy needed to meet the instantaneous demand. A later section of this article introduces the multiphase point-of-load (PoL) solution proposed by Analog Devices, which is designed to suppress the transients of typical AI accelerators and avoid stressing the entire power distribution network. But first, let's discuss the power design challenges brought by AI.
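To put rough numbers on the di/dt and decoupling argument, here is a minimal Python sketch. It uses only the textbook relations V = L·di/dt and ΔV = I·t/C. Every component value (parasitic inductance, step size, capacitance) is an assumption chosen for illustration and is not taken from any ADI design.

```python
def inductive_spike_v(parasitic_l_h, delta_i_a, rise_time_s):
    """Voltage across a parasitic inductance during a current step: V = L * di/dt."""
    return parasitic_l_h * delta_i_a / rise_time_s


def capacitor_droop_v(delta_i_a, duration_s, capacitance_f):
    """Rail droop if the decoupling capacitors alone supply the step: dV = I * t / C."""
    return delta_i_a * duration_s / capacitance_f


# Assumed values: 10 pH of loop inductance, a 500 A step in 1 us,
# and 10 mF of total decoupling holding the rail for 2 us.
spike = inductive_spike_v(10e-12, 500.0, 1e-6)
droop = capacitor_droop_v(500.0, 2e-6, 10e-3)

print(f"inductive spike: {spike * 1e3:.1f} mV")   # 5.0 mV
print(f"capacitor droop: {droop * 1e3:.1f} mV")   # 100.0 mV
```

Under these assumed numbers the capacitors alone would let a sub-1 V rail sag by about 100 mV in 2 µs, which is why the regulator loop itself has to respond quickly.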


AI brings new power supply design challenges


Currently, AI power demands far exceed the capabilities of traditional power delivery networks. The requirements for an xPU voltage regulator (VR) are very different from those for standard PoL regulators. The industry has found that some applications require more than 1000 A to be supplied to the xPU at less than 1 V. The supply must be very stable and produce very little noise while suppressing any voltage transient that could cause false triggering inside the xPU. To cope with these staggering current demands, a high-performance AI accelerator VR PoL design must meet certain key requirements.


Voltage spike and transient management


One of the key requirements for AI accelerator cards is that the VR architecture provide excellent transient voltage management. Delivering kilowatts of power to any system is always a major challenge. The output voltage (including tolerance, ripple, and load transient sags and peaks) must always stay above the xPU minimum voltage to avoid system hangs, and must always stay below the xPU maximum voltage to avoid damaging the xPU. The transient power peaks of the accelerator card can reach twice the thermal design power (TDP) target or more.
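The voltage-window requirement above can be sketched as a simple worst-case budget. This is an illustrative Python sketch, not ADI's method: the `RailBudget` class, its field names, and all numeric values are assumptions, and the error terms are summed linearly as a conservative simplification.

```python
from dataclasses import dataclass


@dataclass
class RailBudget:
    """Hypothetical worst-case error budget for an xPU core rail."""
    v_nominal: float       # regulator setpoint, volts
    dc_tolerance: float    # setpoint accuracy as a fraction, e.g. 0.005 = 0.5 %
    ripple_pp: float       # peak-to-peak ripple, volts
    transient_sag: float   # worst undershoot on a load step, volts
    transient_peak: float  # worst overshoot on a load release, volts

    def worst_case_min(self) -> float:
        return (self.v_nominal * (1 - self.dc_tolerance)
                - self.ripple_pp / 2 - self.transient_sag)

    def worst_case_max(self) -> float:
        return (self.v_nominal * (1 + self.dc_tolerance)
                + self.ripple_pp / 2 + self.transient_peak)


def within_window(budget: RailBudget, v_min: float, v_max: float) -> bool:
    """True if the rail stays strictly inside the xPU's allowed voltage window."""
    return v_min < budget.worst_case_min() and budget.worst_case_max() < v_max


# Assumed example: 0.80 V rail, 0.5 % accuracy, 10 mV ripple, 30 mV sag and peak.
rail = RailBudget(0.80, 0.005, 0.010, 0.030, 0.030)
print(within_window(rail, v_min=0.75, v_max=0.85))  # True
```

With the same assumed window, raising `transient_sag` to 50 mV pushes the worst-case minimum below 0.75 V and the check fails, which is the system-hang condition the text describes.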


What matters here is that the PoL loop bandwidth must be high enough to handle the fast transients encountered. The higher the bandwidth, the faster the loop response and the smaller the voltage deviation. One of the more straightforward ways to achieve a fast transient power rail is to choose a regulator with fast transient performance. The ADI AI VCORE series of ICs feature very low frequency output noise, fast transient response, and high efficiency. In addition, the ADI AI power chipset supports load lines, which help power designers manage the transients and spikes caused by AI workloads.
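A load line can be sketched in a few lines of Python. The output voltage is deliberately lowered in proportion to load current, V_out = V_set − R_LL·I_out, so the rail sits high at light load (leaving room for a sag) and low at heavy load (leaving room for an overshoot). The setpoint and the 0.1 mΩ load-line resistance below are assumed values for illustration, not MAX16602 settings.

```python
def load_line_voltage(v_set, r_ll_ohm, i_out_a):
    """Regulated output with a load line: V_out = V_set - R_LL * I_out."""
    return v_set - r_ll_ohm * i_out_a


V_SET = 0.85      # assumed zero-load setpoint, volts
R_LL = 0.1e-3     # assumed load-line resistance, ohms

v_light = load_line_voltage(V_SET, R_LL, 0.0)
v_heavy = load_line_voltage(V_SET, R_LL, 500.0)

print(f"light load: {v_light:.3f} V")   # 0.850 V
print(f"heavy load: {v_heavy:.3f} V")   # 0.800 V
```

In this sketch a 500 A load step moves the settled voltage by 50 mV in the same direction as the transient sag, so the loop no longer has to pull the rail all the way back to the original setpoint.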


I²R Losses and Thermal Management in Long Power Path Routes


As xPU current keeps rising, the density of the PoL power delivery solution has become a critical factor. It is now extremely difficult to deliver power reliably to each part of the xPU without the dissipated heat affecting the chip's reliability and causing thermal runaway. In other words, thermal management is one of the major challenges in designing such high-power supplies. The traditional power delivery method places the regulator on one side of the xPU and transfers power laterally to the processor. Even a very small trace resistance can cause unacceptable voltage (IR) drops and I²R power losses. The voltage drop across the resistance of the PCB power plane increases in proportion to the xPU current, while the power dissipated increases with its square. This means that a few centimeters of PCB power traces between the VR and the BGA pins generate large losses. Such losses in the PCB copper power plane have become a dominant factor in the efficiency and performance of the regulator design. Compared with the traditional three-chip (discrete) power delivery solution, which requires a large number of high-current traces, a single-chip power stage IC with integrated current and temperature sensing blocks can greatly reduce the number of traces on the PCB.
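The scaling described above follows directly from Ohm's law, as this short Python sketch shows. The 100 µΩ plane resistance is an assumed figure for illustration and not a measured value for any board.

```python
def ir_drop_v(current_a, resistance_ohm):
    """Voltage lost across the power path: V = I * R."""
    return current_a * resistance_ohm


def i2r_loss_w(current_a, resistance_ohm):
    """Heat dissipated in the power path: P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


R_PLANE = 100e-6  # assumed PCB power-path resistance, ohms

for current in (500.0, 1000.0):
    drop = ir_drop_v(current, R_PLANE)
    loss = i2r_loss_w(current, R_PLANE)
    print(f"{current:6.0f} A: drop = {drop * 1e3:5.1f} mV, loss = {loss:5.1f} W")
```

Under this assumption, doubling the current from 500 A to 1000 A doubles the drop (50 mV to 100 mV) but quadruples the dissipated power (25 W to 100 W). On a rail below 1 V, a 100 mV drop is more than a tenth of the supply.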


ADI value proposition: MAX16602 + MAX20790 + coupled inductor


The accuracy requirements for AI regulators have become more stringent. Efficiency and size are top priorities, and performance and power consumption are also under scrutiny. As the previous sections show, solving AI accelerator card VR design problems has become a difficult task. Designers are well aware that large steps in load current cannot be delivered without effectively handling the unwanted transient effects. Addressing these transient effects also requires some form of high-precision dynamic voltage positioning or load line scheme. Analog Devices has invested heavily in the AI market and provides a complete solution for 48 V and 12 V systems. This section introduces the ADI AI multiphase power chipset, namely the MAX16602 multiphase controller and the MAX20790 power stage, as well as our patented coupled inductor (CL) technology, to help solve these AI PoL design challenges. Figure 2 shows a simplified block diagram of the MAX16602, MAX20790, and CL connections in the 8-phase MAX16602CL8_EV design. This relatively simple design delivers approximately 88 A peak per phase. Internal compensation and advanced control algorithms, together with the current sensing circuitry integrated in the power stage and the coupled inductors, make this a small solution with excellent efficiency.


Figure 2. An 8-phase VR design using ADI's highly integrated power chipset helps achieve high-density design while reducing external connections.
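As a rough sizing aid, the per-phase figure quoted above can be turned into a total peak current and a phase count. This Python sketch assumes ideal current sharing and ignores thermal derating and any controller limit on the number of phases. The function names are hypothetical, and the 1000 A target is simply the order of magnitude mentioned earlier in the article.

```python
import math

PEAK_A_PER_PHASE = 88.0  # approximate per-phase peak quoted for the 8-phase design


def total_peak_current_a(phases, peak_per_phase_a=PEAK_A_PER_PHASE):
    """Total peak current assuming every phase shares the load equally."""
    return phases * peak_per_phase_a


def phases_needed(target_peak_a, peak_per_phase_a=PEAK_A_PER_PHASE):
    """Smallest whole number of phases that covers the target peak current."""
    return math.ceil(target_peak_a / peak_per_phase_a)


print(total_peak_current_a(8))   # 704.0
print(phases_needed(1000.0))     # 12
```

A real design would add margin on top of this count for thermal limits, efficiency targets, and current-sharing error.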


Single-chip intelligent power stage IC with higher integration density


The MAX20790 is a feature-rich smart power stage IC designed to work with the MAX16602 (and several other ADI controllers in this product family) to implement high-density multiphase regulators. This monolithic integration virtually eliminates the parasitic resistance and inductance between the FETs and drivers that are common in discrete designs, allowing high switching speeds with significantly lower power losses than traditional solutions. If a switch node (VX) fault is detected, the power stage shuts down immediately and reports the fault ID to the controller. The smart power stage IC also has an on-chip current sensor. This current sensing block is clearly superior to sensing across the inductor's DC resistance (DCR): DCR sensing is known to be inaccurate and needs temperature compensation to make the current measurement reliable.
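The need for temperature compensation in DCR sensing can be illustrated with the temperature coefficient of copper, roughly 0.39 % per °C near room temperature. This Python sketch is a generic illustration of that drift. The 25 °C reference point, the 0.2 mΩ DCR, and the linear model are assumptions, and nothing here describes the MAX20790's internal sensor.

```python
COPPER_TEMPCO_PER_C = 0.0039  # approximate tempco of copper near room temperature


def dcr_at_temperature(dcr_ref_ohm, temp_c, ref_temp_c=25.0):
    """Linear model of inductor winding resistance versus temperature."""
    return dcr_ref_ohm * (1 + COPPER_TEMPCO_PER_C * (temp_c - ref_temp_c))


def uncompensated_reading_a(true_current_a, dcr_ref_ohm, temp_c):
    """Current reported if the sense voltage is divided by the 25 C DCR value."""
    sense_voltage = true_current_a * dcr_at_temperature(dcr_ref_ohm, temp_c)
    return sense_voltage / dcr_ref_ohm


# Assumed example: 0.2 mOhm DCR, 50 A true phase current, winding at 100 C.
reading = uncompensated_reading_a(50.0, 0.2e-3, 100.0)
error_fraction = reading / 50.0 - 1

print(f"reported current: {reading:.2f} A")      # 64.62 A
print(f"reading error: {error_fraction:.1%}")    # 29.2%
```

Under these assumptions an uncompensated DCR measurement over-reads by about 29 % at 100 °C, which would distort current sharing, load line accuracy, and overcurrent thresholds.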
