Research on SoC low power consumption technology based on adaptive DVFS

Publisher: 温柔微笑  Latest update time: 2014-10-09  Source: eefocus
In today's embedded consumer electronics, media processing is increasingly integrated with wireless communication and 3D gaming. These richer functions demand more processing capability from the chip, and in complex mobile application environments power consumption rises significantly. Mobile phone users, for example, want longer standby, music playback, and MPEG4 video playback times. Reducing the power consumption of embedded chips is therefore urgent.

1 Analysis of low power consumption technology

Table 1 analyzes low-power technologies across process nodes. As Table 1 shows, as the channel length (feature size) shrinks, both the dynamic and the static power consumption per unit area increase.

The power consumption of the chip can be described as:

$$P = \alpha C_{eff} V_{dd}^{2} f_{clock} + I_{leak} V_{dd} \qquad (1)$$

In formula (1), αCeffVdd²fclock is the dynamic power term, where α is the flip (switching) rate at the current frequency, Ceff is the node load capacitance, Vdd is the operating voltage, and fclock is the operating frequency. IleakVdd is the static power term, where Ileak is the leakage current. Formula (1) shows which parameters must be reduced to lower chip power consumption.
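As a quick numerical check, the power model can be written as a small function. This is a minimal sketch that reads formula (1) as P = α·Ceff·Vdd²·fclock + Ileak·Vdd, which is what the term definitions suggest; the function name and all input values are hypothetical and chosen only for illustration.

```python
def chip_power(alpha, c_eff, v_dd, f_clock, i_leak):
    """Return (dynamic, static, total) power in watts.

    dynamic = alpha * c_eff * v_dd^2 * f_clock
    static  = i_leak * v_dd
    """
    dynamic = alpha * c_eff * v_dd ** 2 * f_clock
    static = i_leak * v_dd
    return dynamic, static, dynamic + static


# Hypothetical values: 20% flip rate, 1 nF effective capacitance,
# 1.2 V supply, 200 MHz clock, 5 mA leakage current.
dyn, stat, total = chip_power(alpha=0.2, c_eff=1e-9, v_dd=1.2,
                              f_clock=200e6, i_leak=5e-3)
print(f"dynamic={dyn:.4f} W, static={stat:.4f} W, total={total:.4f} W")
```

Because Vdd enters the dynamic term squared, it is the most effective single parameter to lower, which is the motivation for the voltage scaling discussed later.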

1.1 Means of reducing dynamic power consumption

1.1.1 Reducing α


There are two ways to reduce α. One is to let tools optimize the logic structure; the other is to use encodings that toggle fewer bits, such as flip codes. If every transition were useful and optimal, α·fclock could be treated as a constant, but in practice this is not the case: the logic driven by each clock often contains redundancy, and a given high-level task may not be partitioned well between software and hardware. As for fclock, a module that is not in use can simply have its clock gated. There are three ways to gate the clock:

(1) Gate at the clock generation end, under software control. This requires the gating function to be designed in at the front end, including both positive and inverted clock gating with a symmetrical structure. In practice, the cell library provides a standard clock-gating cell, which simplifies the front-end design.

(2) Make the gating decision in hardware inside the module. Consider a memory attached to the AHB bus as an AHB slave. Because software accesses the module frequently, software-controlled gating would have to switch on and off constantly and would disrupt operation. If instead the module turns on its internal clock in the cycle after the AHB HSEL signal goes high, the power spent toggling the clock while the module is not selected is saved. This matters especially for memories, whose power consumption differs greatly between a toggling and a non-toggling clock.

(3) Let the synthesis tool insert the gating automatically, close to the registers, with no gating design at the front end.
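Option (2) can be illustrated with a cycle-level count of how often the module's internal clock actually runs. This is a behavioural sketch only, assuming the gate opens in the cycle after HSEL is sampled high and closes otherwise; the trace and the function name are hypothetical, and a real design would use a library clock-gating cell in RTL.

```python
def gated_clock_cycles(hsel_trace):
    """Count cycles in which the module's internal clock is running.

    Assumption: the clock is enabled in the cycle after HSEL is sampled
    high and is gated in every other cycle.
    """
    enabled_cycles = 0
    enable = False  # gate state for the current cycle
    for hsel in hsel_trace:
        if enable:
            enabled_cycles += 1
        enable = bool(hsel)  # latch HSEL to open the gate for the next cycle
    return enabled_cycles


# Hypothetical HSEL activity over ten bus cycles.
trace = [0, 0, 1, 1, 0, 0, 0, 1, 0, 0]
active = gated_clock_cycles(trace)
print(f"clock runs in {active} of {len(trace)} cycles")
```

With this trace the internal clock runs in 3 of 10 cycles, so the memory sees 70% fewer clock edges than it would with a free-running clock.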

In theory, simply lowering the frequency does not reduce the energy consumed: the workload is fixed, so a lower frequency only lengthens the run time. However, the clock tree accounts for almost 30% of chip power consumption, so lowering the frequency appropriately does reduce the power consumed by the clock tree.
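The fixed-workload argument can be made explicit in terms of energy. The short derivation below assumes a task of N clock cycles (N is a symbol introduced here for illustration) and uses the dynamic term α·Ceff·Vdd²·fclock and the static term Ileak·Vdd of the power model:

```latex
\begin{aligned}
t &= \frac{N}{f_{clock}} \\
E_{dyn} &= \alpha C_{eff} V_{dd}^{2} f_{clock} \cdot t = \alpha C_{eff} V_{dd}^{2} N \\
E_{static} &= I_{leak} V_{dd} \cdot t = \frac{I_{leak} V_{dd} N}{f_{clock}}
\end{aligned}
```

Under these assumptions the dynamic energy does not depend on fclock at all, and the static energy actually grows as the frequency falls. Frequency scaling therefore pays off mainly when it is combined with a lower Vdd.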

1.1.2 Reducing Ceff

How far Ceff can be reduced depends largely on the process chosen. A suitable process also makes it easier to lower Vdd, which reduces dynamic power quadratically. However, cost, reliability, and business considerations usually mean that only one process can be chosen, such as a 130 nm process; the voltage can then be varied through DVFS. DVFS rests on two points:

(1) The library for a given process works correctly over a certain voltage range.

(2) Modules or systems need different operating frequencies for different tasks, so the benefit of DVFS can be calculated. Suppose a system performs either MP3 or MP4 decoding, and that MP3 decoding requires only 100 MHz while MP4 decoding requires 200 MHz. Suppose STA shows that the system can run at 100 MHz at 1.1 V and at 200 MHz at 1.3 V. DVFS can then switch between these two operating points. Assuming the flip rate and capacitance are unchanged, the dynamic power required in MP3 mode is about 64% lower than in MP4 mode (1.1² × 100 versus 1.3² × 200). These values are hypothetical, of course, and real cases are less ideal.
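The 64% figure in the MP3/MP4 example can be reproduced directly. The sketch below assumes, as the text does, that the flip rate and capacitance are identical in both modes, so dynamic power is proportional to Vdd² × fclock; the function name is hypothetical.

```python
def relative_dynamic_power(v_dd, f_clock_mhz):
    """Dynamic power up to the constant factor alpha * Ceff."""
    return v_dd ** 2 * f_clock_mhz


p_mp3 = relative_dynamic_power(1.1, 100)   # MP3 decoding: 1.1 V, 100 MHz
p_mp4 = relative_dynamic_power(1.3, 200)   # MP4 decoding: 1.3 V, 200 MHz

# Saving of the MP3 operating point relative to the MP4 operating point.
saving_dvfs = 1 - p_mp3 / p_mp4

# For comparison: frequency scaling only, keeping 1.3 V at 100 MHz.
saving_freq_only = 1 - relative_dynamic_power(1.3, 100) / p_mp4

print(f"DVFS saving: {saving_dvfs:.1%}")
print(f"Frequency-only saving: {saving_freq_only:.1%}")
```

Halving the frequency alone gives a 50% power reduction; lowering the voltage from 1.3 V to 1.1 V as well brings the reduction to about 64%.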

1.2 Means of reducing static power consumption

Static power consumption can be reduced with the Multi-Vdd and Multi-Vth techniques, which are not described in detail here.

2 DVFS system

If DVFS is driven by the scheduling needs of the operating system running on the CPU, the voltage changes only when the frequency itself has to change. This can be regarded as open-loop DVFS. For example, the OEMIdle routine in Windows Mobile offers a way to adjust the CPU frequency and voltage according to CPU utilization. Open-loop adjustment, however, requires sufficient margin and needs support from software, especially the operating system, so it is not transparent to software.
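An open-loop policy of this kind might look like the following sketch. It is not the Windows Mobile implementation; the operating-point table reuses the hypothetical MP3/MP4 values from section 1.1.2, and the 20% margin, the function name, and the parameters are assumptions made for illustration.

```python
# Hypothetical operating points (frequency in MHz, voltage in V), slowest first.
# A real table would come from static timing analysis of the chosen process.
OPERATING_POINTS = [(100, 1.1), (200, 1.3)]


def select_operating_point(utilization, current_mhz, margin=0.2):
    """Pick the slowest point whose frequency covers the demand plus margin.

    utilization: CPU busy fraction (0..1) measured at current_mhz.
    margin: open-loop safety headroom (an assumed 20% here).
    """
    required_mhz = utilization * current_mhz * (1 + margin)
    for mhz, volts in OPERATING_POINTS:
        if mhz >= required_mhz:
            return mhz, volts
    return OPERATING_POINTS[-1]  # saturate at the fastest point


light = select_operating_point(utilization=0.3, current_mhz=200)
heavy = select_operating_point(utilization=0.9, current_mhz=200)
print(light, heavy)
```

The margin is the cost of running open loop: the policy has no feedback from the hardware, so it must over-provision, and the operating system has to measure utilization and call the policy itself.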

For a closed-loop system, a performance monitor is required to monitor performance and directly adjust voltage and frequency according to performance changes. Figure 1 shows a simple adaptive DVFS system.


 

In this system, the CPU sits in a variable-voltage power domain called arm_subsys. The other modules are in a separate power domain called peri_subsys, which includes the external memory interface (EMI), the media coprocessor (MCP), the LCD controller (LCDC), and the modules related to voltage control. The Performance Monitor (PM) module actively monitors chip performance. The Power Controller (PC) module receives the performance description from the PM, calculates the control parameters, and passes them to the Power Supply (PS) module, which provides the variable voltage Vdd_arm. A level shifter sits between arm_subsys and peri_subsys.

The ARM core can configure the PM module over the bus. The PM monitors performance by measuring the current drawn in the variable-voltage domain. For workloads with high MIPS requirements, the CPU performs intensive operations, its idle time shrinks, and the current demand rises; for workloads with low MIPS requirements, the CPU is idle more often and the current demand falls.

The core of this design is how the PM module adaptively predicts the current demand with a suitable algorithm while keeping the prediction response time and the extra power consumption small, so that the voltage is adjusted both in time and by the right amount. For the adaptive algorithm, the simple forward linear prediction shown in Figure 2 can be used.
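Since Figure 2 is not reproduced here, the sketch below shows one common way to build an adaptive forward linear predictor: a one-step-ahead predictor whose weights are adapted with normalized LMS. The choice of NLMS, the filter order, the step size, and the class name are all assumptions and are not taken from the article.

```python
class ForwardLinearPredictor:
    """One-step-ahead linear predictor with NLMS weight adaptation (assumed)."""

    def __init__(self, order=4, step=0.5, eps=1e-6):
        self.weights = [0.0] * order
        self.history = [0.0] * order  # most recent sample first
        self.step = step              # NLMS step size, 0 < step < 2
        self.eps = eps                # avoids division by zero

    def predict(self):
        """Predict the next sample from the stored history."""
        return sum(w * x for w, x in zip(self.weights, self.history))

    def update(self, sample):
        """Adapt to the newly measured sample; return the prediction error."""
        error = sample - self.predict()
        energy = sum(x * x for x in self.history) + self.eps
        gain = self.step * error / energy
        self.weights = [w + gain * x
                        for w, x in zip(self.weights, self.history)]
        self.history = [sample] + self.history[:-1]
        return error


# Hypothetical use: feed a steady normalized load of 0.6 per interval.
predictor = ForwardLinearPredictor()
for _ in range(200):
    predictor.update(0.6)
next_load = predictor.predict()
print(f"predicted next load: {next_load:.3f}")
```

In a closed-loop system the predicted load would then be mapped to the lowest voltage and frequency pair that can serve it. A short filter keeps the response time and the predictor's own power overhead small, which is the trade-off the text describes.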

3 Simulation Experiment and Results

Figure 3 shows the system model. The test is based on a given benchmark program that was run on the development board in advance. The power consumption parameters measured there are converted, according to CPU load, into a normalized program of nop and mac instructions, and these two instruction types are distributed through the test vector. The CPU behavior model (CPU BM) executes this program. To keep the design simple, the model implements only a 2-stage pipeline (fetch and execute): a nop does nothing in the execute stage, and a mac operates on fixed data in the execute stage. The CPU BM is written in Verilog and accesses memory through an AHB bus. The MEM module has an AHB interface, stores the compiled binary instructions, and runs at a fixed frequency. The PM Model monitors the flip rate of the CPU BM. The flip rate of each stage is fed into the adaptive filter, which calculates the required regulation voltage and passes it to the PS Model; the flip rate is also output to the PC Model.

The PC Model takes the flip rate, clock, and voltage as inputs and calculates the system power consumption. The PS Model adjusts the voltage and frequency according to the voltage adjustment commands issued by the PM. Because this is an RTL model, the voltage change itself is not visible; the model only reproduces the ordering used in practice. When the voltage goes from low to high, the voltage is adjusted first and then the frequency; when it goes from high to low, the frequency is adjusted first and then the voltage.
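The ordering rule can be captured in a few lines. This sketch reads "vice versa" as "frequency first, then voltage" when stepping down, which keeps the supply high enough for whatever frequency is currently running; the function name and step labels are hypothetical.

```python
def transition_steps(current, target):
    """Return the ordered steps to move between (voltage_V, freq_MHz) points."""
    cur_v, cur_f = current
    tgt_v, tgt_f = target
    steps = []
    if tgt_v > cur_v:
        # Going up: raise the voltage first so the new frequency is safe.
        steps.append(("set_voltage", tgt_v))
        if tgt_f != cur_f:
            steps.append(("set_frequency", tgt_f))
    elif tgt_v < cur_v:
        # Going down: slow the clock first, then drop the voltage.
        if tgt_f != cur_f:
            steps.append(("set_frequency", tgt_f))
        steps.append(("set_voltage", tgt_v))
    elif tgt_f != cur_f:
        # Same voltage: only the frequency changes.
        steps.append(("set_frequency", tgt_f))
    return steps


up = transition_steps((1.1, 100), (1.3, 200))
down = transition_steps((1.3, 200), (1.1, 100))
print(up)
print(down)
```

In both directions the circuit never runs at a frequency that its present voltage cannot support.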

The adaptively selected voltage can be applied as shown in Figure 4. Table 2 gives the relationship between the voltage and the CPU and bus frequencies for the 130 nm process. While the voltage is being adjusted, the clock is paused for several clock cycles. Assuming that the RC parameters of the power supply network remain unchanged, the voltage switching time is taken to be proportional to the voltage difference being switched, as shown in Figure 4.
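The switching overhead implied by this assumption can be estimated as follows. The sketch models the settling time as a constant times the voltage difference; the constant `settle_s_per_volt`, the function name, and the example numbers are hypothetical and do not come from Table 2 or Figure 4.

```python
import math


def clock_pause_cycles(v_from, v_to, f_clock_hz, settle_s_per_volt):
    """Clock cycles to pause while the supply settles.

    settle_s_per_volt is a hypothetical constant standing in for the fixed
    RC behaviour of the supply network: time = constant * |voltage step|.
    """
    settle_time = settle_s_per_volt * abs(v_to - v_from)
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(settle_time * f_clock_hz, 6))


# Hypothetical: 50 us per volt, stepping 1.1 V -> 1.3 V at 100 MHz.
pause = clock_pause_cycles(1.1, 1.3, 100e6, 50e-6)
print(f"pause the clock for {pause} cycles")
```

Because every switch costs paused cycles, switching too often wastes time and energy, which is why the prediction step size studied next matters.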



The step size of the forward prediction is varied from 1 to 50 ms, in line with the tick of the real-time operating system. The experiments give the power consumption for each step size, shown in Figure 5; the additional overhead of each switch is included in the calculation.

As Figure 5 shows, power consumption and efficiency both depend on the adjustment step size, so a well-chosen step size balances the two. When the step size is 25 ms, the power consumption is less than 25% of DVFS, and the efficiency loss is only 1/3. This step size is therefore relatively reasonable when the total load utilization of CPU resources is 30%.

4 Conclusion

This article presents an adaptive dynamic voltage and frequency scaling method and builds a corresponding system model. The model is simulated on a computer, and a balanced set of forward prediction parameters is obtained. The experimental results verify the effectiveness of the adaptive method and provide an effective way to evaluate dynamic voltage and frequency scaling in simulation.