1 Overview
SoC designs grow more complex with each generation, and so do their clocking schemes: a single SoC usually contains several clock domains. The dynamic power attributable to the clock network has therefore become an active research topic in recent years. It arises in two ways. (1) The clock network supplies the clock to every sequential cell on the chip, so the clock frequency determines the dynamic power of those cells and of the logic connected to them; turning the clock off eliminates that dynamic power. (2) The clock network itself dissipates a large amount of dynamic power, for two reasons: 1) it is the largest interconnect network on the chip and carries a very large load, made up of the interconnect capacitance and the many delay cells inserted to balance clock skew; 2) it is the net with the highest toggle rate on the chip, and the toggle rate directly determines the dynamic power of the interconnect and of the standard cells it drives.
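The dependence on frequency, load, and toggle rate described above can be made concrete with the standard switching-power relation P = alpha * C * Vdd^2 * f. The following is a minimal sketch in Python; the capacitance, voltage, and frequency values are hypothetical and are not taken from the rsthu1 design.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """Switching power in watts: P = alpha * C * Vdd^2 * f.

    alpha is the average number of charging (0 -> 1) transitions
    per clock cycle, so a clock net has alpha = 1.
    """
    return alpha * c_load * vdd ** 2 * freq


# Hypothetical values for illustration only (not from the rsthu1 design).
C_NET = 10e-12   # 10 pF of switched capacitance
VDD = 1.2        # supply voltage in volts
F_CLK = 200e6    # 200 MHz clock

p_clock = dynamic_power(1.0, C_NET, VDD, F_CLK)      # clock net toggles every cycle
p_data = dynamic_power(0.1, C_NET, VDD, F_CLK)       # data net with 10% activity
p_half = dynamic_power(1.0, C_NET, VDD, F_CLK / 2)   # same clock at half frequency
p_off = dynamic_power(1.0, C_NET, VDD, 0.0)          # clock stopped

print(p_clock, p_data, p_half, p_off)
```

With equal capacitance, the clock net dissipates ten times the power of the data net in this example, halving the frequency halves the power, and stopping the clock removes the switching power entirely.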
To address these two sources of clock-related dynamic power, this paper studies and implements three low-power clock techniques: dynamic clock management, clock gating, and low-power clock tree synthesis.
2 Dynamic Clock Management
The workload of an SoC varies widely. Some applications use every module on the chip, while others need only a few. Some require the chip to run at full speed, while others can run at a very low clock frequency [1]. Dynamic clock management therefore has two parts: switching the clocks of individual modules on and off at run time, and reconfiguring their clock frequencies at run time.
This paper uses rsthu1, an audio and video decoding SoC, as an example of dynamic clock management applied at the system level.
To support dynamic clock management, the system-level design of rsthu1 defines four operating modes, as shown in Table 1, where a solid circle indicates that a module is turned on and a hollow circle indicates that it is turned off.
In normal mode, the high-speed clock HCLK drives the four main modules of the system: Risc0, Risc1, Decoder, and BE (Bit Engine). The system runs at full speed and performs audio and video decoding. In low-speed mode, the low-speed clock VCLK drives the same four modules. The system keeps running and can handle simple applications, and the lower clock frequency reduces its dynamic power. In idle mode, only the operating system is kept running: VCLK drives Risc0 and the clocks of all other modules are turned off, which eliminates the dynamic power of every module except Risc0. In sleep mode, the clocks of all modules are turned off, which eliminates the dynamic power that an inactive system would otherwise consume.
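The four modes can be summarized as a lookup from mode to per-module clock source. The sketch below is an illustrative Python model, not the chip's actual control logic; the names `MODE_CLOCKS` and `active_modules` are hypothetical, while the module and clock names come from the description above.

```python
MODULES = ("Risc0", "Risc1", "Decoder", "BE")

# Clock source per module in each mode; None means the clock is switched off.
MODE_CLOCKS = {
    "normal":    {m: "HCLK" for m in MODULES},
    "low_speed": {m: "VCLK" for m in MODULES},
    "idle":      {"Risc0": "VCLK", "Risc1": None, "Decoder": None, "BE": None},
    "sleep":     {m: None for m in MODULES},
}


def active_modules(mode):
    """Return the modules that still receive a clock in the given mode."""
    return [m for m in MODULES if MODE_CLOCKS[mode][m] is not None]


for mode in ("normal", "low_speed", "idle", "sleep"):
    print(mode, active_modules(mode))
```

Each step from normal to sleep either lowers the clock frequency or removes clocked modules, which is why the dynamic power is expected to fall from one mode to the next.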
Prime Power, the power analysis tool from Synopsys, was used to analyze power at the RTL level from simulation waveforms of the four operating modes. The results are shown in Table 2.
With dynamic clock management applied at the system level, the dynamic power of the system decreases step by step across the normal, low-speed, idle, and sleep modes, and the reduction is clearly visible in the results.
3 Gated Clock
The following statements often appear in RTL code:
always @(posedge CLK)
begin
    // Load new data only when the enable is asserted; otherwise hold.
    if (EN == 1)
        Data_out <= Data_in;
end
If this code is synthesized directly, it produces the circuit shown in Figure 1. The enable signal is placed in front of the register's data input, where a multiplexer selects between the new data and the held value, and the register state is updated by controlling whether new data is accepted. In this structure the register's clock pin keeps toggling even when the state is not updated, which wastes dynamic power.
In the structure of Figure 2, the control signal is instead placed in front of the register's clock pin, and the state update is controlled by deciding whether the register is clocked at all. Unlike the circuit in Figure 1, the clock pin in Figure 2 does not toggle when the state is not updated, which removes that wasted dynamic power. Power is reduced further because a single clock-gating cell replaces multiple MUXes.
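The difference between the two structures can be illustrated with a small cycle-level model that counts the clock edges seen at the register's clock pin. This is a hedged Python sketch rather than a circuit description: the function `run_register` and the traces are hypothetical, and the model ignores the latch that a real clock-gating cell uses to keep the gated clock glitch-free.

```python
def run_register(enable_trace, data_trace, gated):
    """Simulate one register cycle by cycle.

    Returns the register value after each cycle and the number of
    rising clock edges that reached the register's clock pin.
    """
    q = 0
    q_history = []
    clock_edges = 0
    for en, d in zip(enable_trace, data_trace):
        if gated:
            # Gated clock: the pin only sees an edge when EN is high.
            if en:
                clock_edges += 1
                q = d
        else:
            # Enable MUX: the pin toggles every cycle and the MUX
            # feeds the old value back when EN is low.
            clock_edges += 1
            q = d if en else q
        q_history.append(q)
    return q_history, clock_edges


# Hypothetical stimulus: the register is enabled in 2 of 8 cycles.
EN = [1, 0, 0, 1, 0, 0, 0, 0]
DATA = [5, 6, 7, 8, 9, 1, 2, 3]

mux_q, mux_edges = run_register(EN, DATA, gated=False)
gated_q, gated_edges = run_register(EN, DATA, gated=True)
print(mux_q == gated_q, mux_edges, gated_edges)  # prints: True 8 2
```

Both structures produce the same register contents, but in this example the gated version clocks the register in only 2 of the 8 cycles.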
The gated clock unit can be inserted during logic synthesis using the power optimization tool Power Compiler from Synopsys. The advantages are as follows [3]: (1) there is no need to modify the RTL code, as Power Compiler will automatically detect the statements in the RTL code where the gated clock can be inserted; and (2) the gated clock unit will be automatically inserted into the gate-level netlist during logic synthesis.
Power Compiler was used to synthesize rsthu1 with clock gating, and Prime Power was used to analyze the power. The results are shown in Table 3: inserting gated clocks during logic synthesis reduced total power by 34.52%.
4 Low Power Clock Tree Synthesis
Observing how a clock tree grows shows that it grows in two directions, horizontal expansion and vertical extension, as shown in Figure 3, where Arrow1 and Arrow3 are vertical extensions and Arrow2 is a horizontal expansion.
Conventional clock tree synthesis aims to minimize clock skew. It increases vertical extension, reduces horizontal expansion, and inserts more buffers so that the delay of each clock path can be tuned at a finer granularity. This approach achieves smaller clock skew at the expense of a larger clock tree. The resulting clock tree is shown in Figure 4(a).
For low power, a smaller clock tree is preferable. Reducing vertical extension and increasing horizontal expansion shrinks the tree effectively, as shown in Figure 4(b). However, with fewer buffers, the flat clock tree can only tune the delay of each clock path at a coarser granularity than the deep clock tree, which results in larger clock skew. Low-power clock tree synthesis therefore reduces the size of the clock tree at the cost of some additional skew.
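The effect of trading depth for width can be sketched with an idealized balanced buffer tree, in which every buffer drives at most a given number of loads. The Python model below is only an illustration under that assumption: the function `buffer_tree` and the sink count are hypothetical, and a real clock tree synthesis tool also accounts for placement, wire load, and skew targets.

```python
def buffer_tree(num_sinks, max_fanout):
    """Return (levels, buffers) for an idealized balanced buffer tree.

    Each buffer drives at most max_fanout loads. Levels are added
    bottom-up until a single root buffer remains.
    """
    if num_sinks < 1 or max_fanout < 2:
        raise ValueError("need num_sinks >= 1 and max_fanout >= 2")
    levels = 0
    buffers = 0
    loads = num_sinks
    while True:
        loads = -(-loads // max_fanout)  # ceiling division
        buffers += loads
        levels += 1
        if loads == 1:
            break
    return levels, buffers


# Hypothetical clock domain with 4096 register clock pins.
deep = buffer_tree(4096, max_fanout=4)    # deep tree, small fan-out
flat = buffer_tree(4096, max_fanout=16)   # flat tree, large fan-out
print(deep, flat)  # prints: (6, 1365) (3, 273)
```

In this idealized model, raising the fan-out limit from 4 to 16 halves the number of levels and cuts the buffer count from 1365 to 273. Fewer buffers mean less switched capacitance, but also fewer points at which path delays can be adjusted.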
When performing clock tree synthesis, the back-end tool can constrain the clock tree structure through synthesis parameters, as shown in Table 4.
Low-power clock tree synthesis, with the goal of reducing clock tree size, was applied to the fast clock HCLK of rsthu1. The maximum fan-out was increased, and the total path delay and the upper limit on the number of buffers per level were reduced. The results are shown in Table 5: the clock tree size is reduced by 20.21%, while the clock skew increases by only 0.023 ns. The skew penalty for the smaller clock tree is therefore acceptable.
5 Conclusion
Many low-power clock techniques are available today, and they can further reduce the power that the clock network consumes in an SoC design. Broader and deeper exploration of these techniques is left to future work.