How to Optimize Power Consumption in Embedded DSP Applications

Publisher: dadigt. Latest update time: 2011-08-09

Power efficiency can be improved by using hardware and software techniques, and this is more easily achieved using a DSP RTOS with built-in power management APIs.

By Scott Gary, Texas Instruments

Both wireless and wired system designers must pay attention to power efficiency, although their starting points are different.

For mobile devices, longer battery life, longer talk time or longer working time are obvious advantages. Reducing power requirements means using smaller batteries or choosing different battery technologies, which also alleviates the battery heating problem to a certain extent.

For wired systems, better power efficiency lets designers reduce the size of the power supply, the cooling requirements, and fan noise. Less often mentioned is that improving power efficiency also frees up room for components that increase system performance, which matters especially when the design team wants to add more than one processor.

For an embedded DSP design, or any system with stringent power requirements, DSP-specific techniques, the operating system, and its supporting software can all reduce power consumption. DSP and dual-processor designs that go beyond the traditional techniques can achieve further energy savings.

This article discusses traditional and DSP-specific power optimization techniques, beginning with definitions and explanations of the terms and principles used.

Power consumption basics

The total power consumption of a complementary metal oxide semiconductor (CMOS) circuit is the sum of dynamic power consumption and static power consumption [Reference 3]:

$$P_{total} = P_{dynamic} + P_{static}$$

Dynamic power consumption occurs when a gate undergoes a logic state transition. It comprises the switching current required to charge internal nodes and the shoot-through current that flows while the P-channel and N-channel transistors are momentarily on at the same time. Its approximate value can be estimated by the following formula:

$$P_{dynamic} = C_{pd} \times F \times V_{cc}^{2} \times N_{sw}$$

where Cpd is the dynamic capacitance, F is the switching frequency, Vcc is the supply voltage, and Nsw is the number of bits switching.

In addition, the voltage (Vcc) determines the maximum switching frequency (F) under stable operating conditions.

The above relationship contains two important concepts:

  • Dynamic power dissipation is linear with switching frequency and quadratic with supply voltage.
  • The maximum safe switching frequency depends on the supply voltage.

For the purposes of this discussion, a specific frequency and voltage pair is referred to as a "set point."

Obviously, reducing the CPU clock rate will reduce dynamic power proportionally. Since dynamic power has a quadratic relationship with the supply voltage, additional power reductions may be achieved by reducing the voltage without affecting system performance.

However, for a particular set of tasks, reducing the CPU clock rate will also proportionally increase the time to execute that set of tasks, so the application must be carefully analyzed to ensure that its real-time requirements are met.
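To make the scaling concrete, the following sketch evaluates the dynamic-power relationship at two set points. The type and function names are illustrative. The 192 MHz / 1.6 V and 144 MHz / 1.4 V pairs mentioned in the note are taken from the audio example later in this article. Cpd and Nsw cancel in the ratios, so no device-specific values are assumed, and static power is ignored.

```c
#include <stdio.h>

/* A frequency/voltage pair ("set point"). Names are illustrative. */
typedef struct {
    double freq_mhz;
    double volts;
} SetPoint;

/* Dynamic power at set point b relative to set point a, using
 * P = Cpd * F * Vcc^2 * Nsw. Cpd and Nsw cancel in the ratio. */
double dynamic_power_ratio(SetPoint a, SetPoint b)
{
    return (b.freq_mhz / a.freq_mhz) * (b.volts * b.volts) / (a.volts * a.volts);
}

/* Dynamic energy for a fixed amount of work (a fixed cycle count) at b
 * relative to a. Run time grows as 1/F, so F cancels and only the
 * Vcc^2 term remains. */
double dynamic_energy_ratio(SetPoint a, SetPoint b)
{
    return (b.volts * b.volts) / (a.volts * a.volts);
}

/* Print both ratios for a pair of set points. */
void print_scaling(SetPoint a, SetPoint b)
{
    printf("power ratio:  %.3f\n", dynamic_power_ratio(a, b));
    printf("energy ratio: %.3f\n", dynamic_energy_ratio(a, b));
}
```

Under this simplified model, moving from 192 MHz at 1.6 V to 144 MHz at 1.4 V gives a power ratio of about 0.574 and an energy-per-task ratio of about 0.766. Lowering the frequency alone leaves the dynamic energy per task unchanged, which is why the accompanying voltage reduction matters.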

Static power consumption is primarily due to transistor leakage current. Generally speaking, the static power consumption of CMOS circuits is very low and negligible compared to their dynamic power consumption. Embedded applications often "idle" the CPU clock during periods of inactivity to reduce dynamic power consumption, which can significantly reduce overall power consumption.

In future designs, special attention must be paid to static power consumption because new, higher-performance transistors will have significantly higher leakage currents [Reference 13].

Common technologies in embedded systems

Common power management techniques fall into two categories: those fixed by early hardware design decisions, and those applied while the system is running.

Decisions made early in the design are critical to meeting performance and power consumption requirements. The following are the top ten factors that need to be considered in the design, including hardware selection, design strategy, and architecture selection. Most of the factors are basic requirements for embedded systems, while others need to be considered separately. Although the following decisions are made early in the design, some still need to be re-verified throughout the design cycle. They are listed below:

  1. Choose low-power components;
  2. Split voltage and clock domains;
  3. Support voltage and clock scaling;
  4. Enable the hold voltage gating function;
  5. Use interrupts to reduce software polling;
  6. Adopt a hierarchical memory model;
  7. Reduce output load;
  8. Boot with non-critical resources unpowered;
  9. Minimize the number of active PLLs;
  10. Use clock dividers to change frequencies quickly.

Detailed information about the above list is shown in Table 1.

Table 1. Reducing power consumption through early hardware design decisions

After determining the system architecture, the design team needs to turn its attention to the system runtime environment. The 14 techniques listed below are not exhaustive, but most of them should be kept in mind during the design process.

  1. Gate clocks off when they are not needed
  2. Actively turn off unnecessary power consumers during boot
  3. Gate power to subsystems only when needed
  4. Activate peripheral low-power modes
  5. Take advantage of peripheral activity detectors
  6. Use auto-refresh modes
  7. Determine the minimum required frequency and voltage by benchmarking the application
  8. Adjust CPU frequency and voltage based on overall activity
  9. Dynamically schedule CPU frequency and voltage to match the predicted workload
  10. Optimize code execution speed
  11. Use low-power code sequences and data models
  12. Use code overlays to reduce the need for high-speed memory
  13. Enter a reduced-performance mode when the power source changes
  14. Balance accuracy against power consumption

A sophisticated design team must be at least conceptually familiar with the above embedded system application design elements (one of which is related to DSP circuits). Details on the above list are shown in Table 2.

Table 2. Common runtime techniques for reducing power consumption

Implementing the practices and strategies described in Tables 1 and 2 is not easy. Any design that reduces power consumption may have a negative impact on performance or cause system instability. The following table lists the main challenges faced when using basic power management techniques.

Table 3. Major challenges faced in actual embedded system design

How DSP RTOS solves the problem

Most experienced embedded systems designers know that many of the technical issues listed in Table 2 can be addressed in the operating system without having to start "from scratch" for each new design project.

A subset of the most valuable and generally applicable techniques mentioned above can be built into an RTOS, including idling, power-down, device driver notification, memory management, and V/F scaling. Building these techniques into an RTOS takes care because design goals vary: the designer must be able to mix and match subsets. The key design goals are efficiency, flexibility, and loose coupling with the operating system.

The Power Manager (PWRM) of TI's DSP/BIOS™ operating system is well suited for use as a power management module for an existing RTOS [Reference 4]. Although the implementation described below is specific to DSP/BIOS, the concepts can be easily applied to other operating systems or even to non-OS environments.

Power Manager Requirements

The key requirements for a power manager implementation are as follows:

  1. Management decisions must be triggered by applications, not the operating system;
  2. Power management activities should be transparent to most application code;
  3. The power manager must support voltage and frequency (V/F) scaling and take full advantage of chip idle and sleep modes;
  4. The power manager must coordinate the handling of power events within application code, drivers, and the operating system itself, and notify clients when certain events occur;
  5. Power management features must be available in any threaded context and must also be available to multiple instances of a particular client (such as multiple instances of a device driver);
  6. When issuing power event notifications to clients, the power manager must support deferred completion of event processing and notify other clients while waiting for completion signals from deferred clients;
  7. The power manager must be scalable and portable to different platforms with different capabilities.

To meet the above key requirements, the power manager can be added as an auxiliary module of DSP/BIOS, as shown in Figure 1.

Figure 1. Power Manager Partitions

The power manager sits outside the kernel. It is not a task in the system, but a set of APIs that execute in the context of application control threads and device drivers.

This means that no kernel modifications are required. However, on platforms where the CPU clock is coupled to the OS timer clock, the DSP/BIOS clock module (CLK) must be supplemented with routines that adapt the OS clock when the frequency is scaled; these routines act as a client of the PWRM.
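As a rough illustration of why the OS clock must adapt, consider a timer that counts CPU clock cycles. The sketch below recomputes the timer period for a new CPU frequency so that the OS tick rate stays constant. The function name and the 1 kHz tick rate used in the note are assumptions for illustration; this is not the DSP/BIOS CLK implementation.

```c
#include <stdint.h>

/* Timer period, in CPU cycles, that yields tick_hz ticks per second
 * when the timer counts at cpu_hz. Rounds to the nearest cycle.
 * Returns 0 if tick_hz is 0. Hypothetical helper, not a CLK API. */
uint32_t timer_period_cycles(uint32_t cpu_hz, uint32_t tick_hz)
{
    if (tick_hz == 0u) {
        return 0u;
    }
    return (cpu_hz + tick_hz / 2u) / tick_hz;
}
```

With a 1 kHz tick, the period is 192000 cycles at 192 MHz and must be reloaded as 144000 cycles after scaling to 144 MHz. Without that adjustment, every periodic function driven by the tick would run 25% slower.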

The power manager interfaces directly with the DSP hardware in two ways: by writing and reading the clock idle configuration registers, and through a platform-specific Power Scaling Library (PSL) [Reference 5] that controls the CPU clock rate and the voltage regulation circuitry. The PSL isolates the power manager and the rest of the application from the low-level implementation details of the frequency and voltage control hardware.

The power manager has several application-dependent tasks. It is statically configured by the design engineer and dynamically called at runtime:

  • Idle Clock Domains – The Power Manager provides interfaces to idle specific clock domains to reduce effective power consumption. It also provides mechanisms to automatically idle the DSP CPU and caches at appropriate points in the OS idle loop.
  • Reduce power consumption at boot time - The power manager includes a hook mechanism that enables developers to program power saving features to be automatically invoked at boot time.
  • Voltage and Frequency (V/F) Scaling - The Power Manager provides an interface that allows applications to dynamically change the operating voltage and frequency of the DSP core, so that power consumption can track processing requirements. Through the API, the application specifies whether the voltage should be scaled along with the frequency, and whether execution may continue during a downward voltage transition. The latency of that transition depends on the load and may be long; if the processor operates correctly while the voltage is falling, the application can be allowed to keep running. The Power Manager also includes APIs for querying V/F set point properties and latencies.
  • Sleep Modes – The power manager includes configuration and runtime interfaces that allow developers to invoke custom sleep modes to save power during periods of inactivity.
  • Registration and notification of power events - To coordinate V/F scaling, sleep modes, and other events across the application, the power manager has a registration and notification mechanism. Entities such as application code, peripheral drivers, packaged content, and the OS clock module can register to be notified of the specific events that affect them, such as "V/F set point change is about to occur", "V/F set point change is completed", "entering sleep mode", "waking up from sleep mode", and "power failure". The notification process is an important feature of the power manager. A client can unregister when it no longer needs notification.
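The registration and notification mechanism, including deferred completion, can be sketched as a small host-side model. Everything here is hypothetical: the `pm_` and `Pm` names, the event set, and the fixed-size client table are chosen for illustration and are not the PWRM API or its signatures.

```c
#define PM_MAX_CLIENTS 8

typedef enum { PM_EVT_PRE_SETPOINT, PM_EVT_POST_SETPOINT,
               PM_EVT_SLEEP, PM_EVT_WAKE } PmEvent;
typedef enum { PM_DONE, PM_DEFERRED } PmResult;
typedef PmResult (*PmCallback)(PmEvent event, void *arg);

typedef struct {
    PmEvent event;
    PmCallback fxn;
    void *arg;
    int in_use;
} PmClient;

static PmClient pm_clients[PM_MAX_CLIENTS];
static int pm_pending;  /* clients that deferred and have not completed */

/* Register a callback for one event; returns a slot index, or -1 if full. */
int pm_register(PmEvent event, PmCallback fxn, void *arg)
{
    for (int i = 0; i < PM_MAX_CLIENTS; i++) {
        if (!pm_clients[i].in_use) {
            pm_clients[i].event = event;
            pm_clients[i].fxn = fxn;
            pm_clients[i].arg = arg;
            pm_clients[i].in_use = 1;
            return i;
        }
    }
    return -1;
}

/* Remove a registration; out-of-range slots are ignored. */
void pm_unregister(int slot)
{
    if (slot >= 0 && slot < PM_MAX_CLIENTS) {
        pm_clients[slot].in_use = 0;
    }
}

/* Notify every client registered for the event. A client that defers does
 * not block the others. Returns how many clients still owe a completion. */
int pm_notify(PmEvent event)
{
    pm_pending = 0;
    for (int i = 0; i < PM_MAX_CLIENTS; i++) {
        if (pm_clients[i].in_use && pm_clients[i].event == event &&
            pm_clients[i].fxn(event, pm_clients[i].arg) == PM_DEFERRED) {
            pm_pending++;
        }
    }
    return pm_pending;
}

/* Called later by a client that returned PM_DEFERRED.
 * Returns the number of completions still outstanding. */
int pm_complete(void)
{
    if (pm_pending > 0) {
        pm_pending--;
    }
    return pm_pending;
}

/* Example client that finishes its work inside the callback. */
PmResult example_fast_client(PmEvent event, void *arg)
{
    (void)event;
    ++*(int *)arg;   /* count the notification */
    return PM_DONE;
}

/* Example client that needs more time, for instance to drain a transfer. */
PmResult example_slow_client(PmEvent event, void *arg)
{
    (void)event;
    ++*(int *)arg;   /* count the notification */
    return PM_DEFERRED;
}
```

In this model the caller of `pm_notify` proceeds with the power transition only once the returned count has dropped to zero through `pm_complete` calls. A real implementation would also need a timeout and protection against concurrent access, both omitted here.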

Power Manager API

Table 4 summarizes the runtime APIs.

| Function name | Description |
| --- | --- |
| PWRM_changeSetpoint | Initiate a change to a new V/F set point |
| PWRM_configure | Set new configuration parameters for PWRM |
| PWRM_getCapabilities | Get information about PWRM capabilities on this platform |
| PWRM_getCurrentSetpoint | Get the set point currently in effect |
| PWRM_getNumSetpoints | Get the number of set points available on this platform |
| PWRM_getSetpointInfo | Get the frequency and voltage values of a set point |
| PWRM_getTransitionLatency | Get the scaling latency between two set points |
| PWRM_idleClocks | Put specific clock domains into idle mode immediately |
| PWRM_registerNotify | Register a function to be called when a specific power event occurs |
| PWRM_sleepDSP | Transition the DSP to a new sleep state |
| PWRM_releaseDependency | Release a previously declared resource dependency |
| PWRM_setDependency | Declare a dependency on a power-manageable resource |
| PWRM_unregisterNotify | Unregister for event notification from PWRM |

Table 4. Summary of the Power Management Runtime API
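The set point query functions let an application walk the available set points and pick one. The sketch below models that idea on the host with a local table and hypothetical names; it deliberately does not call the PWRM functions, whose exact signatures are not given in this article. The 144 MHz and 192 MHz entries match the audio example later in the article, while the 72 MHz entry is a made-up placeholder.

```c
#include <stddef.h>

/* One V/F set point. Type and field names are hypothetical. */
typedef struct {
    unsigned freq_mhz;
    unsigned millivolts;
} VfSetpoint;

/* Example table ordered from slowest to fastest. Only the 144 MHz and
 * 192 MHz rows come from the article; the 72 MHz row is a placeholder. */
static const VfSetpoint vf_table[] = {
    {  72u, 1200u },
    { 144u, 1400u },
    { 192u, 1600u },
};
static const size_t vf_count = sizeof vf_table / sizeof vf_table[0];

/* Index of the slowest set point that still provides required_mhz,
 * or -1 if even the fastest set point is too slow. */
int lowest_sufficient_setpoint(unsigned required_mhz)
{
    for (size_t i = 0; i < vf_count; i++) {
        if (vf_table[i].freq_mhz >= required_mhz) {
            return (int)i;
        }
    }
    return -1;
}
```

Choosing the slowest sufficient entry keeps the voltage as low as the workload permits, which is where the quadratic saving comes from.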

Strategy Implementation

Now that the foundation for improving power efficiency has been established, the next step is to define a strategy for developing low-power applications and taking advantage of some of the technologies and support in the OS.

The proposed strategy consists of the following 11 steps. The strategy is iterative: earlier steps can be revisited whenever the power management goals are not met, which means that additional runtime techniques are needed to fit the application's power budget.

  1. Consider power efficiency from the beginning;
  2. Choose low-power components;
  3. Model and estimate power consumption, and run the corresponding hardware experiments;
  4. Design HW with hooks for power management and measurement;
  5. Architect SW for power efficiency;
  6. Enable simple on/off switching of power management features;
  7. Get the application working first without power management;
  8. Turn power management features on incrementally and measure the payoff of each;
  9. Enable code generation optimizations, relocate code and data, and tune hot spots;
  10. Calibrate to find the minimum frequency and voltage;
  11. Activate all power management features and deploy.

Table 5 provides a very detailed summary of the above strategies. We will discuss below how to effectively apply the above strategies.

Table 5. Detailed strategy for low-power application development
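Steps 6 to 8 are easier to follow when each power management feature sits behind its own switch. One possible arrangement is a bitmask of feature flags, sketched below. The flag names and helper functions are hypothetical and are not part of DSP/BIOS.

```c
/* Hypothetical feature flags, one bit per power management feature. */
enum {
    PM_FEAT_BOOT_SAVINGS = 1 << 0,
    PM_FEAT_IDLE_CLOCKS  = 1 << 1,
    PM_FEAT_VF_SCALING   = 1 << 2,
    PM_FEAT_DEEP_SLEEP   = 1 << 3,
    PM_FEAT_ALL          = 0xF
};

/* All features start disabled, which corresponds to step 7. */
static unsigned pm_features;

/* Turn on the features in mask (step 8, one feature at a time). */
void pm_enable(unsigned mask)
{
    pm_features |= mask;
}

/* Turn off the features in mask. */
void pm_disable(unsigned mask)
{
    pm_features &= ~mask;
}

/* Nonzero only if every feature in mask is enabled. */
int pm_is_enabled(unsigned mask)
{
    return (pm_features & mask) == mask;
}
```

Application code would guard each technique with `pm_is_enabled`, so that power can be measured after each `pm_enable` call and a misbehaving feature can be switched off again without touching the rest.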

Audio Application Examples

An off-the-shelf DSP evaluation board, the 5509A EVM PLUS, is selected as the test platform. This board not only supports V/F scaling, but also contains hooks for measuring the power drawn by the DSP core and by the overall system.

It is important to note that the EVM, as an easy-to-use evaluation platform, does not ship with the best power configuration. Also, when evaluating results, keep in mind that because the EVM is built for ease of configuration, the total system power measured on it will be higher than on a typical deployed platform. The EVM does, however, allow the effectiveness of the various techniques to be measured at both the DSP core level and the system level.

Step 1 is self-explanatory. Steps 2 and 4 are essentially already taken care of by this particular EVM, which shows how well suited the platform is. Step 3 (experiments) is carried out on the EVM to measure the effects of various techniques: core and system power for on-chip versus off-chip accesses, DMA versus CPU transfers, the effects of idling peripherals and clock domains, and so on.

Architecture

An example application is shown in Figure 2. For more information on this application, including a stand-alone application note and source code, see Reference 15.

Figure 2. Audio application

Sampled audio travels to and from the DSP via Multi-Channel Buffered Serial Ports (McBSPs), and the DSP DMA engine moves the samples to and from the McBSP. The RxSplit task separates the stereo audio data into two streams, which are handled by the processing tasks. DIP switches select either G726 encode/decode processing or simple volume control. The two channels are then combined in the TxJoin task and output to the speaker.

The Control task is triggered periodically and checks the DIP switches to determine if a mode switch is required (such as changing processing mode or going to sleep). Depending on the application mode, the Control task may check the CPU load and change the V/F setpoint if appropriate.
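The load check performed by the Control task can be pictured as a small decision function. The sketch below is an assumption about how such a policy might look, not the application's actual code: the function name, the thresholds, and the rule of projecting the load onto the next slower set point are all illustrative.

```c
/* Illustrative thresholds, not values from the article. */
#define LOAD_HIGH_PCT   85u  /* above this, speed up */
#define LOAD_TARGET_PCT 70u  /* projected load allowed after slowing down */

/* Decide the next set point index from the measured CPU load (0..100 %).
 * freq_mhz[] lists set point frequencies from slowest to fastest,
 * count is the number of entries, current is the active index. */
unsigned next_setpoint(const unsigned *freq_mhz, unsigned count,
                       unsigned current, unsigned load_pct)
{
    if (load_pct > LOAD_HIGH_PCT) {
        /* Too busy: move one step faster if a faster set point exists. */
        return (current + 1u < count) ? current + 1u : current;
    }
    if (current > 0u) {
        /* Load the same work would produce at the next slower set point. */
        unsigned projected = load_pct * freq_mhz[current] / freq_mhz[current - 1u];
        if (projected <= LOAD_TARGET_PCT) {
            return current - 1u;
        }
    }
    return current;
}
```

Projecting the load before stepping down avoids oscillating between two set points. Average load alone does not guarantee that deadlines are met, so a policy like this still has to be validated against the application's real-time requirements.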

Key design decisions related to power include:

  1. Use OS threads and blocking primitives so that clocks can be idled;
  2. Use DMA for efficient background data transfer, interrupting the CPU only when the DMA has finished transferring a block (rather than on every sample read from or written to the serial port);
  3. Clock the serial ports from a shared external clock (to allow frequency scaling of the DSP CPU without reprogramming the serial ports);
  4. Register a callback so that the codec driver can power down the codec when the application enters deep sleep mode;
  5. Use calibration to determine how far the set point frequency (and voltage) can be lowered before audio quality degrades;
  6. Use the clock adaptation feature of the power manager to allow periodic functions to continue to operate at a specific rate after frequency scaling;
  7. Use the Power Manager "deep sleep" interface between DSP reboots.
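The benefit of decision 2 is easy to quantify. The helper below counts CPU interrupts per second for a given DMA block size. The function name is illustrative, and the sample rate and block size in the note are assumed figures, not taken from the application.

```c
/* CPU interrupts per second when the DMA raises one interrupt per block
 * of block_samples samples. block_samples == 1 models one interrupt per
 * sample. Rounds up; returns 0 if block_samples is 0. */
unsigned interrupts_per_second(unsigned sample_rate_hz, unsigned block_samples)
{
    if (block_samples == 0u) {
        return 0u;
    }
    return (sample_rate_hz + block_samples - 1u) / block_samples;
}
```

At an assumed 48 kHz sample rate, servicing every sample means 48000 interrupts per second, while 64-sample DMA blocks need only 750. Between block interrupts the CPU can block in the OS and let its clock idle.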

Conclusion

The overall results are summarized in Table 6; the notes after the table describe what changes from one mode to the next.

| Mode | Settings | DSP core (mW) | DSP savings (%) | Board (mW) | Board savings (%) |
| --- | --- | --- | --- | --- | --- |
| 1 | CPU at 192 MHz, 1.6 V; all code off-chip; boot power reduction: off; idle loop: clock domains active | 207.8 | -- | 2219 | -- |
| 2 | CPU at 192 MHz, 1.6 V; all code on-chip; boot power reduction: off; idle loop: clock domains active | 203.3 | 2.17 | 1789 | 19.4 |
| 3 | CPU at 192 MHz, 1.6 V; all code on-chip; boot power reduction: on; idle loop: clock domains idled | 155.2 | 25.3 | 1663 | 25.1 |
| 4 | CPU at 144 MHz, 1.4 V; all code on-chip; boot power reduction: on; idle loop: clock domains idled | 99.5 | 52.1 | 1605 | 27.7 |
| 5 | DSP in deep sleep (fully idled); set point moved to the minimum voltage and the maximum frequency at that voltage before sleeping; codec powered down | 0.361 | 99.8 | 1352 | 39.1 |

Table 6. Power saving effect

  • Mode #1 is the baseline measurement, with all code off-chip.
  • Mode #2 moves all code on-chip. The saving at the DSP core level is small, but board-level power drops by 19%.
  • Mode #3 adds boot-time power saving configuration (turning off the DSP's CLKOUT signal, automatically idling unused timers, and turning off onboard LEDs) and idles clock domains in the BIOS idle loop, resulting in a 25% power saving at the DSP core level.
  • Mode #4 lowers the set point to 144 MHz at 1.4 V, at which the audio processing still meets its real-time requirements, resulting in a 52% power saving at the DSP core level.
  • Mode #5 is the power consumption when the application is in standby. The external codec is powered down, the set point is at the minimum voltage and the maximum frequency for that voltage so that restart is fast, and the DSP is in deep sleep with its clocks gated. The DSP core standby power in this mode is only 361 µW.
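The percentage columns in Table 6 are savings relative to the Mode #1 baseline. A minimal helper (the function name is illustrative) reproduces them from the measured values:

```c
/* Percentage of power saved relative to a baseline measurement.
 * Both arguments must use the same unit (for example mW). */
double savings_percent(double baseline_mw, double measured_mw)
{
    return 100.0 * (baseline_mw - measured_mw) / baseline_mw;
}
```

For example, `savings_percent(207.8, 99.5)` is about 52.1, matching the DSP core saving quoted for Mode #4, and `savings_percent(2219, 1605)` is about 27.7, matching the board-level figure for the same mode.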

Which techniques are appropriate depends on the requirements of the specific application, which is a good reason to integrate the high-payoff techniques into the RTOS. With this support from the OS, designers can improve the power efficiency of their applications easily, safely, and with low overhead.

The power optimization strategy discussed in this article is a general model that can be used to reduce and regulate application power consumption from the beginning of an embedded project. The strategy is iterative: earlier steps can be repeated when the measured power consumption is still too high or additional runtime techniques are needed. In the audio application, for example, this strategy saved a significant amount of power with just a few high-payoff power saving techniques.
