The consumer handheld device market is growing rapidly. Portable products have more processing power and support more applications; product replacement cycles are accelerating, and new products must hit their time-to-market window to capture the greatest market opportunity; shorter product life cycles demand shorter development cycles and place greater emphasis on reusability and reprogrammability. There is also an interesting trend in the emerging handheld device market: the shipment volume of each individual device in a series is decreasing, but the number of customized functions that distinguish devices within a series is increasing, which effectively raises the total shipment volume of the product. The key challenge therefore becomes how to develop a system that can be widely reused and customized at the same time.

  To meet these challenges, more and more designers are turning to FPGAs for handheld product development. FPGAs are becoming increasingly powerful and feature-rich, with gate count, area, and operating frequency all rising. FPGA development and turnaround times are much shorter than those of custom ASICs, and the added advantage of reprogrammability makes FPGAs an attractive solution for handheld embedded systems. In both ASIC and FPGA designs, designers must weigh certain performance criteria carefully, and the main challenges lie in area, speed, and power consumption.

  As with ASICs, vendors need to address area and speed challenges in FPGA design. As gate counts continue to increase, FPGAs require more area to accommodate more applications, and design tools need better algorithms to use that area efficiently. Evolving FPGA technology also brings new challenges to designers. One of them is power consumption, which is a pressing problem when designing FPGA-based embedded systems for handheld or portable devices.

  FPGAs in Embedded Systems

  A typical embedded system consists of a processor, memory, standard interfaces such as USB, SPI, and I2C, and peripherals such as an LCD display and audio output. The core of the device is still the processor and its interfaces, which connect to the various peripherals through on-board connections. System performance depends mainly on the performance of the processor, which usually has a very standard architecture and is not easy to customize.

  Sometimes the processor is kept busy handling information from low-speed peripherals. Although processor utilization may reach 100% in this case, the processor is not doing processor-centric work and is delivering very little effective performance. Regardless of its core frequency, the microprocessor must wait for data that arrives at the rate of the low-speed clock. Because utilization stays at 100%, power consumption is also higher. The result is shorter battery life and the need for a larger heat sink or a fan for cooling, which ultimately affects the reliability of the entire system.

  This is where FPGAs begin to play an important role, because they can offload many peripheral interaction tasks from the processor. Figure 1 shows an embedded distribution system for uncompressed audio and video data streams over a standard Gigabit TCP/IP network. It has a dedicated DSP processor connected to a Xilinx FPGA through a standard bus interface, and the FPGA is in turn connected to various low-speed peripherals.

  Figure 1: FPGA architecture for audio/video distribution system.

 

  In this starter development kit, the FPGA connects to a 12-bit PCM audio input and a 12-bit PCM audio output through the I2S interface; it also connects to the video encoder and decoder and communicates with I2C slave devices and RS232 devices; a few general-purpose I/Os are connected to the FPGA as well. The standard bus connected to the processor operates at a high speed of 66MHz, while the audio peripherals operate at a low speed of 1.182MHz; the UART and I2C serial interfaces operate at 56.6kHz and 100kHz respectively. Since data transmission occurs in multiple clock domains, only the processor can configure the data flow.

  In this arrangement, the processor no longer interacts with the low-speed peripherals directly. Instead, the FPGA reads data from the low-speed PCM audio ADC and stores it in an internal buffer. The processor can read from this buffer periodically, or the FPGA can send an interrupt to the processor once enough data has accumulated. The processor therefore has more time for processor-centric work and can enter sleep mode when idle.
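
  The buffering scheme described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not the design from Figure 1: the class name `SampleFifo`, the buffer depth of 64, and the interrupt threshold of 48 are all hypothetical.

```python
from collections import deque


class SampleFifo:
    """Behavioral model of an FPGA-side sample buffer (hypothetical names)."""

    def __init__(self, depth, irq_threshold):
        if not 0 < irq_threshold <= depth:
            raise ValueError("irq_threshold must be in 1..depth")
        self.depth = depth
        self.irq_threshold = irq_threshold
        self.samples = deque()
        self.irq_pending = False
        self.overruns = 0

    def push(self, sample):
        """Store one sample; called at the slow audio sample rate."""
        if len(self.samples) >= self.depth:
            self.overruns += 1  # buffer full: the new sample is dropped
            return
        self.samples.append(sample)
        if len(self.samples) >= self.irq_threshold:
            self.irq_pending = True  # enough data: signal the processor

    def drain(self):
        """Return all buffered samples; called by the processor's handler."""
        burst = list(self.samples)
        self.samples.clear()
        self.irq_pending = False
        return burst


def simulate(num_samples, depth=64, irq_threshold=48):
    """Feed samples into the FIFO and count how often the processor wakes."""
    fifo = SampleFifo(depth, irq_threshold)
    wakeups = 0
    received = []
    for n in range(num_samples):
        fifo.push(n)
        if fifo.irq_pending:  # the processor sleeps until this is set
            received.extend(fifo.drain())
            wakeups += 1
    return wakeups, received, fifo.overruns


if __name__ == "__main__":
    wakeups, received, overruns = simulate(480)
    print(wakeups, len(received), overruns)
```

  With these assumed values, 480 samples wake the processor 10 times, where servicing each sample individually would take 480 separate accesses. Between wake-ups the processor is free to do other work or sleep.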

  Power Consumption Issues

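  In battery-powered embedded systems, saving energy is the most important consideration. Power consumption can be divided into three categories: startup power, static power, and dynamic power. Designers cannot control startup power, yet it plays an important role in power supply selection; most maximum-current ratings refer to the value reached during this phase. Static power and dynamic power are a different matter. With sound planning and the right guidelines, described below, embedded designers using FPGAs can make significant improvements in power optimization.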

  Static power consumption is the power consumed by current that still flows through the components when the system is not working. It is generally caused by device bias current and leakage current. Static power consumption also depends on the operating voltage. Lowering the operating voltage reduces static power consumption, but this choice is not always in the designer's hands. What designers can do is define a sound architecture that uses as few resources as possible, shares resources wherever it can, and uses the FPGA's modules in the most efficient way.

  Another technique for reducing static power consumption is to estimate power consumption early in the design cycle and, where needed, change the topology or use different IP blocks. The Xilinx XPower Estimator tool is very useful here, because it can show very early whether the design meets the power budget. An early-stage power estimate may not be completely accurate, but it is a helpful guide.

  Dynamic power consumption is caused by activity in the FPGA's gates, chiefly signal switching. Each time a signal switches, current flows briefly to charge or discharge the capacitance on that node, so the switching rate determines how much power is consumed. Another factor is the inherent capacitance formed in the internal structure of the circuit. Dynamic power consumption is a function of clock frequency, the number of gates being switched, and the rate at which these gates switch. Gate fanout and capacitive loading on the traces increase dynamic power consumption, which is proportional to the product of capacitance, the square of the voltage, and frequency.

  Designers have the most control over this type of power consumption, and they can use many techniques to improve it. Reducing the signal switching frequency reduces dynamic power proportionally. In the design of Figure 1, the UART control logic, parity checking, and frame overrun error handling all run in the lower-speed clock domain, so power consumption falls even though the number of gates is not reduced. Designers can also reduce dynamic power consumption by lowering the overall operating frequency, if feasible. For example, after completing a feasibility and performance analysis, the designer may find that the above design can operate not only at 133MHz but also at 66MHz. The DSP supports both speeds, and reducing the voltage will also help reduce power consumption.
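
  The effect of frequency and voltage on dynamic power can be illustrated with the commonly used first-order model P = a × C × V² × f, where a is the switching activity factor. The sketch below is a rough worked example only. The activity factor, switched capacitance, and supply voltages are hypothetical values chosen for illustration, not figures from the design discussed here.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """First-order switching power in watts: P = a * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical design values, for illustration only.
ACTIVITY = 0.15       # average fraction of nodes toggling per clock cycle
CAPACITANCE = 2e-9    # total switched capacitance in farads
VOLTAGE = 1.2         # core supply voltage in volts

p_133 = dynamic_power(ACTIVITY, CAPACITANCE, VOLTAGE, 133e6)
p_66 = dynamic_power(ACTIVITY, CAPACITANCE, VOLTAGE, 66e6)
p_66_low_v = dynamic_power(ACTIVITY, CAPACITANCE, 1.0, 66e6)

if __name__ == "__main__":
    print(f"133 MHz, 1.2 V: {p_133 * 1e3:.1f} mW")
    print(f" 66 MHz, 1.2 V: {p_66 * 1e3:.1f} mW")
    print(f" 66 MHz, 1.0 V: {p_66_low_v * 1e3:.1f} mW")
```

  Under these assumptions the model gives about 57 mW at 133MHz, about 29 mW at 66MHz, and about 20 mW at 66MHz with the supply lowered to 1.0 V. The saving from frequency is linear, while the saving from voltage is quadratic.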

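  Another technique is to reduce the number of gates that are active in operating mode. Sometimes a portion of the logic is turned on and configured at power-up but is not actually required to do anything. For example, the analog audio capture unit may be active while the device is not performing any digital SPDIF audio capture. In this case, a typical digital SPDIF audio capture circuit will still perform data sampling, biphase decoding, and so on, needlessly wasting power. If the entire digital SPDIF audio capture circuit is disabled so that no signal switching occurs within it, dynamic power consumption drops significantly.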

  Designers can achieve this by disabling the clock to this part of the circuit. A simple way to do this is to AND the clock signal with the enable signal, as shown in Figure 2. If the enable signal is low, the output of the AND gate will remain low. If the enable signal is high, the AND gate will output the clock signal.

  Figure 2: A simple clock gating mechanism.
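
  The AND-gate behavior can be checked with a small software model. This is a behavioral sketch only: the signals are lists of 0/1 samples, and the helper names are hypothetical.

```python
def gated_clock(clk, enable):
    """Model an AND gate: the output follows clk only while enable is 1."""
    return [c & e for c, e in zip(clk, enable)]


def count_transitions(signal):
    """Count 0->1 and 1->0 changes, a rough proxy for switching activity."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a != b)


clk = [0, 1] * 8              # eight clock cycles, sampled twice per cycle
enable = [1] * 8 + [0] * 8    # block enabled for the first four cycles only
gclk = gated_clock(clk, enable)

if __name__ == "__main__":
    print("free-running clock transitions:", count_transitions(clk))
    print("gated clock transitions:       ", count_transitions(gclk))
```

  In this model the free-running clock makes 15 transitions and the gated clock makes 8, all of them before the enable goes low. One caveat that is general practice rather than something stated in this article: a bare AND gate can produce a glitch if the enable changes while the clock is high, so practical designs often use a latch-based gate or the FPGA's dedicated clock-enable resources instead.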

  Other approaches are also available. Where the topology supports it, the number of signal lines can be reduced by multiplexing the address and data lines. In our example, the output to the video encoder is 16 bits of data, which can be multiplexed onto 8 lines and sent out on both the rising and falling edges of the clock. This also saves dynamic power. In addition, choosing a serial interface instead of a parallel one can reduce power consumption, and using LVTTL or LVCMOS I/O with lower capacitive loading also helps.
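
  The 16-bit to 8-bit multiplexing described above can be sketched as follows. The byte ordering is an assumption: the article does not say which half goes out on which edge, so this sketch puts the high byte on the rising edge and the low byte on the falling edge. The function names are hypothetical.

```python
def ddr_pack(words):
    """Split 16-bit words into (rising_edge_byte, falling_edge_byte) pairs."""
    pairs = []
    for word in words:
        if not 0 <= word <= 0xFFFF:
            raise ValueError(f"not a 16-bit value: {word!r}")
        high = (word >> 8) & 0xFF   # assumed to go out on the rising edge
        low = word & 0xFF           # assumed to go out on the falling edge
        pairs.append((high, low))
    return pairs


def ddr_unpack(pairs):
    """Reassemble 16-bit words from (rising, falling) byte pairs."""
    return [(high << 8) | low for high, low in pairs]


if __name__ == "__main__":
    pixels = [0x1234, 0xABCD, 0x0000, 0xFFFF]
    on_the_wire = ddr_pack(pixels)
    print(on_the_wire)
    print(ddr_unpack(on_the_wire) == pixels)
```

  Each clock cycle still carries one full 16-bit word, but only 8 data lines are driven, which halves the number of output pins that have to switch.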

  Embedded Processors

  Embedding the processor in the FPGA is another strategy available to handheld device designers, and it brings several benefits. First, it eases the processor customization challenges described above. Second, the interaction between peripherals and the processor takes place inside the FPGA, which reduces the number of I/Os. Since I/Os consume a lot of power, this also saves a certain amount of energy. Xilinx's Virtex-5 family supports the PowerPC 440 hard processor and the MicroBlaze soft processor, which designers can use to build both high-end and low-end application systems.

  With the arrival of 90nm and 65nm semiconductor process technology, gate sizes are shrinking, which makes static power consumption an increasingly prominent problem. This is a serious challenge at a time when users are more and more sensitive to power consumption figures. As power consumption has gained the attention of FPGA suppliers, many exciting new technologies have emerged in this field. Low-power design will determine how much integration a system can achieve, and the industry urgently needs to standardize power-focused design techniques.