Application of Microcontrollers with Hardware Vector Floating Point Units in Medical Electronics

Publisher: sjjawx831. Latest update time: 2011-06-20. Source: NXP (恩智浦)
A microcontroller is a single-chip microcomputer that integrates the main parts of a microcomputer on one chip. Microcontrollers first appeared in the mid-1970s. In the decades since, their cost has steadily fallen while their performance has grown, which has made them ubiquitous. Typical applications include motor control, barcode readers/scanners, consumer electronics, gaming equipment, telephones, HVAC, building security and access control, industrial control and automation, and white goods (washing machines, microwave ovens).

Many of today's embedded industrial and automotive systems are designed around 8-bit or 16-bit microcontroller architectures. With the advent of new low-power 32-bit architectures, these applications can achieve higher performance, accuracy, and power efficiency. The extra processing power also enables new differentiating features, including advanced control algorithms, GUI displays, voice control, and next-generation interfaces such as capacitive touch sensing. These tasks typically consume a large share of the computing resources of an 8-bit or 16-bit microcontroller, whereas 32-bit microcontrollers have enough processing power to implement many of them, and powerful microcontrollers with built-in hardware floating-point support are now beginning to appear.

Evaluating the performance of microcontrollers

Compared with professional DSP processors, microcontrollers have the following advantages for signal processing:

(1) Effective loop control;

(2) Rich peripherals;

(3) A single processor structure, instruction set and development tool chain;

(4) A unified interrupt and task switching environment, with homogeneous memory;

(5) The same operating system, based on an MMU, manages both control and signal processing tasks;

(6) Short time to market due to a greatly simplified development process;

(7) Popular microcontrollers are easily available and development tools are low cost.

Whether a microcontroller's performance meets the application requirements is a question engineers need to answer in the early stages of project design. One approach is to evaluate and summarize information from data sheets; another is to run specific performance and power consumption tests on an evaluation board. Both methods have their own disadvantages.

The efficiency difference between 32-bit and 8/16-bit systems is quite large. On a 16-bit processor, an ordinary 32-bit multiply/accumulate operation requires 4 multiplications and 4 additions. Intermediate results must either be stored to memory or held in several registers, which further reduces execution efficiency and may slow down other operations. As a result, a 32-bit multiplication may require 20 to 40 cycles on a 16-bit processor, whereas the 32-bit UC3C processor needs only a single cycle. In addition, a 32-bit architecture has a wider data path, so data and instructions can be fetched from memory faster.
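The decomposition into four multiplications can be sketched as follows. This is a minimal Python illustration rather than target code: the names `mul32_via_16bit` and `mac32_via_16bit` are hypothetical, and Python integers stand in for the 16-bit registers of a real processor.

```python
MASK16 = 0xFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mul32_via_16bit(a: int, b: int) -> int:
    """Multiply two unsigned 32-bit values using only 16x16-bit multiplies."""
    a_lo, a_hi = a & MASK16, (a >> 16) & MASK16
    b_lo, b_hi = b & MASK16, (b >> 16) & MASK16

    # Four 16x16 partial products, as a 16-bit multiplier would produce them.
    p_ll = a_lo * b_lo
    p_lh = a_lo * b_hi
    p_hl = a_hi * b_lo
    p_hh = a_hi * b_hi

    # Three shifted additions combine the partial products into a 64-bit result.
    return p_ll + ((p_lh + p_hl) << 16) + (p_hh << 32)


def mac32_via_16bit(acc: int, a: int, b: int) -> int:
    """Multiply/accumulate: the fourth addition folds the product into acc."""
    return (acc + mul32_via_16bit(a, b)) & MASK64
```

On real 16-bit hardware each of these steps also involves carry handling and register shuffling, which is where the additional cycles come from.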

The evaluation process used three steps: (1) characterizing the system by running various benchmarks while varying different system parameters; (2) interpreting the collected characterization data to establish the system behavior; and (3) using that behavior to decide how to set the control parameters so that the system achieves the desired performance.

Characterization

In theory, performance testing is a qualitative or quantitative assessment of the behavior of a system. In practice, the behavior of the system may not be specified in enough detail to define complete quality tests, and creating such tests may be too expensive to justify. A good compromise is to characterize the system with a benchmark: a test or series of tests executed in software that provides quantitative data for comparing the characteristics of different systems.

To characterize the microcontroller, a set of performance benchmarks was selected from the EEMBC Auto-Bench suite. These benchmarks help predict the performance of microcontrollers in automotive electronics, industrial and general applications. Each benchmark was run for multiple iterations to eliminate the impact of startup code that runs only once at the beginning of each test. One advantage of using this industry-standard benchmark suite is that the resulting data can be compared with test data from other microcontrollers of similar architecture to judge overall system performance.
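The effect of running many iterations can be illustrated with a simple model. The sketch below is not part of the EEMBC tooling: the function name `measured_rate` and the timing values in the note are hypothetical, chosen only to show how a one-time startup cost is amortized.

```python
def measured_rate(iterations: int, startup_s: float, per_iteration_s: float) -> float:
    """Iterations per second when a one-time startup cost is included in the timing."""
    total_time_s = startup_s + iterations * per_iteration_s
    return iterations / total_time_s
```

With a hypothetical 10 ms startup and 1 ms per iteration, 10 iterations yield a measured rate of about 500 iterations/s, while 100,000 iterations yield a rate within 0.1% of the true 1,000 iterations/s.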

The microcontroller tested here is based on the ARM926EJ-S core with a hardware vector floating-point coprocessor and a 32 KB instruction cache (I-cache). This test measures the performance of the floating-point coprocessor and the instruction cache. The Auto-Bench test benchmarks were run at different operating frequencies of the microcontroller, and the energy consumed in each benchmark execution was measured using Energy-Bench. Energy-Bench is another EEMBC tool that measures the energy consumed by the processor while the benchmark load is running. The data collected from Energy-Bench allows observation of the energy efficiency of the microcontroller under various different loads. Having chosen these tools to evaluate the microcontroller, the next step is to determine the performance of the microcontroller under different operating conditions.

Performance Analysis

To analyze the performance of a microcontroller, it is necessary to determine the overall system response under different conditions. In the test project, the performance of the floating point coprocessor and the instruction cache on the NXP microcontroller needs to be evaluated.

The Auto-Bench benchmark suite was run while varying four parameters: operating frequency, CPU core voltage, instruction cache state, and floating-point coprocessor state.

Figure 1 is a schematic diagram of the Auto-Bench/Energy-Bench test setup. It consists of three parts: a data acquisition (DAQ) system, the software development environment and the test target. The National Instruments DAQ unit is connected to a PC running Energy-Bench, which measures power and energy consumption. The software development environment uses the Keil™ integrated development tools to compile, download and run the Auto-Bench benchmarks. By isolating the three power supply rails that feed the microcontroller, Energy-Bench can measure the energy drawn during each Auto-Bench test and calculate the total energy consumed.
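How a total energy figure can be derived from sampled rail measurements is sketched below. This is an illustration of the general technique (integrating instantaneous power over time), not Energy-Bench's actual implementation; the rail names and sample values are hypothetical, since the article does not name the three supplies.

```python
def rail_energy_joules(voltages, currents, sample_period_s):
    """Trapezoidal integration of instantaneous power p = v * i over time."""
    power = [v * i for v, i in zip(voltages, currents)]
    return sum((p0 + p1) / 2 * sample_period_s for p0, p1 in zip(power, power[1:]))


def total_energy_joules(rails, sample_period_s):
    """Sum the energy of every isolated supply rail."""
    return sum(rail_energy_joules(v, i, sample_period_s) for v, i in rails.values())


# Hypothetical readings: three samples per rail (volts, amps), taken 1 ms apart.
example_rails = {
    "core": ([1.2, 1.2, 1.2], [0.10, 0.12, 0.10]),
    "io": ([3.3, 3.3, 3.3], [0.01, 0.01, 0.01]),
    "memory": ([1.8, 1.8, 1.8], [0.02, 0.02, 0.02]),
}
example_energy = total_energy_joules(example_rails, sample_period_s=0.001)
```

Dividing the total by the number of benchmark iterations completed in the same interval gives an energy-per-iteration figure of the kind discussed later in the article.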

Auto-Bench was run at four different frequencies (13 MHz, 52 MHz, 104 MHz, and 208 MHz), each combined with the other test conditions: floating-point coprocessor on or off, and instruction cache on or off. The floating-point coprocessor was disabled by default, in which case the compiler used software floating point wherever floating-point operations were required.
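The combinations described above can be enumerated as a simple test matrix. The sketch below is illustrative only: the function name `build_test_matrix` and the dictionary keys are hypothetical, and the core voltage parameter is left out for brevity.

```python
from itertools import product

FREQUENCIES_MHZ = (13, 52, 104, 208)


def build_test_matrix():
    """Every combination of clock frequency, FPU state and instruction cache state."""
    return [
        {"cpu_mhz": mhz, "fpu": fpu_on, "icache": icache_on}
        for mhz, fpu_on, icache_on in product(FREQUENCIES_MHZ, (False, True), (False, True))
    ]


matrix = build_test_matrix()
```

Four frequencies with two on/off switches give 4 x 2 x 2 = 16 configurations per benchmark.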

Much more data was collected than can be presented in this article, so two representative cases are used here to show how the collected characterization data determines the performance of the system. Figure 2 plots the results of the EEMBC finite impulse response (FIR) filter benchmark. Figure 3 plots the results of the EEMBC basic integer and floating-point benchmark. Both benchmarks were also run at 13 MHz with the CPU core voltage set to 0.9 V and to 1.2 V. When a benchmark was run with the CPU clock set to 208 MHz, the AHB clock was set to its limit of 104 MHz. At all other test frequencies, the CPU clock and AHB clock were the same.

Figure 2. Results of the EEMBC finite impulse response (FIR) filter benchmark

Figure 3. Results of the EEMBC basic integer and floating-point benchmark

Floating-point operations are operations on real numbers. Because a computer stores every value in a finite number of bits, most real numbers can only be approximated, so floating-point results carry rounding error. On a processor without floating-point hardware, each operation must also be emulated with many integer instructions, which is slow. Most machines discussed here use 32-bit words: if all 32 bits represent an integer, the range is 0 to 2^32-1 for unsigned integers and -2^31 to 2^31-1 for signed integers.
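The integer ranges and the approximation error can be checked with a few lines of Python. This sketch assumes the 32-bit IEEE 754 single-precision format for the floating-point example, which the article does not name explicitly; the helper `to_float32` is a hypothetical name.

```python
import struct

BITS = 32
UNSIGNED_MAX = 2**BITS - 1        # largest unsigned 32-bit integer
SIGNED_MIN = -(2**(BITS - 1))     # smallest signed 32-bit integer
SIGNED_MAX = 2**(BITS - 1) - 1    # largest signed 32-bit integer


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit IEEE 754 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 0.1 has no exact binary representation, so the stored value differs slightly.
rounding_error = abs(to_float32(0.1) - 0.1)
# 0.5 is a power of two and is stored exactly.
exact = to_float32(0.5) == 0.5
```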

First, consider the performance of the instruction cache, using the cycles/s graph in Figure 2 (cycles here are benchmark iterations). The data shows that at all frequencies, the absolute performance of the microcontroller is better when the instruction cache is enabled. Second, although absolute performance with the instruction cache keeps improving as the CPU clock frequency increases, the improvement is not linear. The reader can verify this behavior in the graph plotted in cycles/s/MHz. Figure 2 shows that at almost all CPU clock frequencies performance scales linearly at about 100 cycles/s/MHz; the exception is 208 MHz, where it drops to 60 or 80 cycles/s/MHz, depending on whether the instruction cache is enabled.

The system runs faster with the instruction cache enabled because the CPU executes instructions from the cache, which reduces the number of accesses to RAM over the AHB bus.

The non-linear performance characteristics are the result of the AHB clock having an upper limit of 104 MHz. When the AHB clock is slower than the CPU clock, the CPU must wait longer to fetch instructions from the RAM on the AHB bus, resulting in a smaller relative performance increase per MHz.
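The normalized figures quoted above can be turned back into absolute rates to show what the AHB limit costs. The sketch below uses only the approximate values from the text; the function names are hypothetical, and assigning the higher 208 MHz figure (80) to the cache-enabled case is an inference from the statement that the cache improves performance at every frequency.

```python
AHB_LIMIT_MHZ = 104


def ahb_clock_mhz(cpu_mhz: int) -> int:
    """The AHB clock follows the CPU clock until it reaches its upper limit."""
    return min(cpu_mhz, AHB_LIMIT_MHZ)


def absolute_rate(cpu_mhz: int, rate_per_mhz: float) -> float:
    """Benchmark iterations per second from a normalized cycles/s/MHz figure."""
    return cpu_mhz * rate_per_mhz


rate_104 = absolute_rate(104, 100)            # about 100 cycles/s/MHz below the limit
rate_208_cache_on = absolute_rate(208, 80)    # assumed cache-enabled figure
rate_208_cache_off = absolute_rate(208, 60)   # assumed cache-disabled figure

scaling_cache_on = rate_208_cache_on / rate_104
scaling_cache_off = rate_208_cache_off / rate_104
```

Under these approximate figures, doubling the CPU clock from 104 MHz to 208 MHz gives roughly 1.6 times the throughput with the cache enabled and roughly 1.2 times without it, rather than the 2 times that linear scaling would predict.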

Next, we analyze the impact of the instruction cache on energy consumption. If we only consider the absolute power consumption in Figure 2, we may conclude that turning off the instruction cache can save energy for the entire system. However, the Energy-Bench data shows that when the instruction cache is enabled, the energy consumed per benchmark cycle is actually lower than when the instruction cache is turned off.

A closer look at the Energy graph shows that when the instruction cache is enabled, the energy consumed per cycle at 208 MHz, 1.2 V is even lower than at other operating frequencies. In fact, there is a 10% to 12% improvement. In other words, executing the same benchmark with the instruction cache enabled, running at high speed (208 MHz) for a shorter period of time is more energy efficient than running at a lower speed (52 MHz or 104 MHz) for a longer period of time.
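The reason a faster, higher-power operating point can still use less energy is that energy per iteration is average power divided by throughput. The sketch below illustrates this relationship; every number in it is hypothetical and chosen only to land near the improvement quoted above, not taken from the measurements.

```python
def energy_per_iteration(avg_power_w: float, iterations_per_s: float) -> float:
    """Energy in joules spent on one benchmark iteration."""
    return avg_power_w / iterations_per_s


# Hypothetical operating points (not measured data).
slow = energy_per_iteration(avg_power_w=0.100, iterations_per_s=10_000)
fast = energy_per_iteration(avg_power_w=0.142, iterations_per_s=16_000)

relative_saving = 1 - fast / slow
```

In this made-up case power rises by 42% while throughput rises by 60%, so each iteration costs about 11% less energy at the faster setting.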

The cycles/s graph in Figure 3 shows quite vividly the performance effect of the integrated floating-point coprocessor. At 208 MHz with the instruction cache enabled, the microcontroller runs at about 8,500 cycles/s using software floating-point operations; with the floating-point coprocessor, this value increases to more than 32,500 cycles/s, a performance improvement of more than 280%.

To examine the effect of the floating-point coprocessor on energy consumption, see the energy graph in Figure 3. At 208 MHz with the instruction cache enabled, the microcontroller consumes approximately 16 J per cycle when using software floating-point operations; with the floating-point coprocessor this falls to less than 4 J per cycle, a savings of more than 75% for the same workload.
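The percentages quoted for the floating-point coprocessor follow directly from the figures in the text, as the short calculation below shows. The function names are hypothetical; the input values are the approximate ones given above.

```python
def percent_improvement(baseline: float, improved: float) -> float:
    """Relative gain of a higher-is-better metric, in percent."""
    return (improved - baseline) / baseline * 100.0


def percent_saving(baseline: float, improved: float) -> float:
    """Relative reduction of a lower-is-better metric, in percent."""
    return (baseline - improved) / baseline * 100.0


speedup = 32_500 / 8_500                              # hardware vs. software floating point
performance_gain = percent_improvement(8_500, 32_500)
energy_saving = percent_saving(16, 4)                 # energy per benchmark cycle
```

A speedup of about 3.8 times corresponds to the improvement of more than 280%, and a drop from 16 to 4 units of energy per cycle corresponds to the 75% saving.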

The cycles/s graph in Figure 2 shows that benchmark performance at 13 MHz is the same at core supply voltages of 0.9 V and 1.2 V.

However, the power graph shows that the power consumption at 1.2 V is about 75% higher than at 0.9 V.
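This difference is roughly what a first-order model of CMOS dynamic power predicts, since at a fixed frequency and load dynamic power scales with the square of the supply voltage. The sketch below applies that textbook relationship; it is an approximation that ignores leakage and the other supply rails, and the function name is hypothetical.

```python
def dynamic_power_ratio(v_high: float, v_low: float) -> float:
    """Ratio of CMOS dynamic power at two supply voltages (P proportional to V^2)."""
    return (v_high / v_low) ** 2


ratio = dynamic_power_ratio(1.2, 0.9)
percent_higher = (ratio - 1) * 100.0
```

The model gives a ratio of 16/9, or about 78% more power at 1.2 V, which is in line with the measured figure of about 75%.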

System control parameters

In the test example, the EEMBC characterization tools were used to determine the performance effect of the instruction cache and floating-point coprocessor in the target system. Based on these results, general configuration parameters can be selected that give the best system performance at low energy consumption.

Here are some parameter choices that can control system power utilization and performance in environments like those of the EEMBC Auto-Bench benchmark suite:

(1) Enabling the instruction cache improves performance;

(2) Using the hardware floating-point coprocessor significantly improves computing performance and significantly reduces energy consumption compared to software floating-point emulation;

(3) At 208 MHz with the instruction cache enabled, the energy consumed per benchmark cycle is lower than at lower frequencies;

(4) For low-power operation at 13 MHz, a core voltage of 0.9 V delivers the same performance as 1.2 V at much lower power consumption.

Beyond these general conclusions, the important point is that the system performance was determined from industry-standard performance and energy benchmarks that are publicly available and independently verified.

Using EEMBC Auto-Bench and Energy-Bench, you can get consistent performance analysis that is easy to demonstrate to others and can be repeated and verified.

Designing an embedded system is often a challenging task, as almost every embedded system has a relatively unique hardware configuration. Specific code often needs to be rewritten for a specific embedded operating system. There are often very strict energy consumption constraints. This article provides a quantitative scientific test method to help embedded engineers consider how to choose a controller suitable for a specific application to build a system. Even if the embedded systems tested vary greatly, solid data can still help system evaluators compare the same performance characteristics.

In the test setup for this article, the EEMBC characterization tool was used to determine the performance of the NXP microcontroller. This performance information was then used to select the best control parameters for the specific operating environment. The test routine quantified the system performance using the instruction cache and floating-point coprocessor of the microcontroller in the evaluation system. The collected characterization data facilitated the definition of system behavior and provided a methodology to select operating parameters to control system performance and energy consumption.

Test results show that the use of hardware vector floating-point units can improve system performance by about 5 times, reduce the amount of code, and lower power consumption.

The hardware floating-point coprocessor VFP9 is a feature of NXP's LPC3000 series based on the ARM926EJ-S core. NXP's low-power 90 nm process technology can achieve this function with very small chip area and extremely low power consumption, making the LPC3000 ARM9 microcontroller very suitable for industrial applications such as medical electronics that require signal processing.
