Question: How to use DMA to speed up peripheral monitoring in low-power wearable devices?
This article introduces the use cases, advantages, and disadvantages of direct memory access (DMA) in embedded system programs. It describes how DMA interacts with peripherals and memory modules to reduce the load on the CPU, and it introduces the different DMA bus access architectures and their respective advantages.
A common task performed by embedded systems is managing external inputs. Managing inputs puts a lot of unnecessary computational pressure on the processor, causing the processor to be in active power mode longer and respond more slowly. To optimize power, maintain fast responses to events, and manage large amounts of continuous data transfer, microcontrollers with direct memory access (DMA) provide a better solution.
In system applications involving peripherals, a microprocessor can become bottlenecked at many points. For example, managing an ADC that is constantly sending data can cause the processor to be interrupted so frequently that it struggles to complete other tasks. DMA is a method of moving data that minimizes processor involvement in large or fast data transactions. You can think of a DMA controller as a coprocessor whose only role is to move data between memory and peripherals. In this way, the main processor can keep up with busy peripherals, focus on other tasks, or even go to sleep to save power while the DMA controller moves data in the background. For example, on some Arm®-based microcontrollers, the DMA module can operate in the LP2 (sleep) or LP3 (run) power modes. This has obvious advantages for applications that require longer battery life, such as wearable sensor hubs and smart watches.
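The difference in processor involvement can be illustrated with a small host-side model. The following is only a sketch under stated assumptions, not code for any particular microcontroller: the names monitor_t, on_sample_irq, on_sample_dma, and run_monitor are hypothetical, as is the 64-sample buffer length. It counts how often the CPU must wake up when every ADC sample raises an interrupt, compared with a DMA-style transfer that interrupts only when a buffer is full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SAMPLES 64u  /* assumed DMA buffer length, in samples */

typedef struct {
    uint16_t buffer[BLOCK_SAMPLES]; /* RAM destination for ADC samples */
    size_t   fill;                  /* samples currently in the buffer */
    uint32_t cpu_wakeups;           /* interrupts the CPU had to service */
} monitor_t;

/* Interrupt-per-sample model: the CPU wakes for every ADC conversion. */
void on_sample_irq(monitor_t *m, uint16_t sample)
{
    m->buffer[m->fill] = sample;
    m->fill = (m->fill + 1u) % BLOCK_SAMPLES;
    m->cpu_wakeups++;
}

/* DMA model: the controller stores each sample without CPU help and
 * raises a single "transfer complete" interrupt per full buffer. */
void on_sample_dma(monitor_t *m, uint16_t sample)
{
    m->buffer[m->fill++] = sample;
    if (m->fill == BLOCK_SAMPLES) {
        m->fill = 0;
        m->cpu_wakeups++;
    }
}

/* Feed `samples` fake 12-bit readings through a handler and report how
 * many times the CPU was woken. */
uint32_t run_monitor(void (*handler)(monitor_t *, uint16_t), uint32_t samples)
{
    assert(handler != NULL);
    monitor_t m = {0};
    for (uint32_t i = 0; i < samples; i++)
        handler(&m, (uint16_t)(i & 0x0FFFu));
    return m.cpu_wakeups;
}
```

In this model, run_monitor(on_sample_irq, 1024) reports 1024 wake-ups, while run_monitor(on_sample_dma, 1024) reports 16, one per filled buffer. Between those wake-ups the processor is free to do other work or to sleep.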
DMA is useful in many digital systems and is sometimes essential for managing large amounts of bus traffic. It is used in network cards and graphics cards, and it was present even in some of the original IBM personal computers. That said, there are some trade-offs to consider when integrating DMA into a design.
Table 1. Advantages of using DMA
Table 2. Disadvantages of using DMA
While DMA controllers are very effective in saving power or accelerating embedded systems, their implementation schemes are not highly standardized. A variety of schemes can be used to ensure that the DMA controller and the CPU are not granted access to the internal bus at the same time. The main goal of a bus access scheme is to prevent simultaneous access to the same memory location, which could otherwise cause cache coherency problems and logic errors. A single DMA controller is usually configured to employ one of these schemes, because each scheme may require different hardware or firmware control. The bus access schemes used by most DMA controllers are burst, cycle stealing, and transparent mode DMA.
Transparent DMA can only perform one operation at a time, and it must also wait until the processor is executing instructions that do not need the data or address buses before it can access them. Verifying this access restriction requires additional logic, and this type of DMA is generally the slowest. Transparent DMA may have an advantage in applications where the processor spends much of its time on computation that does not use the memory bus. In this case, the CPU is never stalled, because the processor does not need to be stopped for the DMA transfer.
Table 3. Summary of DMA types and their advantages and disadvantages
Figure 1. Architecture diagram of the burst DMA during DMA operation.
Burst DMA moves data in large, infrequent bursts, during which the DMA controller sends as much data as the destination buffer can hold. The DMA controller blocks the CPU for a short period of time to move a large block of data, then hands the bus back to the CPU, and repeats the process until the transfer is complete. Burst DMA is generally considered the fastest type.
Figure 2. Cycle-stealing DMA occurs between two CPU cycles during DMA operation.
In contrast, single-byte transfer or cycle-stealing DMA takes a hint from the CPU and performs operations only between CPU instructions. It inserts an operation between two CPU cycles, so it actually "steals" CPU time. It is usually slower than burst DMA because it can perform only one operation at a time.
Figure 3. Transparent DMA occurs when the processor is performing tasks that do not access the data or address buses during DMA operation.
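The trade-offs between the three bus access schemes can be made concrete with a toy cycle-by-cycle model. This is a sketch under simplifying assumptions, not a description of any real bus: every DMA word and every CPU cycle takes exactly one bus cycle, cycle stealing strictly alternates between the DMA and the CPU, and the CPU's bus usage is given as a repeating trace. The names dma_mode_t, dma_result_t, and dma_simulate are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { DMA_BURST, DMA_CYCLE_STEAL, DMA_TRANSPARENT } dma_mode_t;

typedef struct {
    uint32_t total_cycles;  /* bus cycles until the last word is moved */
    uint32_t cpu_stalls;    /* cycles in which the CPU was held off the bus */
} dma_result_t;

/* cpu_needs_bus[i] is true when the CPU's i-th cycle uses the bus.
 * The trace repeats if the transfer outlasts it. */
dma_result_t dma_simulate(dma_mode_t mode, const bool *cpu_needs_bus,
                          size_t trace_len, uint32_t words)
{
    assert(cpu_needs_bus != NULL && trace_len > 0);
    dma_result_t r = {0, 0};
    size_t cpu_pos = 0;  /* next CPU cycle in the trace */

    if (mode == DMA_TRANSPARENT) {
        /* Transparent DMA never finishes if the CPU always owns the bus. */
        bool has_idle = false;
        for (size_t i = 0; i < trace_len; i++)
            has_idle = has_idle || !cpu_needs_bus[i];
        assert(has_idle);
    }

    while (words > 0) {
        bool dma_gets_bus;
        switch (mode) {
        case DMA_BURST:        /* DMA holds the bus until it is done */
            dma_gets_bus = true;
            break;
        case DMA_CYCLE_STEAL:  /* DMA and CPU take turns, one cycle each */
            dma_gets_bus = (r.total_cycles % 2u) == 0u;
            break;
        default:               /* transparent: only when the CPU is off the bus */
            dma_gets_bus = !cpu_needs_bus[cpu_pos % trace_len];
            break;
        }
        if (dma_gets_bus)
            words--;
        if (mode != DMA_TRANSPARENT && dma_gets_bus)
            r.cpu_stalls++;    /* CPU waits while the DMA owns the bus */
        else
            cpu_pos++;         /* CPU makes progress this cycle */
        r.total_cycles++;
    }
    return r;
}
```

With a CPU trace that leaves the bus free one cycle in four ({true, true, false, true}) and a four-word transfer, the model finishes in 4 cycles for burst, 7 for cycle stealing, and 15 for transparent mode. The CPU is stalled for 4 cycles in the first two cases and for none in transparent mode. In this simplified model burst and cycle stealing stall the CPU for the same total time, but burst concentrates the stall in one block while cycle stealing spreads it out.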
Figure 4. Block diagram of the DMA controller on the MAX32660.
An example of a burst DMA controller is the one in the MAX32660 (see Figure 4). The upper path corresponds to the data flow, while the lower path represents the control/status flow between the advanced high-performance bus (AHB) and the DMA logic. The DMA controller can be used as a buffer interface between the AHB and memory or peripheral modules, depending on how it is configured. The DMA logic sits between the DMA buffers and each peripheral, managing each unique peripheral bus independently during processing. The DMA can move up to 32 bytes in a single burst, as long as that much data can fit in the source/destination buffers. The buffers can hold up to 16 MB, and the DMA can be configured to send or receive I2C, SPI, I2S, and UART data in addition to performing internal memory transfers. Programming the DMA may vary slightly depending on the protocol, but peripheral transactions are completely managed by the DMA controller. An arbitration module controls bus access between the four DMA channels and the CPU, granting requests based on a priority system.
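The burst size and priority arbitration described above can be sketched as a small model. This is not the MAX32660 register interface or driver API. The types and functions here (dma_channel_t, dma_bursts_needed, dma_arbitrate, dma_service_one_burst) are hypothetical, and the priority convention (0 is highest, ties go to the lowest channel number) is an assumption made for illustration. Only the four-channel count and the 32-byte burst limit come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DMA_CHANNELS  4u   /* four channels, as described in the text */
#define DMA_MAX_BURST 32u  /* bytes per burst, as described in the text */

typedef struct {
    bool     request;    /* channel has data ready to move */
    uint8_t  priority;   /* assumed convention: 0 is the highest priority */
    uint32_t remaining;  /* bytes left in this channel's transfer */
} dma_channel_t;

/* Number of bursts needed to move `bytes` with a given burst size. */
uint32_t dma_bursts_needed(uint32_t bytes, uint32_t burst_size)
{
    assert(burst_size > 0u && burst_size <= DMA_MAX_BURST);
    return bytes / burst_size + ((bytes % burst_size) ? 1u : 0u);
}

/* Return the index of the requesting channel with the highest priority,
 * or -1 if no channel is requesting. Ties go to the lowest index. */
int dma_arbitrate(const dma_channel_t ch[DMA_CHANNELS])
{
    int winner = -1;
    for (unsigned i = 0; i < DMA_CHANNELS; i++) {
        if (!ch[i].request || ch[i].remaining == 0u)
            continue;
        if (winner < 0 || ch[i].priority < ch[winner].priority)
            winner = (int)i;
    }
    return winner;
}

/* Grant one burst to the winning channel and return the bytes moved. */
uint32_t dma_service_one_burst(dma_channel_t ch[DMA_CHANNELS])
{
    int w = dma_arbitrate(ch);
    if (w < 0)
        return 0u;
    uint32_t n = ch[w].remaining < DMA_MAX_BURST ? ch[w].remaining
                                                 : DMA_MAX_BURST;
    ch[w].remaining -= n;
    if (ch[w].remaining == 0u)
        ch[w].request = false;  /* transfer complete, drop the request */
    return n;
}
```

Under these assumptions, a 100-byte transfer needs four bursts (three of 32 bytes and one of 4). Calling dma_service_one_burst repeatedly drains higher-priority channels first, so a low-priority bulk transfer does not delay a short, urgent one.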
In summary, DMA is a critical feature for modern embedded systems that manage a large number of sensors and require high throughput, high efficiency, and low power operation. It is like a coprocessor dedicated to handling memory and peripheral bus transactions.
Many applications must use DMA to minimize power consumption and processor load. For example, health and wearable devices must handle large amounts of data, but they must also conserve battery power as much as possible while handling sensitive data. Analog Devices has adopted a fast burst DMA architecture on microcontrollers suitable for low-power wearables, such as the MAX32660 and MAX32670. In addition, DARWIN Arm microcontrollers, such as the MAX32666, are designed for wearable and IoT applications with integrated Bluetooth® 5. These devices use two 8-channel burst DMA controllers to support event-based transactions. They also include security hardware, with a secure bootloader and a Trust Protection Unit (TPU) that can accelerate ECDSA, SHA-2, and AES operations. From early IBM computers to network cards to today's secure, low-power wearables and IoT devices, DMA is an essential feature of modern digital systems.