Close cooperation between the microcontroller development team and the compiler developers results in generated code that is smaller and faster. This article describes the changes made to the microcontroller architecture and instruction set during the compiler development phase in order to make the ATMEL AVR microcontroller series a better target for C compilers.
The core of the AVR architecture is a fast-access RISC register file consisting of 32 8-bit general-purpose registers. In a single clock cycle, the microcontroller can load any two registers from this file into the Arithmetic Logic Unit (ALU), perform the required operation, and write the result back to any register. The ALU supports arithmetic and logic operations between two registers or between a register and a constant. Single-register operations are also performed in the ALU. The microcontroller uses a Harvard architecture, in which the program memory space is separate from the data memory space. Program memory is accessed through a single-level pipeline: while one instruction is being executed, the next instruction is prefetched from program memory. Because arithmetic and logic operations genuinely complete in a single cycle, the AVR microcontroller approaches a performance of one MIPS per MHz.
Figure 1 AVR architecture
Fine-tuning microcontrollers
There are many advantages to using high-level languages (HLL) instead of assembly language to develop microcontroller applications, but there has always been one major disadvantage: larger code size. When developing the AVR microcontrollers, we assumed that applications would be written in C, so the device had to make an efficient C compiler possible. To push this further, we started developing a C compiler before the AVR architecture and instruction set were finalized. We first had the draft architecture and instruction set evaluated by compiler experts from IAR Systems in Sweden, and the result was a microcontroller that is well suited to running code generated by a C compiler.
Addressing Modes
For the compiler to generate efficient code, the addressing modes must match the needs of the C language. The AVR architecture originally had two pointer registers. These two pointers can be used for indirect addressing, indirect addressing with post-increment, indirect addressing with pre-decrement, and indirect addressing with displacement, all of which support pointer operations well. In addition, there was a paged direct addressing mode for accessing variables in data memory.
Pointer displacement
Indirect addressing with displacement is a very useful addressing mode, especially from the perspective of a C compiler. For example, once a pointer is set to the first member of a struct, any other member within reach of the displacement can be accessed without changing the 16-bit pointer. Indirect addressing with displacement is also often used to access variables on the software stack. Function parameters and automatic (local) variables are often placed on the software stack, where they can be read and written without changing the pointer. Displacement addressing is also very useful when addressing elements in an array.
Although the displacement mode is very useful in many cases, its limited range is a problem. The displacement was originally limited to 16 positions, but in practice larger displacements are often needed. Whenever the target lies outside the displacement range, a new pointer value must be loaded. To extend the range of the displacement mode, we had to change other parts of the instruction set to free up enough encoding space. At the same time, we learned that it is difficult for C compilers to use the paged direct addressing mode. The paged direct addressing mode was therefore removed, and the freed encoding space was used to extend the displacement to 64 positions, which is enough to meet most indirect addressing requirements. The original paged direct addressing mode was replaced by a two-word unpaged direct addressing mode.
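The struct case can be illustrated with a minimal C sketch. The struct, its field names, and the helper fits_displacement are hypothetical and chosen only for illustration; the limit of 63 assumes the 64-position displacement range (0 to 63) of the final instruction set. When every member lies within that range, a compiler can keep one base pointer and reach each member with a displacement instead of recomputing the address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit: 64 displacement positions, i.e. offsets 0..63. */
#define MAX_DISPLACEMENT 63u

/* Hypothetical record; the field names are illustrative only. */
struct sensor {
    uint8_t  id;
    uint8_t  flags;
    uint16_t raw;
    uint32_t timestamp;
};

/* Returns nonzero if an object of `size` bytes starting at `offset`
 * lies entirely within the displacement range of one base pointer. */
int fits_displacement(size_t offset, size_t size) {
    return size > 0 && offset + size - 1 <= MAX_DISPLACEMENT;
}

/* Reads three members through the same base pointer `s`. On a target
 * with displacement addressing, none of these accesses needs the
 * pointer itself to be modified. */
uint32_t sensor_sum(const struct sensor *s) {
    assert(s != NULL);
    assert(fits_displacement(offsetof(struct sensor, timestamp),
                             sizeof s->timestamp));
    return (uint32_t)s->id + s->raw + s->timestamp;
}
```

A member whose last byte falls beyond offset 63 would force the compiler to load a new pointer value, which is the extra cost that the wider displacement range is meant to avoid.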
Number of Memory Pointers
The AVR microcontroller was originally designed with two 16-bit memory pointers. If a C compiler is to be used, one of the pointers must be dedicated to the software stack, which leaves only one general memory pointer. In many cases, memory must be copied from one area to another. With only one pointer, the program has to read one byte, reload the pointer with the destination address, write the byte, and then reload the pointer with the source address. If a third memory pointer is added (with reduced functionality), a memory area can be copied without reloading any pointer. As shown in the following example, a very efficient memory copy loop can be constructed by using the post-increment indirect addressing mode (assuming that pointer Z points to the first byte of the source and X to the first byte of the destination):
        LDI   R16,0x60   ; Load byte count
loop:   LD    R17,Z+     ; Load byte, increment pointer
        ST    X+,R17     ; Store byte, increment pointer
        SUBI  R16,1      ; Decrement counter
        BRNE  loop       ; Branch if more bytes
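For comparison, here is a minimal C sketch of the kind of source code that maps onto such a loop. The function name copy_bytes is hypothetical. Each `*dst++ = *src++` corresponds to one post-increment load followed by one post-increment store, provided the compiler can keep both pointers in pointer registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copies `count` bytes from `src` to `dst` using two independent
 * post-incremented pointers. Unlike the assembly loop, which always
 * runs at least once, this version also handles count == 0. */
void copy_bytes(uint8_t *dst, const uint8_t *src, uint8_t count) {
    assert(dst != NULL && src != NULL);
    while (count != 0) {
        *dst++ = *src++;   /* load with post-increment, store with post-increment */
        count--;           /* 8-bit counter, like R16 in the listing */
    }
}
```

With only one free pointer register, the same statement would require the pointer to be reloaded twice per byte, which is the overhead the third pointer removes.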
The ability to post-increment and pre-decrement pointers (+1, -1) is also very useful for implementing stacks, including the software run-time stack.
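The following C sketch shows how these two modes pair up in a downward-growing stack. The type soft_stack, the size of 16 bytes, and the function names are assumptions made for illustration, not part of the original design. A push is a store with pre-decrement and a pop is a load with post-increment, so neither needs a separate pointer adjustment.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_SIZE = 16 };  /* illustrative size only */

/* Hypothetical software stack that grows toward lower addresses. */
struct soft_stack {
    uint8_t  mem[STACK_SIZE];
    uint8_t *sp;           /* points at the most recently pushed byte */
};

void stack_init(struct soft_stack *s) {
    s->sp = s->mem + STACK_SIZE;   /* empty: one past the top of memory */
}

void stack_push(struct soft_stack *s, uint8_t value) {
    assert(s->sp > s->mem);        /* guard against overflow */
    *--s->sp = value;              /* store with pre-decrement */
}

uint8_t stack_pop(struct soft_stack *s) {
    assert(s->sp < s->mem + STACK_SIZE);  /* guard against underflow */
    return *s->sp++;               /* load with post-increment */
}
```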
Direct Addressing
As mentioned in the pointer displacement section, the AVR originally had a paged direct addressing mode, but this mode was difficult for the compiler to use and was inefficient. Since we needed more encoding space to extend the displacement, the paged direct addressing mode was eliminated. However, with no direct addressing mode at all, code efficiency would also suffer, because in some cases variables stored in data memory must be accessed directly. In particular, when working with static char variables, the code overhead would be large (up to 50%), because static variables must reside in data memory and cannot be placed in registers automatically. To overcome this inefficiency, we added unpaged direct addressing instructions that carry a full 16-bit address. In this way, any location in a 64KB data space can be addressed with a single instruction. To cover such a large memory area, each of these instructions occupies two 16-bit words.
If the number of bytes accessed is small (for example, reading a single character), direct addressing is more efficient than using a pointer. For larger objects, indirect addressing may still be more efficient (see the example below).
Loading of a character:
Indirect addressing (6 bytes):    Direct addressing (4 bytes):
LDI R30,LOW(CHARVAR)              LDS R16,CHARVAR
LDI R31,HIGH(CHARVAR)
LD  R16,Z
Loading of a long integer:
Indirect addressing (12 bytes):   Direct addressing (16 bytes):
LDI R30,LOW(LONGVAR)              LDS R0,LONGVAR
LDI R31,HIGH(LONGVAR)             LDS R1,LONGVAR+1
LDD R0,Z                          LDS R2,LONGVAR+2
LDD R1,Z+1                        LDS R3,LONGVAR+3
LDD R2,Z+2
LDD R3,Z+3
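The byte counts in these two comparisons follow a simple pattern, sketched below in C. The function names are hypothetical, and the model assumes that the pointer is not already loaded, that single-word instructions (LDI, LD, LDD) take 2 bytes, and that the two-word LDS takes 4 bytes, as in the listings.

```c
#include <assert.h>

/* Code size in bytes for loading n consecutive bytes through Z:
 * two LDI instructions to set up the pointer, then one load per byte. */
unsigned indirect_bytes(unsigned n) {
    assert(n > 0);
    return 2u * 2u + 2u * n;
}

/* Code size in bytes for loading n bytes with one two-word LDS each. */
unsigned direct_bytes(unsigned n) {
    assert(n > 0);
    return 4u * n;
}
```

Under these assumptions the two approaches break even at two bytes (8 bytes of code each); direct addressing is smaller for a single byte, and indirect addressing is smaller for three bytes or more.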
Zero Flag Propagation
Conditional branching relies on instructions that update the AVR status register, which consists of a number of flags. A conditional branch instruction that follows such an instruction either takes the branch or not, depending on the state of these flags. By using arithmetic instructions to set the flags, we can test the relationship between a number A and another number B. When the numbers are 8 bits wide, there is no problem, because all flags are determined by a single instruction. When the numbers are 16 or 32 bits wide (which is often the case in C), things become trickier. For example, a 32-bit subtraction is carried out as four consecutive 8-bit subtractions, and each 8-bit subtraction generates a new set of flags.
To propagate the carry flag, most processors include instructions that take the previously set value of the carry flag into account. One example is subtract with carry (SBC): executing SBC A,B computes A = A - B - carry. But for all conditional branches to work correctly, one more flag needs to be propagated: the zero flag.
Example:
A=R3:R2:R1:R0,
B=R7:R6:R5:R4
We want to subtract B from A and jump to a specific location if A = B. If the zero flag depends only on the last arithmetic instruction, the following sequence does not work correctly:
SUB  R0,R4
SBC  R1,R5
SBC  R2,R6
SBC  R3,R7        ; R3 = R7 => zero flag set
BREQ destination
This is because the flags tested by the BREQ instruction depend only on the flags set by the last SBC instruction. If the most significant bytes are equal, the zero flag is set and the branch is taken even if the 32-bit numbers are not equal. The same problem occurs with other conditional branches.
There are two ways to solve this problem. One is to save the flags produced by each instruction and then check all the zero flags after the fourth subtraction has completed. A more elegant method is to propagate the zero flag through the subtract-with-carry instruction (see below):
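Znew = Not(R7) AND Not(R6) AND ... AND Not(R0) AND Zold
(Here R7 ... R0 denote bits 7 to 0 of the result byte, not the registers used in the example above.)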
With the zero flag propagated in this way, every conditional branch can be evaluated correctly after the last subtraction has completed, because the other flags involved (the overflow and negative flags) depend only on the most significant byte.
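The effect of the propagation can be checked with a small C model of the four-step subtraction. This is only a sketch: the function name sub32_zero_flag and the propagate_zero switch are hypothetical, and the model tracks just the carry (borrow) and zero flags. The first step behaves like SUB (no borrow in, zero flag taken from the result alone) and the remaining three behave like SBC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models SUB followed by three SBC steps on the bytes of a and b,
 * least significant byte first, and returns the final zero flag.
 * With propagate_zero set, each SBC step computes
 * Znew = (result byte == 0) AND Zold; otherwise the zero flag
 * reflects only the most recent result byte. */
bool sub32_zero_flag(uint32_t a, uint32_t b, bool propagate_zero) {
    bool carry = false;
    bool zero = false;
    for (int i = 0; i < 4; i++) {
        uint8_t x = (uint8_t)(a >> (8 * i));
        uint8_t y = (uint8_t)(b >> (8 * i));
        unsigned borrow_in = carry ? 1u : 0u;
        uint8_t result = (uint8_t)(x - y - borrow_in);
        carry = (unsigned)x < (unsigned)y + borrow_in;  /* borrow out */
        bool byte_zero = (result == 0);
        if (i > 0 && propagate_zero) {
            zero = byte_zero && zero;   /* SBC with zero propagation */
        } else {
            zero = byte_zero;           /* SUB, or SBC without propagation */
        }
    }
    /* With propagation, the flag is set exactly when a equals b. */
    assert(!propagate_zero || zero == (a == b));
    return zero;
}
```

For A = 0x12000001 and B = 0x12000000, the model without propagation reports the zero flag as set because the top bytes are equal, whereas the model with propagation reports it as clear.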