ARM instruction study notes


To learn the ARM instruction set, you first need to know what ARM is. The name stands for Advanced RISC Machine. And what is RISC? It stands for reduced instruction set computer: a microprocessor design that executes a smaller number of simpler instruction types, an approach that traces back to projects of the 1980s such as MIPS. Because fewer and simpler instructions need fewer transistors and circuit elements, such a processor can run at a higher speed (that is, execute more millions of instructions per second, or MIPS). I think this is also why the NDS game console uses ARM processors: they are small, fast and cheap.

ARM is also a British company that has designed a large number of high-performance, low-cost, low-power RISC processors, along with related technologies and software. Its designs suit many fields, such as embedded control, consumer/educational multimedia, DSP and mobile applications. ARM neither manufactures nor sells chips itself; it only licenses its chip technology. Look at the mobile phones around us: the Nokia N86, N97, N95, N96..., Motorola, Sony Ericsson, Apple and Samsung all use ARM processors in large quantities.

The two processors in the NDS that we will be dealing with, the ARM946E-S and the ARM7TDMI, are embedded 32-bit RISC CPUs designed by ARM. They offer low power consumption, high performance, small size and low price, which makes them well suited to handheld devices. The ARM946E-S is a core with a 5-stage pipeline, the Thumb extension, debug support and a Harvard bus; on the same process it delivers more than twice the performance of the ARM7TDMI. With this basic picture in mind, I started studying the ARM instructions.

References: "ARM" (Baidu Encyclopedia), "NDS" (Baidu Encyclopedia)

The next step is to learn the ARM instructions.

Because I had learned 8086 instructions before and had written assembly programs such as high-precision arithmetic, ARM instructions felt familiar.

I kept comparing the two instruction sets in my mind.

The ARM7TDMI(-S) has two instruction sets: the 32-bit ARM instruction set and the 16-bit Thumb instruction set. Put simply, ARM instructions expose every feature of the ARM core and are efficient and fast, while Thumb instructions are flexible and compact. Code in either set can call code in the other. The Thumb instruction set can be regarded as a compressed subset of ARM, introduced to address the code density problem. Every Thumb instruction has a corresponding ARM instruction, but Thumb is not a complete system on its own. For example, Thumb has no coprocessor instructions, no semaphore instructions, no instructions that access CPSR or SPSR, and no multiply-accumulate or 64-bit multiply instructions; its second operand is restricted; and apart from the conditional branch B, its instructions basically execute unconditionally. I will not list every difference. Besides offering many features that Thumb lacks, the biggest strength of the ARM instruction set is its efficiency.

Arm has 37 registers, including

31 general registers (Rxx)

6 status registers (xPSR)

A detailed description of these 37 registers can be obtained from the website nocash.emubase.de.

When learning ARM instructions, the first thing you meet is the addressing modes.

ARM addressing modes fall into nine categories. The right-hand column records whether 80x86 has a counterpart ("None" where it does not):

Addressing mode                      In 80x86?
1. Register addressing
2. Immediate addressing
3. Register shift addressing         None
4. Register indirect addressing
5. Base address addressing
6. Multiple register addressing      None
7. Stack addressing                  None
8. Block copy addressing             None
9. Relative addressing

A notable feature of ARM addressing is register shift addressing: the second operand can optionally be shifted before it is combined with the first operand, for example MOV R0, R2, LSL #3. In 80x86 this takes three steps: a copy (otherwise the value of the source register is destroyed), a shift, and then the actual operation. That is harder to read and also more error-prone to write. ARM can also address multiple registers in one instruction, which simplifies code considerably (specifically, you write far fewer LDR and STR instructions). Adding a "!" after the base register decides whether the updated address is written back to it. Even more convenient, we can choose how the pointer changes around each transfer. Block copy addressing, for example, comes in four forms: STMIA, STMIB, STMDA and STMDB. Whether the address increases or decreases, and whether it changes before or after each transfer, ARM has designed it all in advance; it is hard to call this anything but comprehensive. None of this is available in 80x86, and it makes programming much easier.
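The shifted second operand and the four block-copy suffixes can be modeled in a few lines. This is only a minimal Python sketch, not ARM code: the helper names `mov_shifted` and `stm_addresses` are made up for illustration, and registers are treated as plain 32-bit unsigned values.

```python
MASK32 = 0xFFFFFFFF

def mov_shifted(rm, amount):
    """Model of MOV Rd, Rm, LSL #amount: returns the value written to Rd."""
    return (rm << amount) & MASK32

def stm_addresses(base, count, mode):
    """Addresses written by STM<mode> Rn!, {count registers}.

    Returns (addresses in ascending order, base value after write-back).
    The lowest-numbered register always goes to the lowest address.
    """
    size = 4 * count
    if mode == "IA":      # increment after
        start, new_base = base, base + size
    elif mode == "IB":    # increment before
        start, new_base = base + 4, base + size
    elif mode == "DA":    # decrement after
        start, new_base = base - size + 4, base - size
    elif mode == "DB":    # decrement before
        start, new_base = base - size, base - size
    else:
        raise ValueError("unknown mode: " + mode)
    return [start + 4 * i for i in range(count)], new_base
```

For a base of 0x1000 and three registers, IA writes 0x1000 to 0x1008 and DB writes 0x0FF4 to 0x0FFC; the second return value is what the base register holds when "!" is present.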

By the way, another convenience is that load/store instructions take an H or B suffix to operate on a halfword or a byte; the default is a word. ARM can also transfer data between a set of registers and a contiguous block of memory in a single instruction, as LDMIA and STMIA do.

After reading the addressing, the next step is to focus on the Arm instructions.

The basic format of Arm instructions is:

<opcode>{<cond>}{S} <Rd>,<Rn>{,<operand2>}

[Figure: binary format of a 32-bit ARM instruction]

The items in <> are required, while the items in {} are optional.

opcode: instruction mnemonic

cond: execution condition

S: whether the instruction updates the condition flags in the CPSR register

Rd: destination register

Rn: register holding the first operand

operand2: the second operand
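To connect these fields with the 32-bit encoding, here is a small Python sketch that pulls them out of an instruction word. The bit positions are the standard layout for ARM data-processing instructions and are not taken from these notes; the function name `decode_data_processing` is hypothetical, and the example word 0xE0811181 was assembled by hand for ADD R1, R1, R1, LSL #3.

```python
def decode_data_processing(word):
    """Split a 32-bit ARM data-processing instruction into its fields."""
    return {
        "cond": (word >> 28) & 0xF,      # execution condition (0xE = always)
        "immediate": (word >> 25) & 1,   # 1 if operand2 is an immediate
        "opcode": (word >> 21) & 0xF,    # operation (0x4 = ADD)
        "s": (word >> 20) & 1,           # 1 if the flags in CPSR are updated
        "rn": (word >> 16) & 0xF,        # first operand register
        "rd": (word >> 12) & 0xF,        # destination register
        "operand2": word & 0xFFF,        # shifter operand (register or immediate)
    }

fields = decode_data_processing(0xE0811181)
```

Here `fields["operand2"]` is 0x181: a shift amount of 3 in bits 11 to 7, shift type LSL in bits 6 to 5, and register R1 in bits 3 to 0.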

ARM instructions can be divided into 6 categories:

(1) Jump instructions: B, BL, BLX, BX

(2) Data processing instructions: data transfer, arithmetic and logic operations, comparison

(3) Program status register transfer instructions

(4) Load/store instructions

(5) Coprocessor instructions

(6) Exception-generating instructions

Notice the {<cond>} field; it is another distinctive feature of ARM instructions: almost any instruction can be combined with a condition. For example, to find the larger of ra and rb and store it in the rc register, in 80x86 you need to write

MOV rc, ra

CMP ra, rb

JG NEXT

MOV rc, rb

NEXT: ..............

In ARM, it can simply be written as

MOV rc, ra

CMP ra, rb

MOVLE rc, rb
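To see why a single conditional move is enough, the flag logic can be simulated. The sketch below is an illustrative Python model, not ARM code; `cmp_flags`, `condition_passes` and `larger_signed` are hypothetical names, only a few condition codes are included, and the signed LE condition is used as the counterpart of the x86 JG test.

```python
MASK32 = 0xFFFFFFFF

def cmp_flags(a, b):
    """Flags (N, Z, C, V) set by CMP a, b, which computes a - b on 32 bits."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    n = result >> 31                      # negative
    z = int(result == 0)                  # zero
    c = int(a >= b)                       # carry set means no borrow
    v = ((a ^ b) & (a ^ result)) >> 31    # signed overflow
    return n, z, c, v

def condition_passes(cond, flags):
    """Evaluate a condition suffix against the flags."""
    n, z, c, v = flags
    table = {
        "EQ": z == 1,
        "NE": z == 0,
        "CS": c == 1,
        "CC": c == 0,
        "GT": z == 0 and n == v,
        "LE": z == 1 or n != v,
        "AL": True,
    }
    return table[cond]

def larger_signed(ra, rb):
    rc = ra                               # MOV   rc, ra
    flags = cmp_flags(ra, rb)             # CMP   ra, rb
    if condition_passes("LE", flags):
        rc = rb                           # MOVLE rc, rb
    return rc
```

With 0xFFFFFFFF (that is, -1) and 1 as inputs, `larger_signed` returns 1, whereas the unsigned CC condition would treat 0xFFFFFFFF as the larger value.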

Flexible use of the second operand can also greatly improve the efficiency of the code. For example, to multiply the number in the R1 register by 9, in ARM we can simply write

ADD R1, R1, R1, LSL #3

In 80x86, it takes several steps:

MOV R2, R1

SHL R2, 3

ADD R1, R2

That is to say, ARM instructions are more user-friendly. Combined with my later study of ARM pseudo-instructions, I personally feel that the ARM instruction set is the high-level language of assembly.


In ARM, multiplication can take any two registers as operands and put the result in any register. In 80x86, the multiplicand must first be placed in AL/AX, and the low and high parts of the product end up in AL/AX and AH/DX respectively, which is troublesome!

ARM's exception types are:

(1) Reset exception

(2) Undefined instruction exception

(3) Software interrupt (SWI) exception

(4) Prefetch abort exception

(5) Data abort exception

(6) Interrupt request (IRQ) exception

(7) Fast interrupt request (FIQ) exception

There are so many types that learning them is a headache, unlike 80x86, where everything goes through the INT instruction from beginning to end. I do not understand interrupts very well yet and need to keep studying.

The next step is to learn pseudo-instructions. First of all, pseudo-instructions are not instructions in the ARM instruction set. They are defined by the assembler for the convenience of programming. They can be used like other ARM instructions, but the assembler replaces them with equivalent ARM instructions. Although some pseudo-instructions are extremely simple substitutions, they make programming much easier. You could call them a special kind of mnemonic.

My study of pseudo-instructions has so far been conceptual rather than deep. Although I know what many of them mean, I have no idea yet where to use them. That will have to wait until I have more time.

In general, ARM pseudo-instructions fall into the following groups:

1. Symbol definition pseudo-instructions

    Global variable declarations: GBLA, GBLL, and GBLS.
    Local variable declarations: LCLA, LCLL, and LCLS.
    Variable assignments: SETA, SETL, and SETS.
    Define a name for a general register list: RLIST.
    Define a name for a coprocessor register: CN.
    Define a name for a coprocessor: CP.
    Define a name for a VFP register: DN and SN.
    Define a name for an FPA floating point register: FN.

2. Data definition pseudo-instructions

         Declare a literal pool: LTORG.
         Define the first address of a structured memory table: MAP.
         Define a data field in the structured memory table: FIELD.
         Allocate a block of memory and initialize it with 0: SPACE.
         Allocate a section of byte memory units and initialize them with the specified data: DCB.
         Allocate a section of word memory units and initialize them with the specified data: DCD and DCDU.
         Allocate a section of word memory units and initialize each unit to its offset relative to the static base address register: DCDO.
         Allocate a section of double-word memory units and initialize them with double-precision floating-point data: DCFD and DCFDU.
         Allocate a section of word memory units and initialize them with single-precision floating-point data: DCFS and DCFSU.
         Allocate a section of word memory units and initialize them with the specified data, marking the memory as code instead of data: DCI.
         Allocate a section of double-word memory units and initialize them with 64-bit integer data: DCQ and DCQU.
         Allocate a section of half-word memory units and initialize them with the specified data: DCW and DCWU.

Assertion error: ASSERT. This pseudo-instruction is quite magical. You can write assertions such as ASSERT top<>temp at the beginning of the program. During the assembler's second pass over the source, if the ASSERT condition does not hold, the pseudo-instruction reports an error, which helps reduce mistakes. It feels a bit like try and catch in C++, although the check happens at assembly time rather than at run time.

3. Assembly control pseudo-instructions/macro pseudo-instructions

Assembly control pseudo instructions are used for conditional assembly, macro definition, repeated assembly control, etc. Such pseudo instructions are as follows:
     Conditional assembly control: IF, ELSE and ENDIF
     Macro definition: MACRO and MEND
     Repeated assembly: WHILE and WEND

This is where it starts to resemble a high-level language: pseudo-instructions can be used to implement certain high-level statements. In fact, isn't a high-level language just a wrapper around assembly instructions? It is much the same.

It is worth mentioning MACRO and MEND, which feel like #define in C and are very powerful.

The format of the macro pseudo-instructions is:
MACRO
{$label} macroname {$parameter} {$parameter}…

The macro body follows and is closed by MEND. In this format:
$label: when the macro is expanded, $label is replaced by the symbol supplied at the call site, usually a label. A $ before a symbol means that the symbol is replaced by its corresponding value during assembly.
macroname: the name of the macro being defined.
$parameter: a parameter of the macro. When the macro is expanded, it is replaced by the corresponding value, much like a formal parameter of a function.

It can even pass parameters!! This pseudo-instruction makes me want to try it. There will surely be a chance in the future!

Here is a simple, hypothetical example:

In C: #define bigger(a,b) ((a) > (b))

It can be written as (it may be wrong, it is just a try):

MACRO
$label bigger $a, $b
$label
; GL1 is assumed to be a register name defined beforehand (for example with RN)
CMP $a, $b
MOVGT GL1, #1
MOVLE GL1, #0
MEND

Invocation: test bigger a, b (where a and b are registers)

The comparison result is then available in GL1.

Of course we could compare directly; this is only for demonstration.

4. Miscellaneous pseudo instructions

Miscellaneous pseudo-instructions are commonly used in assembly programming design, such as segment definition pseudo-instructions, entry point setting pseudo-instructions, include file pseudo-instructions, label export or import declarations, etc. Such pseudo-instructions are as follows:
         Boundary alignment: ALIGN.
         Segment definition: AREA.
         Instruction set definition: CODE16 and CODE32.
         End of assembly: END.
         Program entry: ENTRY.
         Constant definition: EQU.
         Declare that a symbol can be referenced by other files: EXPORT and GLOBAL.
         Declare an external symbol: IMPORT and EXTERN.
         Include files: GET and INCLUDE.
         Include files that are not assembled: INCBIN.
         Keep local symbols in the symbol table: KEEP.
         Disable floating-point instructions: NOFP.
         Indicate the dependency between two segments: REQUIRE.
         Stack 8-byte alignment: REQUIRE8 and PRESERVE8.
         Name a specific register: RN.
         Mark the boundaries of the scope of use of local labels: ROUT.

The last part is mixed programming of C and assembly.

(1) Embedding assembly in C. I once read a story: a physicist wrote a program to simulate the movement of celestial bodies. After optimizing the algorithm and the instructions, he cut the time needed to produce results from several years to just over ten minutes. A very important step in this process was to rewrite some heavily reused blocks of the high-level language program directly in assembly language, which greatly shortened the running time. So I was also curious about how to embed assembly in a high-level language.

__asm
{
command[;command]
...
[command]
}

The built-in assembler has many restrictions on registers, constants, labels, etc., so I won't go into details.

(2) Embedding C language programs in assembly

(3) C and assembly call each other

While learning ARM instructions I ran into many problems, and the first time I met each of them I was often very confused. There are also some points that deserve attention. Of course there are far more problems and pitfalls than the ones here; these are just the more typical ones, written down to share with you:

1. The #immed_8r constant expression. "The constant must correspond to an 8-bit bitmap, that is, the constant is obtained by rotating an 8-bit constant by an even number of bits."

What this means: the chip treats #immed_8r as a 32-bit number, but that number must be obtainable by rotating an 8-bit value right by an even number of bits. For example, rotating 0101 1010 (0x5A) right by 2 bits (an even number) gives 1000 0000 0000 0000 0000 0000 0001 0110, which is therefore a valid constant.

However, 1010 0000 0000 0000 0000 0000 0001 0110 does not conform to this rule and will fail to assemble: it can be obtained by rotating 1011 0101 right by 3 bits, but not by rotating any 8-bit value by an even number of bits.

1011 0000 0000 0000 0000 0000 0001 0110 does not conform to this rule either: its significant bits, 1 0110 1011, span 9 bits.
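The rule can be checked mechanically by trying all sixteen even rotations. The following Python sketch is one way to do it; the function names are made up for illustration, and the returned pair means "imm8 rotated right by that many bits gives the constant".

```python
MASK32 = 0xFFFFFFFF

def rotate_left(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32

def encode_immed_8r(constant):
    """Return (imm8, rotate_right) if the constant is encodable, else None."""
    constant &= MASK32
    for rotation in range(0, 32, 2):
        # Undo a right rotation by rotating left by the same amount.
        imm8 = rotate_left(constant, rotation)
        if imm8 < 0x100:
            return imm8, rotation
    return None
```

Applied to the three bit patterns above (0x80000016, 0xA0000016 and 0xB0000016), only the first is encodable, as 0x5A rotated right by 2.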


2. What is sign extension?

When a 16-bit value is assigned to a 32-bit location, zero (unsigned) extension fills the upper 16 bits with zeros, while sign extension fills the upper 16 bits with copies of the highest bit (the sign bit) of the 16-bit value.

For example, 1101010110101010------->11111111111111111101010110101010

0101010110101010------->00000000000000000101010110101010

As for why this is done, refer to the definition of two's complement. Here is a simple way to compute it:

The N-bit two's complement representation of a negative number with absolute value k is 2^N - k. This is a little clearer than inverting the bits and adding one.
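Both ideas are short enough to write down directly. This is a minimal Python sketch with hypothetical helper names; values are handled as non-negative integers holding the raw bit pattern.

```python
def zero_extend16(value):
    """Unsigned extension: the upper 16 bits of the 32-bit result are zero."""
    return value & 0xFFFF

def sign_extend16(value):
    """Signed extension: the upper 16 bits copy bit 15 of the input."""
    value &= 0xFFFF
    if value & 0x8000:
        value |= 0xFFFF0000
    return value

def twos_complement(k, bits):
    """Bit pattern of -k in bits-wide two's complement, computed as 2^bits - k."""
    return ((1 << bits) - k) & ((1 << bits) - 1)
```

For instance, `twos_complement(5, 8)` gives 0xFB, the same pattern obtained by inverting 0000 0101 and adding one.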

3. Stack addressing comes in four modes: FD, FA, ED and EA (as in LDMFD/STMFD, LDMFA/STMFA, LDMED/STMED and LDMEA/STMEA).

Note that in these suffixes F stands for full, E for empty, A for ascending and D for descending. (In the block copy suffixes IA, IB, DA and DB, by contrast, A stands for after and B for before.)

For ease of understanding I describe the modes in C. Assume that stack[] is an array used as a stack and top is the index of the top of the stack.

A stack is a data structure that works in a First In Last Out (FILO) manner and uses a special register called the stack pointer to indicate the current operating position. The stack pointer always points to the top of the stack.

When the stack pointer points to the last data pushed onto the stack, it is called a full stack; when the stack pointer points to the next empty location where data will be placed, it is called an empty stack.

In addition, depending on the direction of growth, a stack is either ascending or descending. A stack that grows from low addresses toward high addresses is called an ascending stack, and one that grows from high addresses toward low addresses is called a descending stack. This gives four stack working modes, all of which ARM microprocessors support:

◎ Full descending (FD) stack

The stack base is at a high address, and the stack grows toward lower addresses. The stack pointer always points to the last element of the stack (the most recently pushed item).

The ARM-Thumb procedure call standard and the ARM and Thumb C/C++ compilers always use a full descending stack.

C language representation: stack[--top] = value

◎ Full ascending (FA) stack

The stack base is at a low address, and the stack grows toward higher addresses. The stack pointer always points to the last element of the stack (the most recently pushed item).

C language representation: stack[++top] = value

◎ Empty descending (ED) stack

The stack base is at a high address, and the stack grows toward lower addresses. The stack pointer always points to the next empty location where data will be placed.

C language representation: stack[top--] = value

◎ Empty ascending (EA) stack

The stack base is at a low address, and the stack grows toward higher addresses. The stack pointer always points to the next empty location where data will be placed.

C language representation: stack[top++] = value

Assembly instructions for operating the stack: STMFD/LDMFD, STMFA/LDMFA, STMED/LDMED and STMEA/LDMEA.
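The four modes differ only in the direction of growth and in whether the pointer moves before or after the store: a full stack moves the pointer first, an empty stack stores first. The Python sketch below models this with a list standing in for memory; the class name `ArmStack` and the index-based stack pointer are assumptions made for illustration.

```python
class ArmStack:
    """Toy model of the FD, FA, ED and EA stack types on a small memory."""

    def __init__(self, full, ascending, size=8):
        self.full = full
        self.step = 1 if ascending else -1
        self.mem = [None] * size
        if ascending:
            # Base at the low end of memory.
            self.sp = -1 if full else 0
        else:
            # Base at the high end of memory.
            self.sp = size if full else size - 1

    def push(self, value):
        if self.full:
            self.sp += self.step        # move first, then store
            self.mem[self.sp] = value
        else:
            self.mem[self.sp] = value   # store first, then move
            self.sp += self.step

    def pop(self):
        if self.full:
            value = self.mem[self.sp]   # read first, then move back
            self.sp -= self.step
        else:
            self.sp -= self.step        # move back first, then read
            value = self.mem[self.sp]
        return value
```

`ArmStack(full=True, ascending=False)` corresponds to the full descending stack used by the compilers: its first push lands in the highest slot of the memory.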

4. Arithmetic shift/logical shift/rotate

Logical right shift: the highest bit is filled with 0 and the lowest bit moves into CF; each shift by one position divides by 2. It is generally used for unsigned numbers. For example, 133/8 = 16 with a remainder of 5:

MOV AL,10000101B
MOV CL,03H
SHR AL,CL ; AL=10H=16

Arithmetic right shift: the highest bit (the sign bit) is kept unchanged instead of being filled with 0, and the lowest bit moves into CF; each shift by one position also divides by 2. It is generally used for signed numbers. For example, -128/8 = -16:

MOV AL,10000000B
MOV CL,03H
SAR AL,CL ; AL=0F0H=-16

[Figure: the ARM shift operations LSL, LSR, ASR and ROR, corresponding to logical left shift, logical right shift, arithmetic right shift and rotate right respectively]
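The four shift types are easy to compare side by side on 8-bit values, matching the AL examples. This is a minimal Python sketch with hypothetical function names; the carry flag is left out.

```python
def lsl8(value, amount):
    """Logical shift left: zeros enter from the right."""
    return (value << amount) & 0xFF

def lsr8(value, amount):
    """Logical shift right: zeros enter from the left."""
    return (value & 0xFF) >> amount

def asr8(value, amount):
    """Arithmetic shift right: copies of the sign bit enter from the left."""
    value &= 0xFF
    if value & 0x80:
        value -= 0x100          # reinterpret as a negative number
    return (value >> amount) & 0xFF

def ror8(value, amount):
    """Rotate right: bits leaving on the right re-enter on the left."""
    amount %= 8
    value &= 0xFF
    return ((value >> amount) | (value << (8 - amount))) & 0xFF
```

With the bit pattern 10000101B and a shift of 3, `lsr8` gives 10H, while `asr8` applied to 10000000B gives 0F0H.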

5. About ARM's B and BL jump instructions

Assume that the address of the jump instruction is A and the address of the jump target is B. The B and BL instructions store an offset, which is calculated as follows:
1. Compute B-(A+8). A+8 is used because, owing to the ARM pipeline, the actual value of PC is A+8 while the instruction at A is executing.
2. The value from step 1 is a multiple of 4, because ARM instructions are 4-byte aligned, so its lowest two bits are 00. The value is therefore shifted right by two bits.
3. The result is the final offset stored in the instruction.

When executing:
1. Take out the offset.
2. Shift it left by two bits.
3. Add it to PC. PC now holds exactly the target address, so the instruction at the target address is fetched next, and the first two stages of the pipeline are flushed.

But why subtract 8? Because the ARM7 has a three-stage pipeline.

So what is a three-stage pipeline?

PC stands for program counter. The pipeline has three stages, so every instruction passes through three steps: 1. fetch (load the instruction from memory); 2. decode (identify the instruction to be executed); 3. execute (process the instruction and write the result back to a register). While one instruction is executing, the fetch stage is already two instructions ahead, that is, 8 bytes.
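The offset calculation and its reversal at execution time can be written out as two small functions. This Python sketch uses made-up function names and assumes the offset field is the 24-bit signed field of the B/BL encoding; range checking is omitted.

```python
def encode_branch_offset(instr_addr, target_addr):
    """Offset field stored in a B/BL at instr_addr that jumps to target_addr."""
    byte_offset = target_addr - (instr_addr + 8)   # PC reads as A + 8
    if byte_offset % 4:
        raise ValueError("target is not word aligned relative to the branch")
    return (byte_offset >> 2) & 0xFFFFFF           # drop the two zero bits

def branch_target(instr_addr, field):
    """Target address recovered from the 24-bit offset field."""
    if field & 0x800000:                           # sign-extend 24 bits
        field -= 0x1000000
    return (instr_addr + 8 + (field << 2)) & 0xFFFFFFFF
```

A branch to itself (B equal to A) therefore stores 0xFFFFFE, that is, -2 words, which compensates exactly for the 8 bytes by which PC runs ahead.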

6. What is a soft interrupt?

Soft interrupts borrow the concept of hardware interrupts and simulate it in software, achieving asynchronous execution at the macro level. In many respects a soft interrupt resembles a "signal", and the term is the counterpart of "hard interrupt": "Hard interrupts are interrupts to the CPU by external devices", "soft interrupts are usually interrupts to the kernel by hard interrupt service programs", and "signals are interrupts to a process by the kernel (or other processes)" (Chapter 3 of "Linux Kernel Source Code Scenario Analysis").

A typical application of soft interrupts is the so-called "bottom half". The name comes from splitting hardware interrupt handling into two stages, the "top half" and the "bottom half": the top half runs with interrupts masked and completes the critical work, while the bottom half is less urgent and usually more time-consuming, so the system schedules it later, outside the interrupt service context. The bottom half is also what inspired the kernel to develop the current soft interrupt mechanism. Soft interrupts are an upgrade of the original "bottom half" processing in Linux, a new mechanism built on it to suit multi-CPU, multi-threaded systems.

Generally speaking, soft interrupts are triggered by events inside the kernel (such as a process timeout), but a large number of them are also caused by hardware interrupts. For example, when a printer port raises a hardware interrupt, the corresponding hard interrupt handler runs and in turn raises a soft interrupt for the operating system kernel, which then wakes up the process sleeping in the printer task queue.

In network programming, soft interrupts are used to trigger the execution of protocol layer code.

7. About switching the processor state

A program cannot switch to the Thumb state by directly modifying the T control bit in CPSR. The state switch must be done through an instruction such as BX.
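What BX does with its register operand can be summed up in two lines: bit 0 of the register selects the new state, and the remaining bits become the branch target. The Python sketch below only illustrates that rule; the function name `bx` is simply borrowed from the mnemonic.

```python
def bx(rm):
    """Model of BX Rm: returns (new PC, True if the core enters Thumb state)."""
    thumb = bool(rm & 1)            # bit 0 is copied into the T bit of CPSR
    pc = rm & ~1 & 0xFFFFFFFF       # bit 0 is cleared in the target address
    return pc, thumb
```

So branching to 0x8001 lands at 0x8000 in Thumb state, while branching to 0x8000 lands at the same address in ARM state.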

8. For an LDMIA instruction whose register list includes the base register Rn, the final value of Rn is the value loaded from memory, not the incremented address.
