Detailed explanation of the stack pointer register SP and the role of the stack-EEWORLD

A stack is a storage structure with a "last in, first out" (LIFO) access discipline. It is normally built on a region of RAM and accessed through an interface that enforces LIFO order.

How to implement the stack:

A region of random access memory is set aside as the stack area. Storing data into this area one item at a time is called "push". A pointer, the stack pointer (SP), tracks the current position: SP always points to the top of the stack, the unit that holds the most recently pushed item. Reading data back from the unit that SP points to is called "pop". Each push moves SP in one direction and each pop moves it back in the opposite direction, which gives the last in, first out behavior.
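The push and pop steps above can be sketched as a small C model. This is only an illustration, not how any particular CPU implements its stack: the Stack type, the 16-word size, and the choice to move SP toward lower indices on a push are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16   /* hypothetical size of the stack area */

/* A region of RAM plus a stack pointer, modelled as an array index. */
typedef struct {
    uint32_t ram[STACK_WORDS];
    int sp;   /* index of the most recently pushed word (top of stack) */
} Stack;

/* Empty stack: SP sits just past the end of the region. */
void stack_init(Stack *s) {
    s->sp = STACK_WORDS;
}

/* Push: adjust SP first, then store, so SP points at valid data. */
void stack_push(Stack *s, uint32_t value) {
    assert(s->sp > 0);             /* no room left: overflow */
    s->sp -= 1;
    s->ram[s->sp] = value;
}

/* Pop: read the word SP points at, then move SP the opposite way. */
uint32_t stack_pop(Stack *s) {
    assert(s->sp < STACK_WORDS);   /* nothing to pop: underflow */
    uint32_t value = s->ram[s->sp];
    s->sp += 1;
    return value;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1, and SP ends up back where it started.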

The stack is used widely in computing. Because data enters and leaves it in LIFO order, it is commonly used to save the return point of an interrupt, save the return address of a subroutine call, and save CPU context. It is also used to pass parameters between routines.

In ARM processors, register R13 is normally used as the stack pointer (SP). There are six banked copies of R13 in total: user mode and system mode share one, and each exception mode has its own dedicated R13. Each copy normally points to a stack reserved for its mode, so the ARM processor allows a program to maintain six separate stack spaces. These stack pointers are R13, R13_svc, R13_abt, R13_und, R13_irq, and R13_fiq, as shown in Table 2-3 Stack Pointer Register.
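The mapping from processor mode to stack pointer listed above can be written down as a small lookup. The enum and function names here are hypothetical and exist only for this sketch; the register names are the ones given in the text.

```c
#include <assert.h>
#include <string.h>

/* Processor modes that have a stack pointer in the scheme described above. */
typedef enum {
    MODE_USER, MODE_SYSTEM, MODE_SUPERVISOR, MODE_ABORT,
    MODE_UNDEFINED, MODE_IRQ, MODE_FIQ
} ArmMode;

/* Return the name of the R13 copy that is visible in a given mode. */
const char *banked_sp_name(ArmMode mode) {
    switch (mode) {
    case MODE_USER:
    case MODE_SYSTEM:     return "R13";      /* shared by user and system mode */
    case MODE_SUPERVISOR: return "R13_svc";
    case MODE_ABORT:      return "R13_abt";
    case MODE_UNDEFINED:  return "R13_und";
    case MODE_IRQ:        return "R13_irq";
    case MODE_FIQ:        return "R13_fiq";
    }
    return "";   /* not reached for valid ArmMode values */
}

/* Two modes share a stack exactly when they see the same R13 copy. */
int modes_share_sp(ArmMode a, ArmMode b) {
    return strcmp(banked_sp_name(a), banked_sp_name(b)) == 0;
}
```

Seven modes map onto six stack pointers because user and system mode see the same R13.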

To describe a stack precisely, two properties are distinguished. By the direction in which SP moves during a "push", a stack is either an 'ascending stack' (SP moves toward higher addresses) or a 'descending stack' (SP moves toward lower addresses). By whether the unit that SP points to holds stack data, a stack is either a 'full stack' (the unit pointed to by SP contains valid stack data) or an 'empty stack' (the unit pointed to by SP does not contain valid stack data; it is the next free unit).

Combining the two properties gives four stack types: full ascending, empty ascending, full descending, and empty descending.

The stack operation of the ARM processor is very flexible and supports all four types of stacks.
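The difference between the four stack types comes down to which way SP moves and whether it moves before or after the store. The following C sketch models a single push for each type; the names StackKind and push_word are made up for this example, and SP is modelled as a word index into an array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    FULL_ASCENDING, EMPTY_ASCENDING, FULL_DESCENDING, EMPTY_DESCENDING
} StackKind;

/* Push one word and return the new SP (a word index into mem).
 * Full stacks move SP first and then store, so SP ends on valid data.
 * Empty stacks store first and then move SP, so SP ends on a free unit. */
int push_word(StackKind kind, uint32_t *mem, int sp, uint32_t value) {
    switch (kind) {
    case FULL_ASCENDING:   sp += 1; mem[sp] = value; break;
    case EMPTY_ASCENDING:  mem[sp] = value; sp += 1; break;
    case FULL_DESCENDING:  sp -= 1; mem[sp] = value; break;
    case EMPTY_DESCENDING: mem[sp] = value; sp -= 1; break;
    }
    return sp;
}
```

The push instruction that appears in the disassembly later in this article behaves like the FULL_DESCENDING case: SP is decremented and then points at the data just stored.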

R13 serves as SP in ARM processors. When no stack is in use, R13 can also be used as a general-purpose data register.

The overall role of the stack:

1. Saving the context

The context is like a crime scene: it has to be recorded before anyone disturbs it, or it cannot be reconstructed afterwards. Here the context means the registers the CPU is using while it runs, such as r0 and r1. If execution jumps straight into a sub-function without saving them, their values are likely to be overwritten, because the sub-function uses the same registers. The registers that must survive are therefore saved temporarily (pushed onto the stack) around the call, and once the called function has finished they are popped off the stack to restore the context. The CPU can then continue executing correctly.

Register values are normally saved with the push instruction, which places the listed registers on the stack. After the called sub-function has finished, the pop instruction takes those values off the stack and puts them back into the same registers that were pushed.

The saved registers also include lr: when the bl instruction is used for the jump, the return address (the address of the instruction following the bl) is stored in lr. When the subroutine finishes, the saved lr value is popped from the stack into pc, which returns correctly from the sub-function.
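Why lr has to be saved when calls nest can be seen in a toy register model. The sketch below is an assumption-laden simplification: the Cpu type, the cpu_* helpers and the 64-word memory are invented for illustration, SP is a word index rather than a byte address, and only the effect of bl, push and pop on lr, pc and sp is modelled.

```c
#include <assert.h>
#include <stdint.h>

enum { SP = 13, LR = 14, PC = 15 };   /* r13, r14, r15 */
#define MEM_WORDS 64                  /* hypothetical stack memory size */

typedef struct {
    uint32_t r[16];
    uint32_t mem[MEM_WORDS];
} Cpu;

void cpu_init(Cpu *c) {
    for (int i = 0; i < 16; i++)
        c->r[i] = 0;
    c->r[SP] = MEM_WORDS;             /* empty full-descending stack */
}

/* bl: store the return address in lr, then jump to the target. */
void cpu_bl(Cpu *c, uint32_t return_addr, uint32_t target) {
    c->r[LR] = return_addr;
    c->r[PC] = target;
}

/* push {reg}: decrement sp, then store the register. */
void cpu_push(Cpu *c, int reg) {
    assert(c->r[SP] > 0);
    c->r[SP] -= 1;
    c->mem[c->r[SP]] = c->r[reg];
}

/* pop {reg}: load the register from the top of the stack, then increment sp. */
void cpu_pop(Cpu *c, int reg) {
    assert(c->r[SP] < MEM_WORDS);
    c->r[reg] = c->mem[c->r[SP]];
    c->r[SP] += 1;
}
```

If an outer function is entered with return address 0x100, pushes lr, and then calls an inner function with return address 0x200, lr now holds 0x200. The inner function returns by copying lr to pc; the outer function can still return to 0x100 only because it pops the saved value into pc.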

2. Passing parameters

When a C function is called, arguments are usually passed to the called function. Once the compiler has translated the call into assembly, those arguments must be placed somewhere the called function can reach them, otherwise they cannot be passed. There are two cases. First, if there are no more than 4 arguments, they are passed in registers (r0 to r3). Because any register values that still matter were saved in the context-saving step described above, these registers are free to carry the arguments. Second, if there are more than 4 arguments, the registers are not enough and the remaining arguments must be passed on the stack.
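The two cases can be summarised in a short C sketch. It assumes the simplest situation only: every argument is a single 4-byte word (no 64-bit or floating-point values), the first four go in r0 to r3, and the rest are placed on the stack in order starting at the address SP holds on entry to the called function. The function names are hypothetical.

```c
#include <assert.h>

#define NUM_ARG_REGS 4   /* r0 to r3 carry the first four word-sized arguments */

/* Register number (0..3) that carries argument `index` (0-based),
 * or -1 if that argument is passed on the stack instead. */
int arg_register(int index) {
    assert(index >= 0);
    return index < NUM_ARG_REGS ? index : -1;
}

/* Byte offset from SP, on entry to the called function, of a
 * stack-passed argument, or -1 if the argument is in a register. */
int arg_stack_offset(int index) {
    assert(index >= 0);
    return index < NUM_ARG_REGS ? -1 : (index - NUM_ARG_REGS) * 4;
}
```

For a call with six word-sized arguments, this places arguments 0 to 3 in r0 to r3, the fifth at offset 0 from SP and the sixth at offset 4.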

3. Temporary variables are stored on the stack

These temporary variables include the function's non-static local variables and other temporaries generated automatically by the compiler.
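A small C experiment makes this visible: each active call of a function gets its own copy of a non-static local variable. The function name and parameters below are invented for the sketch, and it deliberately makes no claim about where the compiler places each copy, only that the copies are distinct while the calls are active.

```c
#include <assert.h>
#include <stdint.h>

/* Recurse `levels` deep, recording the address of each call's local
 * variable in addrs[depth]. Returns the sum of the locals so that each
 * local is still needed after the nested call returns. */
int record_frames(int depth, int levels, uintptr_t *addrs) {
    assert(depth >= 0 && depth < levels);
    volatile int local = depth;        /* non-static local of this call */
    addrs[depth] = (uintptr_t)&local;
    int below = 0;
    if (depth + 1 < levels)
        below = record_frames(depth + 1, levels, addrs);
    return local + below;              /* this call's value is intact */
}
```

With three levels, the three recorded addresses all differ, and each call still sees its own value of local after the deeper calls have returned.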

An example of how to use the stack in C language function calls

The above explanation is a bit abstract, so it is easier to understand if we use an example to explain it briefly:

Run arm-linux-objdump -d u-boot > dump_u-boot.txt to produce the file dump_u-boot.txt. This file contains the disassembly of the u-boot executable, in which we can find the assembly code corresponding to each C function.

The following are the assembly codes of two functions, one is clock_init, and the other is the function CopyCode2Ram in the same C source file as clock_init:

33d0091c <CopyCode2Ram>:
33d0091c: e92d4070 push {r4, r5, r6, lr}
33d00920: e1a06000 mov r6, r0
33d00924: e1a05001 mov r5, r1
33d00928: e1a04002 mov r4, r2
33d0092c: ebffffef bl 33d008f0 <bBootFrmNORFlash>
......
33d00984: ebffff14 bl 33d005dc <nand_read_ll>
......
33d009a8: e3a00000 mov r0, #0 ; 0x0
33d009ac: e8bd8070 pop {r4, r5, r6, pc}

33d009b0 <clock_init>:
33d009b0: e3a02313 mov r2, #1275068416 ; 0x4c000000
33d009b4: e3a03005 mov r3, #5 ; 0x5
33d009b8: e5823014 str r3, [r2, #20]
......
33d009f8: e1a0f00e mov pc, lr

(1) First look at the assembly code for clock_init. Its first line, 33d009b0: e3a02313 mov r2, #1275068416 ; 0x4c000000, is not the push instruction we might expect, so no register values are saved on the stack. This is because the registers clock_init uses, r2, r3 and so on, do not conflict with the register r0 in use before the call, so nothing needs to be pushed here. One register deserves attention: r14, that is, lr. clock_init is called with the bl instruction, which automatically stores the return address in lr, and nothing in clock_init overwrites lr before it returns, so the return address does not need to be pushed onto the stack either. The last line of clock_init, 33d009f8: e1a0f00e mov pc, lr, is the familiar mov pc, lr: it copies lr, the return address saved at the call, into pc, so the function returns to the instruction following the call and the CPU carries on with the rest of the calling function.

(2) The first line of CopyCode2Ram, 33d0091c: e92d4070 push {r4, r5, r6, lr}, is the push we expect. r4, r5 and r6 are saved because the function overwrites them with copies of r0, r1 and r2 in the three mov instructions that follow. lr is saved because this function calls other functions:
33d0092c: ebffffef bl 33d008f0 <bBootFrmNORFlash>
......
33d00984: ebffff14 bl 33d005dc <nand_read_ll>
......

These calls also use the bl instruction, which overwrites the lr value that was set when CopyCode2Ram was entered, so lr must be pushed and saved temporarily as well.

Correspondingly, the last line of CopyCode2Ram, 33d009ac: e8bd8070 pop {r4, r5, r6, pc}, pops the previously pushed values back into their registers. The final value popped is the lr that was pushed at entry; it goes into pc, which returns from the function. Note also the second-to-last line of CopyCode2Ram, 33d009a8: e3a00000 mov r0, #0 ; 0x0, which puts 0 into the r0 register. This is how the return value is passed: the value here is 0, corresponding to "return 0" in the C code.

Of course, in hand-written assembly you could use some other register that is free at that point to pass the return value.

However, the register used to pass the return value is fixed by ARM's APCS (ARM Procedure Call Standard) register usage convention. It is best to follow the convention and not change it casually, which keeps the program standard.
