Preface
I will use some space here to describe how Linux initializes the interrupt vector table on the arm architecture. Because the method is so widely applicable, I call it code migration. You may say that everyone knows how to move code, isn't it just copying? That is true, but there is skill in copying too. The copy itself is very simple: it is just memcpy, which needs no explanation. What I want to talk about here is how to design your code so that it can be copied at will. In other words, how to write position-independent code, which can run wherever it is placed. I have used a similar method in startup code before, and I will talk about it today.
Scenario 1: Copy
Let's look at the actual action first. The code is located in arch/arm/kernel/traps.c, kernel version 2.6.27. This is initialization code, reached through setup_arch()->early_trap_init(). Friends who are familiar with the initialization part may have seen this code.
void __init early_trap_init(void)
{
unsigned long vectors = CONFIG_VECTORS_BASE;
extern char __stubs_start[], __stubs_end[];
extern char __vectors_start[], __vectors_end[];
extern char __kuser_helper_start[], __kuser_helper_end[];
int kuser_sz = __kuser_helper_end - __kuser_helper_start;
/*
* Copy the vectors, stubs and kuser helpers (in entry-armv.S)
* into the vector page, mapped at 0xffff0000, and ensure these
* are visible to the instruction stream.
*/
memcpy((void *)vectors, __vectors_start, __vectors_end - __vectors_start);
memcpy((void *)vectors + 0x200, __stubs_start, __stubs_end - __stubs_start);
memcpy((void *)vectors + 0x1000 - kuser_sz, __kuser_helper_start, kuser_sz);
…
}
The actual copy operation is clear at a glance: it is two memcpy calls (the third one copies something else, the principle is the same, so I won't discuss it here). The destination of the copy is vectors, whose value is CONFIG_VECTORS_BASE. This is generally 0xffff0000, though of course you can configure the value yourself according to the hardware settings. What is copied? The first part is the code from __vectors_start to __vectors_end, which goes to the start of vectors. The second part is the code from __stubs_start to __stubs_end, which goes to vectors + 0x200. In other words, after the copy the two parts start 0x200, or 512 bytes, apart.
Let's take a look at what __vectors_start, __vectors_end, __stubs_start, and __stubs_end are. Once we know where they are defined, we will know what is going on.
Scenario 2: The protagonist makes his debut
They are hidden in arch/arm/kernel/entry-armv.S. This file holds the entry code for each mode on arm. Friends who are familiar with arm know that it has several modes. Those who don't can look it up themselves, so I won't go into it. Let's take a fragment, the part related to our explanation. To make it clearer for everyone, I deleted some code and comments and kept the main trunk. Interested friends can read the full source code. It is well worth studying.
.globl __stubs_start
__stubs_start:
/*
* Interrupt dispatcher
*/
vector_stub irq, IRQ_MODE, 4
// Please note that vector_stub is a macro that expands into a block of code, and the jump table follows it. The expansion for irq is shown below. (vector_stub dabt, ABT_MODE, 8 and the others expand in the same way, so they are not shown here.)
// -------------------------------- begin Expand
.align 5
vector_irq:
sub lr, lr, #4
@ Save r0, lr_<exception> (parent PC) and spsr_<exception>
@ (parent CPSR)
@
stmia sp, {r0, lr} @ save r0, lr
mrs lr, spsr
str lr, [sp, #8] @ save spsr
@ Prepare for SVC32 mode. IRQs remain disabled.
@
mrs r0, cpsr
eor r0, r0, #(IRQ_MODE ^ SVC_MODE)
msr spsr_cxsf, r0
@ the branch table must immediately follow this code
@
and lr, lr, #0x0f
mov r0, sp
ldr lr, [pc, lr, lsl #2]
movs pc, lr @ branch to handler in SVC mode
// -------------------------------- end Expand
.long __irq_usr @ 0 (USR_26 / USR_32)
.long __irq_invalid @ 1 (FIQ_26 / FIQ_32)
.long __irq_invalid @ 2 (IRQ_26 / IRQ_32)
.long __irq_svc @ 3 (SVC_26 / SVC_32)
. . .
.long __irq_invalid @ f
/*
* Data abort dispatcher
* Enter in ABT mode, spsr = USR CPSR, lr = USR PC
*/
vector_stub dabt, ABT_MODE, 8
.long __dabt_usr @ 0 (USR_26 / USR_32)
.long __dabt_invalid @ 1 (FIQ_26 / FIQ_32)
.long __dabt_invalid @ 2 (IRQ_26 / IRQ_32)
.long __dabt_svc @ 3 (SVC_26 / SVC_32)
. . .
.long __dabt_invalid @ f
/*
* Prefetch abort dispatcher
* Enter in ABT mode, spsr = USR CPSR, lr = USR PC
*/
vector_stub pabt, ABT_MODE, 4
.long __pabt_usr @ 0 (USR_26 / USR_32)
.long __pabt_invalid @ 1 (FIQ_26 / FIQ_32)
.long __pabt_invalid @ 2 (IRQ_26 / IRQ_32)
.long __pabt_svc @ 3 (SVC_26 / SVC_32)
. . .
.long __pabt_invalid @ f
/*
* Undef instr entry dispatcher
* Enter in UND mode, spsr = SVC/USR CPSR, lr = SVC/USR PC
*/
vector_stub und, UND_MODE
.long __und_usr @ 0 (USR_26 / USR_32)
.long __und_invalid @ 1 (FIQ_26 / FIQ_32)
.long __und_invalid @ 2 (IRQ_26 / IRQ_32)
.long __und_svc @ 3 (SVC_26 / SVC_32)
. . .
.long __und_invalid @ f
.align 5
vector_fiq:
disable_fiq
subs pc, lr, #4
vector_addrexcptn:
b vector_addrexcptn
/*
* We group all the following data together to optimize
* for CPUs with separate I & D caches.
*/
.align 5
.LCvswi:
.word vector_swi
.globl __stubs_end
__stubs_end:
.equ stubs_offset, __vectors_start + 0x200 - __stubs_start
.globl __vectors_start
__vectors_start:
swi SYS_ERROR0
b vector_und + stubs_offset
ldr pc, .LCvswi + stubs_offset
b vector_pabt + stubs_offset
b vector_dabt + stubs_offset
b vector_addrexcptn + stubs_offset
b vector_irq + stubs_offset
b vector_fiq + stubs_offset
.globl __vectors_end
__vectors_end:
To make it clearer for everyone, I simplified the code structure again as follows:
.globl __stubs_start
__stubs_start:
.align 5
vector_irq:
[code part] // Expand code
[jump table part] //Address jump table
. . .
.align 5
vector_dabt:
[code part]
[jump table part]
. . .
.align 5
vector_pabt:
[code part]
[jump table part]
. . .
.align 5
vector_und:
[code part]
[jump table part]
. . .
.align 5
vector_fiq:
. . .
.globl __stubs_end
__stubs_end:
.globl __vectors_start
__vectors_start:
swi SYS_ERROR0
b vector_und + stubs_offset
ldr pc, .LCvswi + stubs_offset
b vector_pabt + stubs_offset
b vector_dabt + stubs_offset
b vector_addrexcptn + stubs_offset
b vector_irq + stubs_offset
b vector_fiq + stubs_offset
.globl __vectors_end
__vectors_end:
I won't spend too much time explaining the meaning of the code here, as that is not the purpose of this article. As long as you understand the structure, you have achieved the goal. But I will take some time to study the characteristics of the expanded code part. This part of the code is position-independent. Let's study it a little to see why it is written this way.
.align 5
vector_irq:
[code part] // Expand code
[jump table part] //Address jump table
. . .
First of all, every stub has roughly the same structure: some code in front and a jump table behind it. The jump table holds a list of addresses. Let's cut out the end of the code part and take a look:
. . .
@ the branch table must immediately follow this code
@
and lr, lr, #0x0f (1) // lr currently holds the saved status register (spsr); keep only its low 4 bits,
// which tell whether the CPU was in user mode or kernel (SVC) mode before the interrupt.
// This value is used as the jump table index.
mov r0, sp (2) // pass the sp value as the first argument to the handler that follows
ldr lr, [pc, lr, lsl #2] (3) // pc reads as the address of this instruction plus 8, i.e. the base address of the jump table; lr is the index
// A very good technique: an address derived from pc is always correct, wherever the code runs
movs pc, lr @ branch to handler in SVC mode
[jump table]
.long __irq_usr @ 0 (USR_26 / USR_32)
.long __irq_invalid @ 1 (FIQ_26 / FIQ_32)
.long __irq_invalid @ 2 (IRQ_26 / IRQ_32)
.long __irq_svc @ 3 (SVC_26 / SVC_32)
The actual jump is done by the last instruction, as everyone can see clearly. Where does it jump to? If the CPU was in SVC mode before the interrupt, the index is 3 and it jumps to __irq_svc. Notice that b (or bl, bx, etc.) is not used directly here, for two reasons:
- First, b encodes an offset relative to the current instruction, and this offset is limited and cannot be too large.
- Second, you cannot be sure that the offset encoded in b is still right after the code is copied, because we are moving the code. The stub is moved but handlers such as __irq_svc stay where they are, so the distance between them changes. If you are not sure that the offset survives the move, use an absolute address instead. The first three instructions above fetch the absolute address from the table, and the last one assigns it to pc to complete the jump directly.
These are some tips. In short, when writing position-independent code you need to pay attention to the jumps. Either jump with b, or assign an absolute address to pc directly (implemented through a jump table). If you cannot guarantee that the offset stays consistent after the move, b on its own is not safe, and you need one of these techniques.
You can try this yourself by compiling a small function to assembly with gcc's -fPIC and -S options. -fPIC is the position-independent code option, which anyone who has built a dynamic library will be familiar with. Take a look at what the compiler generates, and you will find that it is similar in nature.
Scenario 3: The Big Move
This scenario covers the big migration itself, along with the problems Linux runs into during the move and how it solves them. I put the entire migration process into one diagram and then discuss some technical details. It is a huge diagram, and everything in this scenario is in it.
We call the code organization before migration the Code/Load view, because this is how the code is organized in the source (or image). The organization after migration is called the Exec view, which reflects how the code sits in memory when it is executed. The first scenario covered the copy, and those who have forgotten can go back and reread it. The two memcpy calls are also shown in the figure, as the blue and red dotted lines with arrows. This is the process of copying the code from the Code view to the Exec view.
Now there is a problem. The code between __vectors_start and __vectors_end looks a bit weird. Let's take a look at it again:
.equ stubs_offset, __vectors_start + 0x200 - __stubs_start
.globl __vectors_start
__vectors_start:
swi SYS_ERROR0
b vector_und + stubs_offset
ldr pc, .LCvswi + stubs_offset
b vector_pabt + stubs_offset
b vector_dabt + stubs_offset
b vector_addrexcptn + stubs_offset
b vector_irq + stubs_offset
b vector_fiq + stubs_offset
.globl __vectors_end
__vectors_end:
In the second scenario we said that this has to be position-independent code because it is copied to another place. And it is full of jump instructions. Except for the third instruction, which jumps through an absolute address, all the rest jump with b. Take b vector_dabt + stubs_offset as an example (vector_dabt lies between __stubs_start and __stubs_end). A plain b vector_dabt would definitely be a problem: b encodes the distance from the instruction to its target as laid out in the Code view, but the vectors and the stubs are copied separately, so in the Exec view the distance between them is different and the encoded offset is wrong. We therefore have to adjust the offset, and stubs_offset is that adjustment value. It can be calculated from the two layouts. The detailed derivation is shown in the figure, so I won't repeat it here.
In fact, although the instruction ldr pc, .LCvswi + stubs_offset jumps through an absolute address stored in a table, finding the table entry uses the same adjustment. We can see:
.align 5
.LCvswi:
.word vector_swi
The .LCvswi location stores an address, which is where we will jump to. .align 5 means 32-byte alignment, which keeps the data aligned to a cache line. To find this location in the Exec view we need to add the offset. The principle is the same: .LCvswi lies between __stubs_start and __stubs_end, and this area has been moved, so we cannot use the label address directly. vector_swi itself has not been moved, so its address can be used as is.
To sum up, what I have described is a technical detail of Linux, together with the principles and precautions of the code migration process. What matters more is how we can apply the process in reverse: how to design for situations that involve code migration, and how to use these techniques to carry out that design. You can follow these steps:
1. Draw the big picture: determine the Code view and the Exec view according to your requirements, and decide which sections are migrated and where they go.
2. Write the code to be moved, using the position-independent techniques mentioned above, and verify it.
3. Move it with memcpy or similar code.