The following analysis is based on Linux kernel 2.6.38.3 (downloaded from the official TrimSlice website).
This article analyzes how ARM Linux processes an interrupt, taking an IRQ that arrives in usr mode as the example, that is, usr -> irq.
1. Initialization of the kernel exception vector table
1.1 Initialization process
When the ARM Linux kernel starts, the first thing that runs is
Figure 1. Vector table migration and offset calculation diagram
Explanation of Figure 1: the two horizontal arrows indicate the direction of address growth. The lower one is the Code/Load view, which shows how the code is laid out in the generated kernel binary before it is moved. The upper one is the Exec view, which shows where the code sits in memory once it is executing.
2. Linux's processing flow for ARM exceptions and interrupts
2.1 Operations completed by the hardware when an IRQ occurs
R14_irq = address of next instruction to be executed + 4 /*Set register lr_mode to the return address*/
SPSR_irq = CPSR /*Save the current state of the processor, interrupt mask bit and each condition flag bit*/
CPSR[4:0] = 0b10010 /*Set the corresponding bit in the current program status register CPSR to enter IRQ mode*/
CPSR[5] = 0 /*Execute in ARM state*/
/*CPSR[6] unchanged*/
CPSR[7] = 1 /*Disable normal interrupts*/
If high vectors configured then
PC = 0xFFFF0018 /* Set the program counter (PC) to the IRQ entry of the exception vector table,
* i.e. jump to the corresponding exception handler. On ARMv7 Linux systems the IRQ vector is generally at 0xFFFF0018.
*/
else
PC=0x00000018
2.2 Instruction flow jump process
Once the CPU has completed the steps above, the PC holds 0xFFFF0018, which is the address of the instruction W(b) vector_irq + stubs_offset. That branch leads to the code generated by vector_stub irq, IRQ_MODE, 4, which dispatches to the corresponding interrupt handling function.
Next, let’s look at the code in detail:
    .globl  __vectors_start             // Exception vector table starts at 0xFFFF0000
__vectors_start:
 ARM(   swi     SYS_ERROR0      )
 THUMB( svc     #0              )
 THUMB( nop                     )
    W(b)    vector_und + stubs_offset
    W(ldr)  pc, .LCvswi + stubs_offset
    W(b)    vector_pabt + stubs_offset
    W(b)    vector_dabt + stubs_offset
    W(b)    vector_addrexcptn + stubs_offset
    W(b)    vector_irq + stubs_offset   // jump address after an interrupt occurs: 0xFFFF0018
    W(b)    vector_fiq + stubs_offset

    .globl  __vectors_end
__vectors_end:
stubs_offset is just an offset used to correct the jump address. The main work is done by vector_irq, which is generated by the macro vector_stub irq, IRQ_MODE, 4 (IRQ_MODE is defined as 0x12 in arch/arm/include/asm/ptrace.h). The following is the code generated for vector_irq (in the assembly listings, text starting with @ or // is a comment):
    .align  5
vector_irq:
    sub     lr, lr, #4          // On exception entry the CPU stores (next pc + 4) in lr; correct it here.
    @
    @ Save r0, lr_<exception> (parent PC) and spsr_<exception>
    @ (parent CPSR)
    @
    stmia   sp, {r0, lr}        // Save r0 and lr on the irq stack (each exception mode has its own stack)
    mrs     lr, spsr            // lr = spsr_irq, i.e. the cpsr value from usr mode (see 2.1)
    str     lr, [sp, #8]        // Save spsr at [sp + 8]
    @
    @ Prepare for SVC32 mode.  IRQs remain disabled.
    @
    mrs     r0, cpsr
    eor     r0, r0, #(IRQ_MODE ^ SVC_MODE | PSR_ISETSTATE)  // PSR_ISETSTATE: selects the ARM/Thumb instruction set
    msr     spsr_cxsf, r0       // cxsf names the four 8-bit fields of the PSR, from low to high
XOR is commutative and associative, so A^B^C is equivalent to A^C^B. For an ARM-state kernel PSR_ISETSTATE is 0, so r0^(IRQ_MODE ^ SVC_MODE|PSR_ISETSTATE) is equivalent to r0^IRQ_MODE^SVC_MODE. Since the lower 5 mode bits of r0 currently equal IRQ_MODE, r0^IRQ_MODE clears those 5 bits; XORing the result with SVC_MODE then sets them to SVC_MODE. All other bits remain unchanged.
    @
    @ the branch table must immediately follow this code
    @
    and     lr, lr, #0x0f       // Extract the processor mode before the exception occurred (usr mode here)
    mov     r0, sp
    ldr     lr, [pc, lr, lsl #2]
    movs    pc, lr
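At this point the CPU is still in IRQ mode, so sp is the IRQ-mode stack pointer, which points at the three words (r0, lr, spsr) just saved. It is copied to r0 so that the handler entered in SVC mode can read those saved values back when it builds the pt_regs frame on the SVC stack.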
pc reads as the current instruction address + 8, which is the base address of the jump table that follows this code. lr is the index into the table; shifting lr left by two bits multiplies it by 4, because each entry is 4 bytes. If irq mode was entered from usr mode, the entry at pc + 4*0 is loaded into lr. If it was entered from svc mode, the entry at pc + 4*3 is loaded, which is the address of __irq_svc. All other entries lead to the __irq_invalid error handler, because an irq is not expected to arrive from the other modes.
Assuming that irq is entered from usr, the first entry in the jump table is used. The base of the lookup is the current pc. ARMv4 used a three-stage pipeline, so pc always points two instructions ahead of the instruction being executed. Later cores extended the pipeline to 5 and 8 stages, but this behaviour has been kept for compatibility, that is, pc(execute) = pc(fetch) + 8, where pc(fetch) is the address from which the currently executing instruction was fetched, and pc(execute) is the value of pc seen when that instruction uses pc in a calculation.
When the destination register of a mov instruction is pc and the instruction carries the S suffix, the value of spsr is restored into cpsr. As mentioned above, spsr now holds the value of r0, whose mode field is svc. This instruction therefore jumps to __irq_usr and switches the processor to svc mode at the same time. One reason the kernel handles exceptions in svc mode is that exception handling must run at the PL1 privilege level; another is to allow nested interrupts. The specific reason is explained in Question 4. __irq_svc and __irq_invalid are not discussed for the time being.
/*
 * Interrupt dispatcher. The jump table below must be immediately after vector_irq
 */
    vector_stub irq, IRQ_MODE, 4        // generates vector_irq

    /* Enter the interrupt processing function from user mode */
    .long   __irq_usr           @  0  (USR_26 / USR_32)
    .long   __irq_invalid       @  1  (FIQ_26 / FIQ_32)
    .long   __irq_invalid       @  2  (IRQ_26 / IRQ_32)
    /* Enter the interrupt processing function from SVC */
    .long   __irq_svc           @  3  (SVC_26 / SVC_32)
    .long   __irq_invalid       @  4
    .long   __irq_invalid       @  5
    .long   __irq_invalid       @  6
Figure 2 IRQ interrupt processing jump diagram
Note that all of the following operations run in svc mode. Because the interrupt service routine (ISR) runs in SVC mode, the registers of the interrupted context must first be saved on the SVC stack, and then the interrupt service routine irq_handler is called.
2.2.1 __irq_usr
    .align  5
__irq_usr:
    usr_entry                   // Sets up the stack frame for an interrupt taken in user mode and
                                // saves the interrupted context's registers on the SVC stack.
    kuser_cmpxchg_check         // On older ARM cores user space cannot do an atomic compare-and-exchange
                                // by itself. If the interrupt hit in the middle of the cmpxchg helper,
                                // special handling is needed. Skipped here.
    get_thread_info tsk         // Clear the low 13 bits of the current sp to obtain the thread_info of the current task
#ifdef CONFIG_PREEMPT           // If preemption is configured, increment the task's preemption count
    ldr     r8, [tsk, #TI_PREEMPT]  // TI_PREEMPT is defined as offsetof(struct thread_info, preempt_count),
                                    // so tsk gives direct access to the preempt_count member
    add     r7, r8, #1          @ increment it
    str     r7, [tsk, #TI_PREEMPT]
#endif
    irq_handler                 // interrupt service routine, analyzed later
#ifdef CONFIG_PREEMPT
    ldr     r0, [tsk, #TI_PREEMPT]  // Get the current preemption count
    str     r8, [tsk, #TI_PREEMPT]  // and write the value in r8 back, which undoes the increment made above
    teq     r0, r7              // r7 is the count before calling irq_handler, r0 the count after it. The comparison
                                // catches a driver ISR that did not balance its preemption-count operations.
 ARM(   strne   r0, [r0, -r0]   )   // If the count was corrupted, force a write to address 0
 THUMB( movne   r0, #0          )
 THUMB( strne   r0, [r0]        )
#endif
    mov     why, #0
    b       ret_to_user         // Return to user mode
 UNWIND(.fnend          )
ENDPROC(__irq_usr)
Next, let's look at what each of these macros and functions does.
arch/arm/include/asm/ptrace.h
struct pt_regs {
    unsigned long uregs[18];
};
#endif /* __KERNEL__ */
#define ARM_cpsr    uregs[16]
#define ARM_pc      uregs[15]
#define ARM_lr      uregs[14]
#define ARM_sp      uregs[13]
#define ARM_ip      uregs[12]
#define ARM_fp      uregs[11]
#define ARM_r10     uregs[10]
#define ARM_r9      uregs[9]
...
#define ARM_ORIG_r0 uregs[17]
This is the definition of the pt_regs structure, which is used later.
.macro
Figure 3 Saving the interrupt stack
2.2.3 get_thread_info
The get_thread_info macro shifts the current sp value right and then left by 13 bits, using lsr and lsl, which rounds sp down to an 8 KB boundary. That boundary is the address where thread_info is located.
linux/arch/arm/kernel/entry-armv.S
/*
 * Interrupt handling.  Preserves r7, r8, r9
 */
    .macro  irq_handler
#ifdef CONFIG_MULTI_IRQ_HANDLER
    ldr     r5, =handle_arch_irq
    mov     r0, sp
    ldr     r5, [r5]
    adr     lr, BSYM(9997f)
    teq     r5, #0
    movne   pc, r5
#endif
    arch_irq_handler_default
9997:
    .endm
2.2.5 arch_irq_handler_default
irq_handler is the real entry point of IRQ processing. During interrupt processing the r7, r8 and r9 registers must be preserved, because they are used to handle kernel preemption. When MULTI_IRQ_HANDLER is not configured, the logic of irq_handler is very simple: it just invokes arch_irq_handler_default. If the option is configured, platform code can set the global variable handle_arch_irq to install its own handler. Only the default implementation is discussed here.
2.2.6 get_irqnr_preamble
get_irqnr_preamble obtains the base address of the interrupt status register, which is specific to the CPU. The CPU used here is Tegra, and its definition is as follows
2.2.7 get_irqnr_and_base
get_irqnr_and_base obtains the interrupt number.
The two macros get_irqnr_preamble and get_irqnr_and_base are defined by machine-level code. Their purpose is to read the IRQ number from the interrupt controller, after which asm_do_IRQ is called. From this function onward, interrupt handling runs in C code; the parameters passed in are the IRQ number and a pointer to the saved register structure.
2.2.8 asm_do_IRQ
Figure 4 asm_do_IRQ flow chart
asm_do_IRQ is the core ARM function for handling hardware interrupts. The first parameter is the number of the hardware interrupt. The second parameter is a structure holding the register backup, that is, the register values of the mode that was interrupted, which are used when the interrupt returns.
2.2.9 irq_enter
irq_enter updates some system statistics and, in the __irq_enter macro, disables process preemption. When an IRQ is raised, ARM automatically sets the I bit in the CPSR to block new IRQ requests, and IRQs are not re-enabled until interrupt handling reaches the corresponding flow-control layer and calls local_irq_enable(). So why is it still necessary to disable preemption? Because interrupts can nest. Once the flow-control layer or a driver re-enables IRQs with local_irq_enable while the current interrupt is still being processed, a new irq request may arrive and the code enters irq_enter again. When this nested interrupt returns, the kernel does not want to reschedule yet; it waits until the outermost interrupt has been processed before scheduling. That is why preemption is disabled.
2.2.10 generic_handle_irq
2.2.11 ret_to_user
After the above processing is complete, control returns to user space.
3. Questions and Answers
Question 1: vector_irq is already the entry point for interrupt processing, so why add stubs_offset (b vector_irq + stubs_offset)?
Answer: (1) Early in kernel startup (the head.S file), the base address of the exception vector table (for example, 0xffff0000) is selected by setting the c1 register of CP15. The exception vector table in the kernel image therefore has to be copied to 0xffff0000. Only then can the kernel handle an exception correctly when one occurs. (2) As the code above shows, both the vector table and the stubs (the handler entry code) are moved. A plain b vector_irq would not reach the moved vector_irq, because the offset encoded in the instruction was computed from the original positions; the encoded offset must be the one that is correct after the move. Figure 1 shows why the address to use after the move is vector_irq + stubs_offset. The following figure is a schematic diagram of the move, which explains the process more clearly.
Question 2: Why does the exception vector table jump with the b instruction instead of an absolute ldr jump?
Answer: Because a b instruction is more efficient than an absolute jump (ldr pc, XXXX). To make use of that efficiency, the code between __stubs_start and __stubs_end is copied to the region starting at 0xffff0200. Note: the b instruction can only jump within +/-32 MB, so the stubs must be copied close to 0xffff0000. The b instruction jumps relative to the current PC: when the assembler sees a b instruction, it converts the target label into an offset relative to the current PC and writes that offset into the instruction encoding.
After U-Boot has run, control jumps into the kernel at linux/arch/arm/kernel/head.S, where execution starts.
Question 3: Why does execution start in head.S?
A: The Makefile in the top-level directory of the kernel source defines the rule that generates vmlinux:
# vmlinux image - including updated kernel symbols
vmlinux: $(vmlinux-lds) $(vmlinux-init) $(vmlinux-main) vmlinux.o $(kallsyms.o) FORCE
$(vmlinux-lds) is the linker script. For the ARM platform it is the arch/arm/kernel/vmlinux.lds file. vmlinux-init is also defined in the top-level Makefile:
vmlinux-init := $(head-y) $(init-y)
head-y is defined in arch/arm/Makefile:
head-y := arch/arm/kernel/head$(MMUEXT).o arch/arm/kernel/init_task.o
…
ifeq ($(CONFIG_MMU),)
MMUEXT := -nommu
endif
For processors with an MMU, MMUEXT is an empty string, so arch/arm/kernel/head.o is the first file to be linked, and this file is built from arch/arm/kernel/head.S. From this analysis we can conclude that the entry point of the uncompressed ARM Linux kernel is in arch/arm/kernel/head.S.
Question 4: Why must interrupt handling enter svc mode?
One of the most important reasons is this: suppose interrupts are re-enabled while the CPU is still in an exception mode (for example in irq mode, entered from usr), and the interrupt routine calls a subroutine with the BL instruction. BL automatically saves the subroutine return address in the current mode's lr (that is, r14_irq). That value is then destroyed by the next interrupt taken in the same mode, because on interrupt entry the CPU saves the return address of the interrupted code to r14_irq, overwriting the subroutine return address that was just saved. To avoid this, the interrupt routine switches to SVC or System mode, so that BL uses r14_svc to hold the subroutine return address.
Question 5: Why do some entries in the vector table jump with the b instruction while others use ldr pc, xxxx?
    W(b)    vector_und + stubs_offset
    W(ldr)  pc, .LCvswi + stubs_offset
    W(b)    vector_pabt + stubs_offset
    W(b)    vector_dabt + stubs_offset
    W(b)    vector_addrexcptn + stubs_offset
    W(b)    vector_irq + stubs_offset
    W(b)    vector_fiq + stubs_offset
.LCvswi:
    .word   vector_swi
The system call exception code is assembled in another file, so its entry address is far away from the exception vectors and cannot be reached with the b instruction (b can only jump within 32 MB of the current PC). Its address is therefore stored at .LCvswi, and the entry address is loaded into pc from that memory location. Otherwise the principle is the same as for the other entries. This is also why system call entry is slightly slower.
Question 6: Why can ARM handle interrupts?
Because the ARM architecture CPU has a built-in mechanism: whenever an interrupt occurs, the CPU automatically jumps to a specific address (an address in the interrupt vector table) according to the interrupt type. The following table shows the interrupt vector table.
ARM interrupt vector table and address
Question 7: What is High vector?
A: In Linux 3.1.0, arch/arm/include/asm/system.h (line 121) contains the following definition:
#if __LINUX_ARM_ARCH__ >= 4
#define vectors_high()  (cr_alignment & CR_V)
#else
#define vectors_high()  (0)
#endif
This means that if the ARM architecture version in use is 4 or higher, vectors_high() is defined as (cr_alignment & CR_V), which tests the V bit of the control register; otherwise it is defined as 0.
References
1. http://blog.chinaunix.net/uid-361890-id-175347.html
2. "LINUX3.0 Kernel Source Code Analysis" Chapter 2: Interrupts and Exceptions http://blog.chinaunix.net/uid-25845340-id-2982887.html
3. Kernel Memory Layout on ARM Linux http://www.arm.linux.org.uk/developer/memory.txt
4. http://emblinux.sinaapp.com/ar01s16.html#id3603818
5. Linux interrupt subsystem 2: Arch-related hardware encapsulation layer http://blog.csdn.net/droidphone/article/details/7467436
Appendix 1 Kernel Memory Layout on ARM Linux
Start           End             Use
--------------------------------------------------------------------------
ffff8000        ffffffff        copy_user_page / clear_user_page use.
                                For SA11xx and Xscale, this is used to
                                setup a minicache mapping.
ffff1000        ffff7fff        Reserved.
                                Platforms must not use this address range.

ffff0000        ffff0fff        CPU vector page.
                                The CPU vectors are mapped here if the
                                CPU supports vector relocation (control
                                register V bit.)

ffc00000        fffeffff        DMA memory mapping region. Memory returned
                                by the dma_alloc_xxx functions will be
                                dynamically mapped here.

ff000000        ffbfffff        Reserved for future expansion of DMA
                                mapping region.

VMALLOC_END     feffffff        Free for platform use, recommended.
                                VMALLOC_END must be aligned to a 2MB
                                boundary.

VMALLOC_START   VMALLOC_END-1   vmalloc() / ioremap() space.
                                Memory returned by vmalloc/ioremap will
                                be dynamically placed in this region.
                                VMALLOC_START may be based upon the value
                                of the high_memory variable.

PAGE_OFFSET     high_memory-1   Kernel direct-mapped RAM region.
                                This maps the platforms RAM, and typically
                                maps all platform RAM in a 1:1 relationship.

TASK_SIZE       PAGE_OFFSET-1   Kernel module space
                                Kernel modules inserted via insmod are
                                placed here using dynamic mappings.

00001000        TASK_SIZE-1     User space mappings
                                Per-thread mappings are placed here via
                                the mmap() system call.

00000000        00000fff        CPU vector page / null pointer trap
                                CPUs which do not support vector remapping
                                place their vector page here. NULL pointer
                                dereferences by both the kernel and user
                                space are also caught via this mapping.

Please note that mappings which collide with the above areas may result in a non-bootable kernel, or may cause the kernel to (eventually) panic at run time.

Since future CPUs may impact the kernel mapping layout, user programs must not access any memory which is not mapped inside their 0x0001000 to TASK_SIZE address range. If they wish to access these areas, they must set up their own mappings using open() and mmap().