As mentioned above, once decompression finishes, execution jumps to the decompressed vmlinux. To find out where execution starts, we can look at the generated vmlinux.lds file (in arch/arm/kernel/):
OUTPUT_ARCH(arm)
ENTRY(stext)
jiffies = jiffies_64;
SECTIONS
{
. = 0x80000000 + 0x00008000;
.text.head : {
_stext = .;
_sinittext = .;
*(.text.h
Clearly, the first section of our vmlinux is .text.head. We cannot rely on ENTRY here, because no operating system is running yet at this point and nothing parses the entry address for us. We can only reason from the section layout (although, generally speaking, the result of ENTRY is the same as what we derive from the sections). The .text.head section is easy to find in arch/arm/kernel/head.S, and the first symbol in it is our stext:
.section ".text.head", "ax"
ENTRY(stext)
msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE @ ensure svc mode
@ and irqs disabled
mrc p15, 0, r9, c0, c0 @ get processor id
bl __lookup_processor_type @ r5=procinfo r9=cpuid
The ENTRY macro used here is defined in include/linux/linkage.h. It simply declares a global symbol. As for ENDPROC and END, which appear later, the only difference is that ENDPROC additionally marks the symbol as a function (STT_FUNC), that is, something that gets called and returns.
#ifndef ENTRY
#define ENTRY(name) \
  .globl name; \
  ALIGN; \
  name:
#endif
#ifndef WEAK
#define WEAK(name) \
  .weak name; \
  name:
#endif
#ifndef END
#define END(name) \
  .size name, .-name
#endif
/* If symbol 'name' is treated as a subroutine (gets called, and returns)
 * then please use ENDPROC to mark 'name' as STT_FUNC for the benefit of
 * static analysis tools such as stack depth analyzer.
 */
#ifndef ENDPROC
#define ENDPROC(name) \
  .type name, @function; \
  END(name)
#endif
Having found the entry code of vmlinux, let's analyze it. First, a summary of what this part of the code does: head.S checks the validity of the processor type, the machine (arch) type and the atags, then creates the initial page table, performs the necessary CPU setup, turns on the MMU, and jumps to the start_kernel symbol to begin executing C code. Several variables here need special attention when porting the kernel; they are discussed one by one below.
First, the register state when this assembly starts running. It is the same as when the bootloader jumped to the decompression code: r1 = arch (machine number), r2 = atags address. Now let's walk through head.S:
msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE @ ensure svc mode
@ and irqs disabled
mrc p15, 0, r9, c0, c0 @ get processor id
First, the code enters SVC mode and disables all interrupts (IRQ and FIQ), then reads the CPU ID from the ARM coprocessor (CP15). The CPU here refers to the ARM core model, such as ARM9 or ARM11.
Then the code jumps to __lookup_processor_type, which is defined in head-common.S. The bl instruction saves the return address (the address of the following instruction) in lr, and __lookup_processor_type returns through it at the end. Let's take a closer look at this function:
__lookup_processor_type:
adr r3, 3f
ldmda r3, {r5 - r7}
sub r3, r3, r7 @ get offset between virt&phys
add r5, r5, r3 @ convert virt addresses to
add r6, r6, r3 @ physical address space
1: ldmia r5, {r3, r4} @ value, mask
and r4, r4, r9 @ mask wanted bits
teq r3, r4
beq 2f
add r5, r5, #PROC_INFO_SZ @ sizeof(proc_info_list)
cmp r5, r6
blo 1b
mov r5, #0 @ unknown processor
2: mov pc, lr
ENDPROC(__lookup_processor_type)
The execution here is quite simple. It walks the proc_info_list structures placed between the __proc_info_begin and __proc_info_end symbols. The structure is defined in arch/arm/include/asm/procinfo.h. The concrete instances depend on the CPU architecture you are using and can be found in arch/arm/mm/. Here we use proc-v6.S for ARM11. Let's take a look at this structure:
.section ".proc.info.init", #alloc, #execinstr
/*
* Match any ARMv6 processor core.
*/
.type __v6_proc_info, #object
__v6_proc_info:
.long 0x0007b000
.long 0x0007f000
.long PMD_TYPE_SECT | \
PMD_SECT_BUFFERABLE | \
PMD_SECT_CACHEABLE | \
PMD_SECT_AP_WRITE | \
PMD_SECT_AP_READ
.long PMD_TYPE_SECT | \
PMD_SECT_XN | \
PMD_SECT_AP_WRITE | \
PMD_SECT_AP_READ
b __v6_setup
.long cpu_arch_name
.long cpu_elf_name
.long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP|HWCAP_JAVA
.long cpu_v6_name
.long v6_processor_functions
.long v6wbi_tlb_fns
.long v6_user_fns
.long v6_cache_fns
.size __v6_proc_info, . - __v6_proc_info
The header file tells us the meaning of each member. The lookup first converts the link-time address of the proc_info_list table into its actual physical address and reads an entry. The CPU ID in r9 is ANDed with the mask, which is 0x0007f000 here, and the result is compared with the value 0x0007b000. If they are equal, the check succeeds. If they differ, the next proc_info entry is read. Because there is usually only one entry, there is generally no looping here. On a match, the physical address of the matching proc_info_list is left in r5. If nothing matches, r5 is set to 0. Either way, the function returns through lr.
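The matching loop can be sketched in C roughly as follows. This is only an illustration under simplifying assumptions: `proc_info_sketch` is a hypothetical stand-in that keeps just the first two words of proc_info_list, and the CPU ID values used in `lookup_demo` are sample inputs, not values read from hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for struct proc_info_list:
 * only the two words the lookup loop actually reads. */
struct proc_info_sketch {
    uint32_t cpu_val;   /* expected value after masking */
    uint32_t cpu_mask;  /* which bits of the CPU ID are compared */
};

/* Mirrors the loop in __lookup_processor_type: return the first entry
 * whose (cpuid & mask) equals val, or NULL (the assembly sets r5 = 0). */
const struct proc_info_sketch *
lookup_processor_type(const struct proc_info_sketch *begin,
                      const struct proc_info_sketch *end,
                      uint32_t cpuid)
{
    const struct proc_info_sketch *p;

    for (p = begin; p < end; p++) {
        if ((cpuid & p->cpu_mask) == p->cpu_val)
            return p;
    }
    return NULL;
}

/* Table holding the single ARMv6 entry quoted above. */
const struct proc_info_sketch proc_table[] = {
    { 0x0007b000u, 0x0007f000u },
};

/* Sample inputs: one ID that matches the entry and one that does not. */
void lookup_demo(void)
{
    const struct proc_info_sketch *end = proc_table + 1;

    assert(lookup_processor_type(proc_table, end, 0x4107b362u) == &proc_table[0]);
    assert(lookup_processor_type(proc_table, end, 0x41069265u) == NULL);
}
```

Only bits 12 to 18 of the ID take part in the comparison, so any core whose ID carries 0x7b in that field is accepted by this entry.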
bl __lookup_machine_type @ r5=machinfo
movs r8, r5 @ invalid machine (r5=0)?
beq __error_a @ yes, error 'a'
After checking proc_info_list, we check the machine type. This function is also implemented in head-common.S. Let's look at its implementation:
__lookup_machine_type:
adr r3, 3b
ldmia r3, {r4, r5, r6}
sub r3, r3, r4 @ get offset between virt&phys
add r5, r5, r3 @ convert virt addresses to
add r6, r6, r3 @ physical address space
1: ldr r3, [r5, #MACHINFO_TYPE] @ get machine type
teq r3, r1 @ matches loader number?
beq 2f @ found
add r5, r5, #SIZEOF_MACHINE_DESC @ next machine_desc
cmp r5, r6
blo 1b
mov r5, #0 @ unknown machine
2: mov pc, lr
ENDPROC(__lookup_machine_type)
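Both lookup routines begin by turning link-time (virtual) addresses into physical ones, because the MMU is still off. A rough C sketch of that arithmetic follows. The function names are made up for illustration, and the sample addresses assume the 0x80000000 link base from the linker script and the 0x00200000 RAM base mentioned in this article.

```c
#include <assert.h>
#include <stdint.h>

/* Offset between where a label really sits (what adr yields while the
 * MMU is off) and the address the linker assigned to it (the word stored
 * at that label). Unsigned 32-bit arithmetic wraps, as in the assembly. */
uint32_t phys_virt_delta(uint32_t label_runtime_addr, uint32_t label_link_addr)
{
    return label_runtime_addr - label_link_addr;   /* the "sub r3, r3, ..." step */
}

/* Apply that offset to any other link-time address. */
uint32_t link_to_phys(uint32_t link_addr, uint32_t delta)
{
    return link_addr + delta;                      /* the "add r5, r5, r3" step */
}

/* Sample values only: a label linked at 0x80008100, running at 0x00208100. */
void delta_demo(void)
{
    uint32_t delta = phys_virt_delta(0x00208100u, 0x80008100u);

    assert(delta == 0x80200000u);
    assert(link_to_phys(0x80300000u, delta) == 0x00500000u);
}
```

The delta looks odd as a number, but adding it modulo 2^32 is the same as subtracting 0x7fe00000, which is the distance between the link base and the RAM base in this example.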
The process is basically the same as the processor check. Its main purpose is to check the machine (board) type; ours, for example, is MSM7X27FFA. Each machine is described by a machine_desc structure, declared in arch/arm/include/asm/mach/arch.h. The concrete definition depends on the chip you selected. Here we use Qualcomm's 7x27, which is implemented in arch/arm/mach-msm/board-msm7x27.c. These structures end up between the __arch_info_begin and __arch_info_end symbols; for details, look at vmlinux.lds or System.map. The lookup compares the nr passed by the bootloader (in r1) with the type field of each entry in that range. If an entry does not match, the next machine_desc structure is tried. When a match is found, the address of that structure is left in r5. If none matches, r5 is set to 0. Generally speaking, there are several machine types here, because different chips and boards may share the same CPU architecture.
After the processor and machine checks, the validity of the atags parameter is checked. The definition of atags can be found in ./include/asm/setup.h. Each tag is a header structure followed by a union, and its size field is counted in words. The atags list is created by the bootloader. It carries information such as the ramdisk location and the memory layout, and it is stored at the address defined in the boot.img header structure. For details, see our later analysis of the bootloader.
__vet_atags:
tst r2, #0x3 @ aligned?
bne 1f
ldr r5, [r2, #0] @ is first tag ATAG_CORE?
cmp r5, #ATAG_CORE_SIZE
cmpne r5, #ATAG_CORE_SIZE_EMPTY
bne 1f
ldr r5, [r2, #4]
ldr r6, =ATAG_CORE
cmp r5, r6
bne 1f
mov pc, lr @ atag pointer is ok
1: mov r2, #0
mov pc, lr
ENDPROC(__vet_atags)
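The same checks can be written out in C as a minimal sketch. The constants are the ones defined next to this routine in head-common.S as far as I know, so treat them as an assumption and verify them against your tree. `vet_atags` is a hypothetical function name, and the caller is assumed to pass a readable address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the definitions used by __vet_atags. */
#define ATAG_CORE            0x54410001u
#define ATAG_CORE_SIZE       ((2 * 4 + 3 * 4) >> 2)  /* header + 3 words = 5 */
#define ATAG_CORE_SIZE_EMPTY ((2 * 4) >> 2)          /* header only = 2 */

/* Returns the pointer unchanged if it looks like a valid tag list,
 * or NULL otherwise (the assembly clears r2 in that case). */
const uint32_t *vet_atags(const uint32_t *atags)
{
    if (((uintptr_t)atags & 0x3) != 0)      /* must be word aligned */
        return NULL;
    if (atags[0] != ATAG_CORE_SIZE && atags[0] != ATAG_CORE_SIZE_EMPTY)
        return NULL;                        /* first tag has the wrong size */
    if (atags[1] != ATAG_CORE)              /* first tag must be ATAG_CORE */
        return NULL;
    return atags;
}

/* Sample inputs: a well-formed core tag and a list starting with another tag id. */
void vet_atags_demo(void)
{
    static const uint32_t core[5]  = { ATAG_CORE_SIZE, ATAG_CORE, 0, 0, 0 };
    static const uint32_t wrong[2] = { ATAG_CORE_SIZE_EMPTY, 0x54410009u };

    assert(vet_atags(core) == core);
    assert(vet_atags(wrong) == NULL);
}
```

Note that a failed check is not fatal at this point: the pointer is simply replaced by 0 and booting continues.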
The atags check mainly verifies that the list starts with ATAG_CORE and that the size is correct. There is basically nothing more to analyze, and the code is easy to read.
Next comes the highlight: creating the initial page table. To be honest, I did not fully understand this part. It requires considerable knowledge of the ARM MMU, and I have not had time to study the spec. I only skimmed the ARMv7 manual and learned that the page table built here is an ARM section page table, which maps memory in 1 MB sections. The table sits between the atags parameters and the kernel, generally at offsets 0x4000-0x8000 from the start of RAM. I won't post the code and the detailed process here. You can look at the reference links and the analyses by other experts. I will come back to study it carefully when I look at the ARM MMU in detail. Although the code is not analyzed, several important addresses deserve a closer look.
These addresses are defined in arch/arm/include/asm/memory.h. Let's walk through this header file. First, it includes arch/memory.h. For our platform that is arch/arm/mach-msm/include/mach/memory.h, which contains #define PHYS_OFFSET UL(0x00200000). This is the start address of physical RAM, and it is consistent with what we defined in boardconfig.h before. Back in asm/memory.h, the first virtual address of kernel memory is defined as #define PAGE_OFFSET UL(CONFIG_PAGE_OFFSET).
In addition, head.S shows that both the physical and the virtual address of the kernel carry an offset. Where does this offset come from? It is set in arch/arm/Makefile: textofs-y := 0x00008000 and TEXT_OFFSET := $(textofs-y). So the physical address and the link address of the kernel at startup are consistent with what we defined in boardconfig.h and Makefile.boot before.
After the initial page table is set up, the first word stored at __switch_data (the link address of __mmap_switched) is loaded into sp, the physical address of __enable_mmu is placed in lr, and execution jumps to the INITFUNC entry of the matching proc_info_list. The PROCINFO_INITFUNC offset is defined in arch/arm/kernel/asm-offsets.c; it is the offset of the __cpu_flush member of proc_info_list, so this jump executes __cpu_flush.
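To make the address arithmetic concrete, here is a small C sketch using the values quoted in this article. PAGE_OFFSET is inferred from the 0x80000000 base in the linker script shown at the top. PG_DIR_SIZE and the two helper functions are names chosen for this sketch. The 16 KB table size follows from the ARM first-level table format (4096 entries of 4 bytes each). Other boards will use different numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in this article; other boards will differ. */
#define PAGE_OFFSET  0x80000000u  /* first virtual address of kernel memory */
#define PHYS_OFFSET  0x00200000u  /* first physical address of RAM */
#define TEXT_OFFSET  0x00008000u  /* textofs-y in arch/arm/Makefile */

/* Where the kernel image is linked and where it is loaded. */
#define KERNEL_RAM_VADDR (PAGE_OFFSET + TEXT_OFFSET)
#define KERNEL_RAM_PADDR (PHYS_OFFSET + TEXT_OFFSET)

/* The initial page table occupies the 16 KB just below the kernel image. */
#define PG_DIR_SIZE  0x4000u

uint32_t initial_pgd_phys(void)
{
    return KERNEL_RAM_PADDR - PG_DIR_SIZE;
}

/* A first-level section entry covers 1 MB, so the table index of an
 * address is simply its top 12 bits. */
uint32_t section_index(uint32_t addr)
{
    return addr >> 20;
}

void address_demo(void)
{
    assert(KERNEL_RAM_VADDR == 0x80008000u);   /* matches ". = 0x80000000 + 0x00008000" */
    assert(KERNEL_RAM_PADDR == 0x00208000u);
    assert(initial_pgd_phys() == 0x00204000u); /* the 0x4000-0x8000 window */
    assert(section_index(KERNEL_RAM_VADDR) == 0x800u);
}
```

With these numbers the page table occupies 0x00204000 up to 0x00208000, and the kernel's first virtual megabyte uses entry 0x800 of that table.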
ldr r13, __switch_data @ address to jump to after
@ mmu has been enabled
adr lr, __enable_mmu @ return (PIC) address
add pc, r10, #PROCINFO_INITFUNC
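PROCINFO_INITFUNC is just a structure offset. The sketch below shows how such an offset comes about, assuming a 32-bit layout that follows the proc-v6.S listing earlier (four words, then the branch instruction). `proc_info_layout` is a hypothetical stand-in that models only the first five words; it is not the kernel's definition, and the sample address in the demo is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the start of a proc_info_list entry. */
struct proc_info_layout {
    uint32_t cpu_val;        /* .long 0x0007b000 */
    uint32_t cpu_mask;       /* .long 0x0007f000 */
    uint32_t mm_mmu_flags;   /* first PMD_TYPE_SECT | ... word */
    uint32_t io_mmu_flags;   /* second PMD_TYPE_SECT | ... word */
    uint32_t cpu_flush;      /* holds the instruction "b __v6_setup" */
};

#define INITFUNC_OFFSET_SKETCH offsetof(struct proc_info_layout, cpu_flush)

/* What "add pc, r10, #PROCINFO_INITFUNC" computes: the address of the
 * branch instruction stored inside the matched entry. */
uint32_t initfunc_address(uint32_t procinfo_phys)
{
    return procinfo_phys + (uint32_t)INITFUNC_OFFSET_SKETCH;
}

void initfunc_demo(void)
{
    assert(INITFUNC_OFFSET_SKETCH == 16);
    assert(initfunc_address(0x00230000u) == 0x00230010u);
}
```

Because the fifth word is itself a branch instruction, jumping to it lands in __v6_setup, which later returns to __enable_mmu through lr.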
The __cpu_flush entry here is the __v6_setup function in our proc-v6.S. I will not analyze its implementation; it consists entirely of operations on the ARM control registers. Below is its header comment, which tells you what it does.
/*
* __v6_setup
*
* Initialise TLB, Caches, and MMU state ready to switch the MMU
* on. Return in r0 the new CP15 C1 control register setting.
*
* We automatically detect if we have a Harvard cache, and use the
* Harvard cache control instructions insead of the unified cache
* control instructions.
*
* This should be able to cover all ARMv6 cores.
*
* It is assumed that:
* - cache type register is implemented
*/
After this CPU setup, the next step is to turn on the MMU. There is not much to say here either; it is again an operation on the ARM control register. Once the MMU is on, we can use virtual addresses directly and no longer need to relocate addresses ourselves, because the ARM hardware does the translation. After the MMU is enabled, the value in sp is moved into pc. __switch_data is a table defined in head-common.S, and sp holds its first word, the function pointer __mmap_switched, so that is where execution continues.
Let's only briefly go through what this switch code does. Copying the data segment from __data_loc and clearing the .bss segment need no explanation. After that, the processor and machine information are saved through the pointers listed in __switch_data, to be used later by setup_arch in start_kernel. This will be covered in the detailed analysis of start_kernel. The switch code also touches the control register. I have not studied the spec carefully, so I will not try to explain what I do not understand.
Once the switch code finishes, b start_kernel is executed and we enter the C code. The next article will study the start_kernel function carefully.
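The steps just described can be sketched in C. This is a simplified model, not the kernel's code: `switch_sketch` and its field names are invented for illustration, and the control register handling is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bundle of what the switch code works on. In the real code
 * these are addresses listed in __switch_data and values held in registers. */
struct switch_sketch {
    const uint32_t *data_loc;   /* where .data was loaded */
    uint32_t *data_start;       /* where .data should live */
    uint32_t *bss_start;        /* end of .data, start of .bss */
    uint32_t *bss_end;
    uint32_t *processor_id;     /* destination for the CPU ID */
    uint32_t *machine_type;     /* destination for the machine number */
};

void mmap_switched_sketch(const struct switch_sketch *s,
                          uint32_t cpuid, uint32_t machine_nr)
{
    uint32_t *p;
    size_t data_words = (size_t)(s->bss_start - s->data_start);

    /* Copy .data only if it was loaded somewhere other than its home. */
    if (s->data_loc != s->data_start)
        memmove(s->data_start, s->data_loc, data_words * sizeof(uint32_t));

    /* Clear .bss. */
    for (p = s->bss_start; p < s->bss_end; p++)
        *p = 0;

    /* Save what the earlier lookups found, for later use by C code. */
    *s->processor_id = cpuid;
    *s->machine_type = machine_nr;
    /* The assembly would now branch to start_kernel. */
}

/* Sample run on a 4-word "image": 2 words of .data, 2 words of .bss. */
void switch_demo(void)
{
    uint32_t src[2] = { 11, 22 };
    uint32_t image[4] = { 0, 0, 7, 7 };
    uint32_t id = 0, mach = 0;
    struct switch_sketch s = { src, image, image + 2, image + 4, &id, &mach };

    mmap_switched_sketch(&s, 0x4107b362u, 1234u);
    assert(image[0] == 11 && image[1] == 22);
    assert(image[2] == 0 && image[3] == 0);
    assert(id == 0x4107b362u && mach == 1234u);
}
```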
Ref:
http://linux.chinaunix.net/bbs/thread-1021226-1-1.html
http://blog.csdn.net/yhmhappy2006/archive/2008/08/06/2775239.aspx
http://blog.csdn.net/sustzombie/archive/2010/06/12/5667607.aspx