Android ARM Linux kernel startup process 2


I feel a little heavy-hearted writing this summary, because there is still a lot I don't understand. My knowledge is still shallow and there is a long way to go. The main thread is clear, though, and the details will have to wait until I have time. The second part of the startup process refers to the code the kernel executes after decompression. This code is closely tied to the ARM architecture, so it is best to read the ARM Architecture Reference Manual carefully, especially the sections on the control registers and the MMU.

As mentioned above, after decompression the code jumps to the decompressed vmlinux and starts executing. To see where execution starts, look at the generated vmlinux.lds file (in arch/arm/kernel/):

OUTPUT_ARCH(arm)
ENTRY(stext)
jiffies = jiffies_64;
SECTIONS
{
. = 0x80000000 + 0x00008000;
.text.head : {
_stext = .;
_sinittext = .;
*(.text.h

Clearly, the first section of our vmlinux is .text.head. We cannot rely on ENTRY here, because at this point there is no operating system to parse the entry address for us; we can only go by the section layout (although in general ENTRY gives the same result as the analysis from the section). The .text.head section is easy to find in arch/arm/kernel/head.S, and the first symbol in it is our stext:

    .section ".text.head", "ax"
ENTRY(stext)
    msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE @ ensure svc mode
                        @ and irqs disabled
    mrc p15, 0, r9, c0, c0 @ get processor id
    bl __lookup_processor_type @ r5=procinfo r9=cpuid

The ENTRY macro can be found in include/linux/linkage.h. It simply declares a global symbol. The only difference between the ENDPROC and END macros, which appear further on, is that ENDPROC also marks the symbol as a function (STT_FUNC), that is, something that gets called and returns.

#ifndef ENTRY
#define ENTRY(name) \
    .globl name; \
    ALIGN; \
    name:
#endif
#ifndef WEAK
#define WEAK(name) \
    .weak name; \
    name:
#endif
#ifndef END
#define END(name) \
    .size name, .-name
#endif
/* If symbol 'name' is treated as a subroutine (gets called, and returns)
 * then please use ENDPROC to mark 'name' as STT_FUNC for the benefit of
 * static analysis tools such as stack depth analyzer.
 */
#ifndef ENDPROC
#define ENDPROC(name) \
    .type name, @function; \
    END(name)
#endif

Having found where vmlinux starts, let's analyze it. First, a summary of what this code does. head.S first checks the validity of proc, arch and atag, then creates the initial page table, performs the necessary CPU setup, turns on the MMU, and jumps to the start_kernel symbol to begin executing C code. Several variables here need special attention when porting the kernel; they are discussed one by one below.

First, the register state when this assembly starts running. The register contents are the same as when the bootloader jumped to the decompression code, that is, r1 = arch (the machine type number) and r2 = atag address. Now let's take a closer look at how head.S runs:

    msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE @ ensure svc mode
                        @ and irqs disabled
    mrc p15, 0, r9, c0, c0 @ get processor id

First it enters SVC mode with both IRQ and FIQ disabled, then reads the CPU ID from the ARM system control coprocessor (CP15). The CPU here mainly means the ARM core model, such as ARM9, ARM11, etc.
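
To make the first instruction concrete, here is a small C sketch of the value that ends up in the control field of CPSR. The three constants are the values I believe the kernel's ARM ptrace.h uses; the function name boot_cpsr_c is made up for this illustration.

```c
#include <stdint.h>

/* Values as I understand them from the kernel's ARM ptrace.h. */
#define PSR_F_BIT 0x00000040u /* set: FIQ masked */
#define PSR_I_BIT 0x00000080u /* set: IRQ masked */
#define SVC_MODE  0x00000013u /* mode field value for supervisor mode */

/* Hypothetical helper: the immediate that "msr cpsr_c, #..." writes. */
uint32_t boot_cpsr_c(void)
{
    return PSR_F_BIT | PSR_I_BIT | SVC_MODE;
}
```

Under these assumptions the result is 0xD3: mode bits 0x13 with both interrupt mask bits set.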

Then it jumps to __lookup_processor_type, which is defined in head-common.S. The bl instruction saves the return address in lr, and __lookup_processor_type returns through lr when it finishes. Let's take a closer look at this function:

__lookup_processor_type:
    adr r3, 3f
    ldmda r3, {r5 - r7}
    sub r3, r3, r7 @ get offset between virt&phys
    add r5, r5, r3 @ convert virt addresses to
    add r6, r6, r3 @ physical address space
1: ldmia r5, {r3, r4} @ value, mask
    and r4, r4, r9 @ mask wanted bits
    teq r3, r4
    beq 2f
    add r5, r5, #PROC_INFO_SZ @ sizeof(proc_info_list)
    cmp r5, r6
    blo 1b
    mov r5, #0 @ unknown processor
2: mov pc, lr
ENDPROC(__lookup_processor_type)

The execution here is quite simple. It reads the proc_info_list structures registered between the __proc_info_begin and __proc_info_end symbols. The structure is defined in arch/arm/include/asm/procinfo.h. The concrete instance depends on the architecture of the CPU you are using and can be found in arch/arm/mm/. Here we use proc-v6.S for ARM11. Let's take a look at this structure:

    .section ".proc.info.init", #alloc, #execinstr
    /*
     * Match any ARMv6 processor core.
     */
    .type __v6_proc_info, #object
__v6_proc_info:
    .long 0x0007b000
    .long 0x0007f000
    .long PMD_TYPE_SECT | \
        PMD_SECT_BUFFERABLE | \
        PMD_SECT_CACHEABLE | \
        PMD_SECT_AP_WRITE | \
        PMD_SECT_AP_READ
    .long PMD_TYPE_SECT | \
        PMD_SECT_XN | \
        PMD_SECT_AP_WRITE | \
        PMD_SECT_AP_READ
    b __v6_setup
    .long cpu_arch_name
    .long cpu_elf_name
    .long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP|HWCAP_JAVA
    .long cpu_v6_name
    .long v6_processor_functions
    .long v6wbi_tlb_fns
    .long v6_user_fns
    .long v6_cache_fns
    .size __v6_proc_info, . - __v6_proc_info

The header file tells us the meaning of each member. The lookup first works out the physical address of the proc_info_list entries (the MMU is still off, so the link-time virtual addresses have to be converted) and reads their contents. The CPU ID in r9 is ANDed with the mask, 0x0007f000 here, and the result is compared with the value 0x0007b000. If they are equal, the check succeeds; otherwise the next proc_info entry is read. Because there is usually only one entry built in, the loop generally runs only once. On a match, the physical address of the matching proc_info_list is left in r5; if nothing matches, r5 is set to 0. Either way the function returns through lr.
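
The matching loop is easier to see in C. The following is only a sketch of the idea, not the kernel's code: the struct is a trimmed-down, hypothetical stand-in that keeps just the value and mask words, and the address conversion is left out because ordinary C pointers do not need it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down entry: only the two words the loop reads.
 * The real proc_info_list carries more members after these. */
struct proc_id_entry {
    uint32_t cpu_val;   /* expected value after masking, e.g. 0x0007b000 */
    uint32_t cpu_mask;  /* bits of the CPU ID to compare, e.g. 0x0007f000 */
};

/* Walk [begin, end) and return the first entry whose masked ID matches.
 * NULL plays the role of "r5 = 0, unknown processor". */
const struct proc_id_entry *
lookup_processor_type(const struct proc_id_entry *begin,
                      const struct proc_id_entry *end,
                      uint32_t cpuid)
{
    for (const struct proc_id_entry *p = begin; p < end; p++) {
        if ((cpuid & p->cpu_mask) == p->cpu_val)
            return p;
    }
    return NULL;
}
```

With the single ARMv6 entry above, an ID such as 0x4107b362 (used here purely as a sample value) masks down to 0x0007b000 and matches, while an ID whose masked bits differ falls through to NULL.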

bl __lookup_machine_type @ r5=machinfo
movs r8, r5 @ invalid machine (r5=0)?
beq __error_a @ yes, error 'a'

After checking proc_info_list, the machine type is checked. This function is also implemented in head-common.S. Let's look at its implementation:

__lookup_machine_type:
    adr r3, 3b
    ldmia r3, {r4, r5, r6}
    sub r3, r3, r4 @ get offset between virt&phys
    add r5, r5, r3 @ convert virt addresses to
    add r6, r6, r3 @ physical address space
1: ldr r3, [r5, #MACHINFO_TYPE] @ get machine type
    teq r3, r1 @ matches loader number?
    beq 2f @ found
    add r5, r5, #SIZEOF_MACHINE_DESC @ next machine_desc
    cmp r5, r6
    blo 1b
    mov r5, #0 @ unknown machine
2: mov pc, lr
ENDPROC(__lookup_machine_type)

The process is basically the same as the processor check. The main purpose here is to check the machine (chip or board) type, for example our current MSM7X27FFA, which is also described by a structure, machine_desc. Its header file is arch/arm/include/asm/mach/arch.h. The concrete instance depends on the chip type you choose; here we use Qualcomm's 7x27, implemented in arch/arm/mach-msm/board-msm7x27.c. These structures end up registered between the __arch_info_begin and __arch_info_end symbols; for details, look at vmlinux.lds or System.map. The lookup compares the nr passed by the bootloader (in r1) with the type in each __arch_info entry. If they differ, it moves on to the next machine_desc until a match is found, and the address of that structure is left in r5; if none matches, r5 is set to 0. Generally there will be several machine types here, because different chips may share the same CPU architecture.

After the processor and machine checks, the validity of the atags parameter is checked. The definition of atag is in ./include/asm/setup.h. It is a header structure combined with a union, and sizes are counted in words. The atags list is created by the bootloader and contains information such as the ramdisk and memory layout. It is stored at the address defined in the boot.img header structure. For details, see our later analysis of the bootloader.

__vet_atags:
    tst r2, #0x3 @ aligned?
    bne 1f
    ldr r5, [r2, #0] @ is first tag ATAG_CORE?
    cmp r5, #ATAG_CORE_SIZE
    cmpne r5, #ATAG_CORE_SIZE_EMPTY
    bne 1f
    ldr r5, [r2, #4]
    ldr r6, =ATAG_CORE
    cmp r5, r6
    bne 1f
    mov pc, lr @ atag pointer is ok
1: mov r2, #0
    mov pc, lr
ENDPROC(__vet_atags)
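
The same check can be written out in C as a rough sketch. The constant values are the ones I believe head-common.S uses (ATAG_CORE is 0x54410001, and the two accepted sizes are 5 words and 2 words); the function name and the NULL check are my own additions and do not appear in the assembly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ATAG_CORE            0x54410001u
#define ATAG_CORE_SIZE       5u /* 2 header words + 3 payload words */
#define ATAG_CORE_SIZE_EMPTY 2u /* header only */

struct tag_header {
    uint32_t size; /* tag length in 32-bit words, header included */
    uint32_t tag;  /* tag identifier */
};

/* Hypothetical C rendering of __vet_atags: true if the pointer is usable.
 * The assembly signals failure by setting r2 to 0 instead. */
bool atags_look_valid(const void *atags)
{
    if (atags == NULL)                   /* extra guard, not in the assembly */
        return false;
    if (((uintptr_t)atags & 0x3u) != 0)  /* tst r2, #0x3: word aligned? */
        return false;

    const struct tag_header *first = atags;
    if (first->size != ATAG_CORE_SIZE && first->size != ATAG_CORE_SIZE_EMPTY)
        return false;                    /* first tag has an unexpected size */
    return first->tag == ATAG_CORE;      /* first tag must be ATAG_CORE */
}
```

Note that only the first tag is examined; the rest of the list is parsed much later, in C.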

The atag check mainly verifies that the list starts with ATAG_CORE and that the size is correct. There is not much to analyze, and the code is clean. Now for the next highlight, creating the initial page table. To be honest, I did not fully understand this part; it requires a solid understanding of the ARM virtual memory system (MMU), and I did not have time to work through the spec. From a quick look through the ARMv7 manual I know that the page table built here is an ARM section page table, which maps memory in 1 MB sections, starting with the first 1 MB of RAM. The table is placed between the atag parameters and the kernel, generally at offsets 0x4000 to 0x8000 from the start of RAM. I won't post the code and the process here; see the reference links and other experts' analyses. I will come back to this when I study the ARM MMU in detail. Although the code is not analyzed here, several important addresses deserve a closer look.
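
Even without walking through the assembly, the shape of a section mapping can be sketched in C. This is my own illustration under stated assumptions, not the kernel's code: the flag bit positions are the ones I believe pgtable-hwdef.h defines, the flag combination is the first flags word of the __v6_proc_info entry shown earlier, and the addresses in the usage note come from this article's vmlinux.lds and memory.h. The function names are hypothetical.

```c
#include <stdint.h>

#define SECTION_SHIFT 20u   /* one first-level entry covers 1 MB */
#define PGD_ENTRIES   4096u /* 4096 entries x 4 bytes = 16 KB table */

/* Section descriptor bits, as I understand pgtable-hwdef.h. */
#define PMD_TYPE_SECT       (2u << 0)
#define PMD_SECT_BUFFERABLE (1u << 2)
#define PMD_SECT_CACHEABLE  (1u << 3)
#define PMD_SECT_AP_WRITE   (1u << 10)
#define PMD_SECT_AP_READ    (1u << 11)

/* Index of the first-level entry that translates vaddr. */
uint32_t section_index(uint32_t vaddr)
{
    return vaddr >> SECTION_SHIFT;
}

/* Descriptor: 1 MB-aligned physical base in bits 31:20, flags below. */
uint32_t section_entry(uint32_t paddr, uint32_t mmuflags)
{
    return (paddr & 0xFFF00000u) | mmuflags;
}

/* Map the 1 MB section containing vaddr to the one containing paddr. */
void map_section(uint32_t *pgd, uint32_t vaddr, uint32_t paddr,
                 uint32_t mmuflags)
{
    pgd[section_index(vaddr)] = section_entry(paddr, mmuflags);
}
```

For example, mapping the kernel's link address 0x80008000 to physical 0x00208000 with those flags would write 0x00200C0E into entry 0x800. As far as I can tell, head.S also installs an identity mapping (virtual equals physical) for the section the kernel is running from, so that execution survives the moment the MMU is switched on.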

These addresses are defined in arch/arm/include/asm/memory.h. Let's analyze this header. First, it includes the machine's memory.h; looking at arch/arm/mach-msm/include/mach/memory.h, we find #define PHYS_OFFSET UL(0x00200000). This is the start address of physical memory, and it matches what we defined in boardconfig.h earlier. Then asm/memory.h defines the start of the kernel's virtual address space, #define PAGE_OFFSET UL(CONFIG_PAGE_OFFSET).

In addition, head.S shows that the kernel's physical and virtual addresses both carry an offset. Where does this offset come from? It is in arch/arm/Makefile: textofs-y := 0x00008000 and TEXT_OFFSET := $(textofs-y). Looking at the physical address and the link address when the kernel starts, they match what we defined in boardconfig.h and Makefile.boot earlier.
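
Putting the three constants together gives the addresses that matter during boot. The sketch below is only arithmetic on the values quoted in this article. PAGE_OFFSET is assumed to be 0x80000000 because that is what the vmlinux.lds excerpt shows; your CONFIG_PAGE_OFFSET may differ. The function names are hypothetical, and the 0x4000 distance between the page table and the kernel is my understanding of head.S.

```c
#include <stdint.h>

#define PHYS_OFFSET 0x00200000u /* start of RAM, from mach-msm memory.h */
#define PAGE_OFFSET 0x80000000u /* assumed value of CONFIG_PAGE_OFFSET */
#define TEXT_OFFSET 0x00008000u /* textofs-y from arch/arm/Makefile */

/* Link (virtual) address of the start of the kernel image. */
uint32_t kernel_ram_vaddr(void)
{
    return PAGE_OFFSET + TEXT_OFFSET;
}

/* Physical address the kernel image is loaded at. */
uint32_t kernel_ram_paddr(void)
{
    return PHYS_OFFSET + TEXT_OFFSET;
}

/* Physical address of the 16 KB initial page table, just below the kernel. */
uint32_t initial_pgd_paddr(void)
{
    return kernel_ram_paddr() - 0x4000u;
}

/* Linear-map conversion: the same offset applies to every kernel address. */
uint32_t virt_to_phys_linear(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET + PHYS_OFFSET;
}
```

With these values the kernel links at 0x80008000, loads at 0x00208000, and the initial page table occupies 0x00204000 to 0x00208000, which is the 4000-8000 range mentioned earlier.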

After the initial page table is set up, the first word stored at __switch_data (the link address to jump to once the MMU is on) is loaded into sp (r13), the physical address of __enable_mmu is placed in lr, and execution jumps to INITFUNC in proc_info_list. This offset is defined in arch/arm/kernel/asm-offsets.c; it is the offset of the __cpu_flush member of proc_info_list, so this is where __cpu_flush gets executed.

    ldr r13, __switch_data @ address to jump to after
                        @ mmu has been enabled
    adr lr, __enable_mmu @ return (PIC) address
    add pc, r10, #PROCINFO_INITFUNC
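
To see why adding PROCINFO_INITFUNC to the proc_info_list address lands on the setup routine, here is a small C model of the structure's 32-bit layout. It is a sketch, not the kernel's definition: every member is modelled as one 32-bit word, the struct name proc_info_list32 and the helper initfunc_address are made up, and the member order follows the __v6_proc_info listing shown earlier.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit layout model: one word per member, in the order
 * they appear in __v6_proc_info. */
struct proc_info_list32 {
    uint32_t cpu_val;
    uint32_t cpu_mask;
    uint32_t cpu_mm_mmu_flags; /* section flags for normal memory */
    uint32_t cpu_io_mmu_flags; /* section flags for device memory */
    uint32_t cpu_flush;        /* holds the "b __v6_setup" instruction */
    uint32_t arch_name;
    uint32_t elf_name;
    uint32_t elf_hwcap;
    uint32_t cpu_name;
    uint32_t proc;
    uint32_t tlb;
    uint32_t user;
    uint32_t cache;
};

/* asm-offsets.c generates this kind of constant with offsetof. */
#define PROCINFO_INITFUNC offsetof(struct proc_info_list32, cpu_flush)

/* Address that "add pc, r10, #PROCINFO_INITFUNC" jumps to, given the
 * address of the matching proc_info_list entry. */
uint32_t initfunc_address(uint32_t procinfo_addr)
{
    return procinfo_addr + (uint32_t)PROCINFO_INITFUNC;
}
```

In this model the offset is 16 bytes, so the pc ends up pointing at the fifth word of the entry. That word is not a pointer but a branch instruction, which is why the code adds to pc instead of loading a function pointer. Because lr was set to __enable_mmu beforehand, the setup routine returns straight into the MMU enable code.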

The __cpu_flush here is the __v6_setup function in our proc-v6.S. I will not analyze its implementation; it is all operations on the ARM control registers. Here is its header comment, which tells you basically what it does.

 * __v6_setup
 *
 * Initialise TLB, Caches, and MMU state ready to switch the MMU
 * on. Return in r0 the new CP15 C1 control register setting.
 *
 * We automatically detect if we have a Harvard cache, and use the
 * Harvard cache control instructions instead of the unified cache
 * control instructions.
 *
 * This should be able to cover all ARMv6 cores.
 *
 * It is assumed that:
 * - cache type register is implemented

After this CPU setup is done, the next step is to turn on the MMU. There is not much to say about it; it is again an operation on the ARM control register. Once the MMU is on, virtual addresses can be used directly and we no longer need to relocate addresses ourselves; the ARM hardware does that work. After the MMU is turned on, the value in sp is moved to pc, so the code jumps to the address that was loaded from __switch_data. __switch_data is a table defined in head-common.S, and what we actually jump to is its first entry, the function pointer __mmap_switched.

Here is a brief look at what this switch code does. Copying the data segment (from __data_loc) and clearing the .bss segment need no comment. After that, the processor ID and machine type are saved into variables whose addresses are listed in the __switch_data table, and these are used later by setup_arch, which is called from start_kernel. This will be covered in the detailed analysis of start_kernel. The switch code also performs some operations on the control register; I did not study the spec carefully, so I will not try to explain what I don't understand.
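
The steps of the switch code can be summarized in a C sketch. Everything here is illustrative: the struct boot_state and the function mmap_switched_sketch are hypothetical names, the segment boundaries are passed in as parameters because the real ones come from linker symbols, and saving the atags pointer is my understanding of the code and is not described above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the values the boot code hands to C. */
struct boot_state {
    uint32_t processor_id;      /* CPU ID read from CP15 (r9) */
    uint32_t machine_arch_type; /* machine number from the bootloader (r1) */
    uint32_t atags_pointer;     /* vetted atags address (r2), 0 if invalid */
};

/* Sketch of the work done before branching to start_kernel. */
void mmap_switched_sketch(void *data_start, const void *data_loc,
                          size_t data_len, void *bss_start, size_t bss_len,
                          struct boot_state *out, uint32_t cpuid,
                          uint32_t machine_nr, uint32_t atags)
{
    /* Copy initialised data only if it was loaded somewhere else. */
    if (data_loc != data_start)
        memcpy(data_start, data_loc, data_len);

    /* C code expects uninitialised globals to start out as zero. */
    memset(bss_start, 0, bss_len);

    /* Keep the boot-time register values for later use by C code. */
    out->processor_id = cpuid;
    out->machine_arch_type = machine_nr;
    out->atags_pointer = atags;
}
```

The ordering matters: the saved values live in ordinary kernel variables, so they must be written after .bss has been cleared or they would be wiped out again.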

OK, after the switch is complete, b start_kernel is executed and we enter C code. The next article will study the start_kernel function carefully.
