Code relocation
We now solve the code relocation problem introduced in code relocation experiment (I).
For the S3C2440:
When the BIN file is smaller than 4KB:
If the board boots in Nand mode, nothing needs to be relocated.
If it boots in Nor mode, only the .data segment needs to be relocated.
When the BIN file is larger than 4KB:
If it boots in Nand mode, the entire program needs to be relocated, including the code segment and the data segment.
If it boots in Nor mode, only the .data segment needs to be relocated.
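The rules above can be summarized as a small decision function. This is only a sketch for reasoning on a host machine, not code from the original project: the names boot_mode, reloc_plan and choose_reloc_plan are hypothetical, and treating a BIN file of exactly 4KB as fitting is an assumption, since the text only distinguishes smaller and larger than 4KB.

```c
#include <stddef.h>

/* Boot source selected on the board (hypothetical names). */
enum boot_mode { BOOT_NAND, BOOT_NOR };

/* What the startup code has to do before normal C code can run. */
enum reloc_plan {
    RELOC_NONE,       /* run in place, nothing to copy          */
    RELOC_DATA_ONLY,  /* copy .data to SDRAM and clear .bss     */
    RELOC_ALL         /* copy code and data to SDRAM            */
};

#define SRAM_SIZE 4096u  /* on-chip SRAM size: 4KB */

enum reloc_plan choose_reloc_plan(enum boot_mode mode, size_t bin_size)
{
    if (mode == BOOT_NOR) {
        /* NOR Flash can be read and executed in place but cannot be
         * written like RAM, so only the writable data has to move. */
        return RELOC_DATA_ONLY;
    }
    /* NAND boot: only the first 4KB are copied into on-chip SRAM.
     * Assumption: an image of exactly 4KB still fits. */
    return (bin_size <= SRAM_SIZE) ? RELOC_NONE : RELOC_ALL;
}
```

For example, choose_reloc_plan(BOOT_NAND, 8192) yields RELOC_ALL, while any NOR boot yields RELOC_DATA_ONLY regardless of size.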
Only relocate the .data segment and clear the .bss segment
Normally, the code that relocates the .data segment should be written in assembly language. For simplicity, I wrote it in C. Strictly speaking, C functions should not be called before the data segment has been relocated and the BSS segment has been cleared. However, I made sure that these two functions do not access any global variables, so as long as the stack pointer is set correctly, the calls work normally.
The linker script relocate.lds used is as follows:
SECTIONS {
.text 0 : {*(.text)}
.rodata : {*(.rodata)}
_data_offset = .;
.data 0x30000000 : AT(_data_offset) {
_data_LMA = LOADADDR(.data);
_data_start = .;
*(.data)
_data_end = .;
}
_bss_start = .;
.bss : { *(.bss) }
_bss_end = .;
}
relocate.c
extern unsigned char _data_offset;
extern unsigned char _data_start;
extern unsigned char _data_end;
extern unsigned char _bss_start;
extern unsigned char _bss_end;
void copyDataSection(void){
volatile unsigned char *dataLMA = &_data_offset;
volatile unsigned char *start = &_data_start;
volatile unsigned char * end = &_data_end;
while(start < end){ /* _data_end is one past the last byte of .data */
*start = *dataLMA;
start++;
dataLMA++;
}
}
void clearBSS(void){
volatile unsigned char* start = &_bss_start;
volatile unsigned char* end = &_bss_end;
while(start < end){
*start = 0;
start++;
}
}
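The two routines can be tried out on a host machine before burning anything. The following is a minimal sketch, assuming ordinary arrays stand in for Nor Flash and SDRAM; copy_section, clear_section and simulate_startup are hypothetical names, and the section boundaries are passed as parameters instead of coming from the linker script. Each range is treated as half-open, so end points one past the last byte.

```c
#include <string.h>

/* Copy a section from its load address to [start, end), byte by byte. */
void copy_section(const unsigned char *lma, unsigned char *start,
                  const unsigned char *end)
{
    while (start < end) {
        *start++ = *lma++;
    }
}

/* Zero the range [start, end). */
void clear_section(unsigned char *start, const unsigned char *end)
{
    while (start < end) {
        *start++ = 0;
    }
}

/* Returns 1 if the simulated RAM looks right after "startup". */
int simulate_startup(void)
{
    /* Stands in for the initial values of .data stored in flash. */
    static const unsigned char flash_data[3] = { 'A', 'B', 'C' };
    unsigned char ram[8];

    /* RAM holds arbitrary values at power-on; 0xAA marks "garbage". */
    memset(ram, 0xAA, sizeof ram);

    /* Layout: ram[0..3) is .data, ram[3..6) is .bss, ram[6..8) is unrelated. */
    copy_section(flash_data, &ram[0], &ram[3]);
    clear_section(&ram[3], &ram[6]);

    return ram[0] == 'A' && ram[2] == 'C' &&  /* .data initialized  */
           ram[3] == 0 && ram[5] == 0 &&      /* .bss cleared       */
           ram[6] == 0xAA;                    /* nothing overwritten past .bss */
}
```

Checking the byte just past .bss is a cheap way to catch an off-by-one in either loop.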
CRT0.S
.text
.global _start
_start:
/* 1. Turn off the watchdog */
ldr r0, =0x53000000
ldr r1, =0
str r1, [r0]
/* 2. Set the clock */
/* 2.1 Set LOCKTIME(0x4C000000)=0xFFFFFFFF */
ldr r0, =0x4C000000
ldr r1, =0xFFFFFFFF
str r1, [r0]
/* 2.2 Set CLKDIVN(0x4C000014) = 0x5 FCLK : HCLK : PCLK = 400m : 100m : 50m*/
ldr r0, =0x4C000014
ldr r1, =0x5
str r1, [r0]
/* 2.3 Set the CPU to asynchronous mode */
mrc p15,0,r0,c1,c0,0
orr r0,r0,#0xc0000000 /* #R1_nF:OR:R1_iA */
mcr p15,0,r0,c1,c0,0
/* 2.4 Set MPLLCON(0x4C000004)=(92<<12) | (1 << 4) | (1 << 0)
* m = MDIV + 8 = 100
* p = PDIV + 2 = 3
* s = SDIV = 1
* Mpll = (2 * m * Fin) / (p * 2 ^ s) = (2 * 100 * 12) / (3 * 2 ^ 1) = 400MHZ
*/
ldr r0, =0x4C000004
ldr r1, =(92<<12) | (1 << 4) | (1 << 0)
str r1, [r0]
/* Once MPLLCON is written, the CPU waits for the lock time until the PLL output is stable
* Then the CPU operates at the new frequency FCLK
*/
/* 3. Set up the stack
* Automatically distinguish NOR boot from NAND boot
* Write 0 to address 0 and read it back. If the value changed, it is NAND boot
* (address 0 is on-chip SRAM); otherwise it is NOR boot (NOR Flash cannot be written like memory)
*/
ldr r0, =0
ldr r1, [r0] /* read out the original value backup*/
str r0, [r0] /* write 0 to address 0 */
ldr r2, [r0] /* read again*/
cmp r1, r2
ldr sp, =0x40000000 + 4096 /* nor start*/
movne sp, #4096 /* nand start */
strne r1, [r0] /* restore the original value */
// Initialize the SDRAM memory controller
bl sdram_init
bl copyDataSection
bl clearBSS
bl main
halt:
b halt
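The boot-mode probe in step 3 is compact in assembly, so here is the same logic as a host-side C sketch. It is not part of the project: addr0_model and the function names are hypothetical, and the model simply assumes that a write to address 0 takes effect under NAND boot (on-chip SRAM) and is ignored under NOR boot. Like the assembly, it also assumes that the word originally stored at address 0 is nonzero, which holds because it is the first instruction of the program.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the word at address 0. */
struct addr0_model {
    uint32_t value;
    bool writable;   /* true: on-chip SRAM (NAND boot); false: NOR Flash */
};

static uint32_t model_read(const struct addr0_model *m)
{
    return m->value;
}

static void model_write(struct addr0_model *m, uint32_t v)
{
    if (m->writable) {
        m->value = v;   /* a plain store only sticks in SRAM */
    }
}

/* Mirrors the probe in CRT0.S: returns true for NAND boot. */
bool probe_is_nand_boot(struct addr0_model *m)
{
    uint32_t saved = model_read(m);          /* ldr r1, [r0] */
    model_write(m, 0);                       /* str r0, [r0] */
    bool changed = (model_read(m) != saved); /* ldr r2, [r0]; cmp r1, r2 */
    if (changed) {
        model_write(m, saved);               /* strne r1, [r0] */
    }
    return changed;
}

/* Initial stack pointer chosen by CRT0.S for each boot mode. */
uint32_t initial_sp(bool nand_boot)
{
    return nand_boot ? 4096u : 0x40000000u + 4096u;
}
```

In both cases the stack starts at the top of the 4KB on-chip SRAM; only the address at which that SRAM appears differs.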
main.c:
#include "myprintf.h"
#include "uart.h"
#include "util.h"
char gCh = 'A';
char gCh1;
int main(void) {
uart0_init();
printf("%s\n\r", "NorFlash Relocate Test.");
while(1) {
gCh++;
printf("%c(0x%x)", gCh, gCh);
wait(800000);
}
return 0;
}
Burn the compiled BIN file into Nor Flash and start the development board. The main function can now modify the global variable normally, as shown in the following figure:
Improving the efficiency of the above code
As the copy function shows, we copy only one byte at a time. However, the SDRAM data bus of the JZ2440 is 32 bits wide, so copying byte by byte is inefficient. We therefore copy 4 bytes at a time and modify the relocate.c file as follows:
relocate.c
extern unsigned int _data_offset;
extern unsigned int _data_start;
extern unsigned int _data_end;
extern unsigned int _bss_start;
extern unsigned int _bss_end;
void copyDataSection(void){
volatile unsigned int *dataLMA = &_data_offset;
volatile unsigned int *start = &_data_start;
volatile unsigned int * end = &_data_end;
while(start < end){ /* _data_end is one past the last byte of .data */
*start = *dataLMA;
start++;
dataLMA++;
}
}
void clearBSS(void){
volatile unsigned int* start = &_bss_start;
volatile unsigned int* end = &_bss_end;
while(start < end){
*start = 0;
start++;
}
}
Compile and burn the program to the development board again, start it in Nor mode, and power on to observe.
From the animation above, we can see that the global variable gCh, which should start at 'A' (0x41), is now 0: it has been cleared. Why is this? Why does a value in the .data segment get cleared when the .bss segment is cleared?
Let us look at the disassembly, including the code that clears the .bss segment, as shown below:
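A small host-side model can make this failure easier to reason about. It is only a sketch under one assumption, which the hardware analysis later in this article arrives at as well: a 32-bit store has the two low bits of its address ignored. The array sdram and both function names are hypothetical; offset 0 plays the role of gCh at 0x30000000, and bss_start is the offset at which clearBSS writes its first word.

```c
#include <stdint.h>
#include <string.h>

/* Model of a 32-bit store whose two low address bits are ignored
 * (assumption of this sketch). */
void store_word_ignoring_low_bits(unsigned char *mem, uint32_t offset,
                                  uint32_t value)
{
    uint32_t aligned = offset & ~(uint32_t)3;  /* force 4-byte alignment */
    memcpy(&mem[aligned], &value, sizeof value);
}

/* Returns the value of gCh after the first word-sized write of a
 * .bss clear that starts at offset bss_start. */
unsigned char gch_after_word_clear(uint32_t bss_start)
{
    /* Offset 0: gCh = 'A' in .data. The remaining bytes are filler. */
    unsigned char sdram[8] = { 'A', 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55 };

    store_word_ignoring_low_bits(sdram, bss_start, 0);
    return sdram[0];
}
```

With bss_start equal to 1 the function returns 0, because the write lands on the word that also holds gCh. With bss_start equal to 4 it returns 'A'.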
Disassembly of section .data:
30000000 <_data_start>:
30000000: 41 .byte 0x41
Disassembly of section .bss:
30000001 ...
00000d2c:
d2c: e52db004 push {fp} ; (str fp, [sp, #-4]!)
d30: e28db000 add fp, sp, #0 ; 0x0
d34: e24dd00c sub sp, sp, #12 ; 0xc
d38: e59f3040 ldr r3, [pc, #64] ; d80
d3c: e50b300c str r3, [fp, #-12]
d40: e59f303c ldr r3, [pc, #60] ; d84
d44: e50b3008 str r3, [fp, #-8]
d48: ea000005 b d64
d4c: e51b200c ldr r2, [fp, #-12]
d50: e3a03000 mov r3, #0 ; 0x0
d54: e5c23000 strb r3, [r2]
d58: e51b300c ldr r3, [fp, #-12]
d5c: e2833001 add r3, r3, #1 ; 0x1
d60: e50b300c str r3, [fp, #-12]
d64: e51b200c ldr r2, [fp, #-12]
d68: e51b3008 ldr r3, [fp, #-8]
d6c: e1520003 cmp r2, r3
d70: 3afffff5 bcc d4c
d74: e28bd000 add sp, fp, #0 ; 0x0
d78: e8bd0800 pop {fp}
d7c: e12fff1e bx lr
d80: 30000001 .word 0x30000001
d84: 30000002 .word 0x30000002
From this assembly code, we can see that it clears the address from 0x30000001 to the byte before 0x30000002. The address of the data segment .data is at 0x30000000, so in theory it seems that it will not be cleared.
If we look at the circuit diagram of SDRAM again, we can see that:
The lowest two bits of the address line sent by the CPU are not connected to the SDRAM chip. Because two SDRAM chips form a 32-bit data bus, the CPU addresses the memory according to 4 bytes. The lowest two bits of the address line are ignored and default to 0.
So if 4 bytes of data are accessed at a time, no matter the access address is 0x30000001, 0x30000002 or 0x30000003, the CPU will eventually access the four bytes of data starting at 0x30000000. Therefore, the data in the data segment is cleared when the .bss segment is cleared.
Now that the cause of the problem has been found, the solution is simple. We proactively align the data segment and BSS segment to 4 bytes in the link script, and there will be no problem. Modify the link script as follows:
relocate.lds:
SECTIONS {
.text 0 : {*(.text)}
.rodata : {*(.rodata)}
/* 4-byte alignment */
. = ALIGN(4);
_data_offset = .;
.data 0x30000000 : AT(_data_offset) {
_data_LMA = LOADADDR(.data);
_data_start = .;
*(.data)
_data_end = .;
}
/* 4-byte alignment */
. = ALIGN(4);
_bss_start = .;
.bss : { *(.bss) }
_bss_end = .;
}
Recompile, start in Nor mode, power on and observe, and it returns to normal.
Relocate the entire program
When our development board is started in Nand mode and the BIN file exceeds 4KB, the entire program needs to be relocated, including the code segment and the data segment. To this end, we need to introduce a concept: position-independent code.
Position-independent code (PIC) is code that can work normally no matter where it is loaded into any address space.
So how do you write position-independent programs?
Use b or bl relative jump instructions when calling a program
Before relocation, you cannot use absolute addresses and cannot access global or static variables.
Arrays with initializers are not accessible (because these initializers are placed in the .rodata segment, which is position-dependent and not on the stack)
After relocation, you need to use an absolute jump instruction to jump to the runtime address (link address) to start execution, such as ldr pc, =main
Since we have not yet touched upon the Nand Flash operation experiment (next article), and Nand Flash cannot read data like accessing memory, we still burn the BIN file to Nor Flash first, and start it in Nor Flash mode to relocate the entire program.
Jump Instructions
When calling the main function in an assembly file, note that you must use an absolute jump instruction:
ldr pc, =main
However, position-independent instructions such as bl main cannot be used, otherwise it will still run in Nor Flash or SRAM.
Because the ldr pc, =main instruction does not change the return address stored in the lr register, before jumping to the main function, lr must be modified to point to halt, otherwise when returning from the main function, the ldr pc, =main instruction will be executed again, as shown below:
ldr lr, =halt
ldr pc, =main
All the files are as follows:
We list the main source files:
CRT0.S
.text
.global _start
_start:
/* 1. Turn off the watchdog */
ldr r0, =0x53000000
ldr r1, =0
str r1, [r0]
/* 2. Set the clock */
/* 2.1 Set LOCKTIME(0x4C000000)=0xFFFFFFFF */
ldr r0, =0x4C000000
ldr r1, =0xFFFFFFFF
str r1, [r0]
/* 2.2 Set CLKDIVN(0x4C000014) = 0x5 FCLK : HCLK : PCLK = 400m : 100m : 50m*/
ldr r0, =0x4C000014
ldr r1, =0x5
str r1, [r0]
/* 2.3 Set the CPU to asynchronous mode */
mrc p15,0,r0,c1,c0,0
orr r0,r0,#0xc0000000 /* #R1_nF:OR:R1_iA */
mcr p15,0,r0,c1,c0,0
/* 2.4 Set MPLLCON(0x4C000004)=(92<<12) | (1 << 4) | (1 << 0)
* m = MDIV + 8 = 100
* p = PDIV + 2 = 3
* s = SDIV = 1
* Mpll = (2 * m * Fin) / (p * 2 ^ s) = (2 * 100 * 12) / (3 * 2 ^ 1) = 400MHZ
*/
ldr r0, =0x4C000004
ldr r1, =(92<<12) | (1 << 4) | (1 << 0)
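The difference between the relative b/bl instructions and ldr pc, =main can be checked with a little arithmetic. The sketch below decodes the 24-bit offset of an ARM-state b/bl encoding and computes the branch target from the address at which the instruction executes. arm_branch_target is a hypothetical helper name; the two encodings used in the usage note come from the disassembly listed earlier (ea000005 at d48 and 3afffff5 at d70).

```c
#include <stdint.h>

/* Target of an ARM-state B/BL instruction `insn` executing at address `pc`.
 * The low 24 bits hold a signed word offset, relative to pc + 8. */
uint32_t arm_branch_target(uint32_t insn, uint32_t pc)
{
    uint32_t imm24 = insn & 0x00FFFFFFu;

    if (imm24 & 0x00800000u) {
        imm24 |= 0xFF000000u;      /* sign-extend the 24-bit offset */
    }
    /* Unsigned wraparound gives the correct result for negative offsets. */
    return pc + 8u + (imm24 << 2);
}
```

arm_branch_target(0xea000005, 0xd48) gives 0xd64 and arm_branch_target(0x3afffff5, 0xd70) gives 0xd4c, matching the disassembly. The same first encoding executed at 0x30000d48 gives 0x30000d64: the target moves together with the code. That is why bl main keeps running wherever the code currently sits, while ldr pc, =main loads the link address of main and therefore moves execution to the relocated copy.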