Why is PC = PC + 8 in both the 3-stage and 5-stage ARM pipelines?

Publisher: WanderlustSoul  Latest update time: 2016-08-08  Source: eefocus  Keywords: ARM
First, let's introduce the prerequisite knowledge

ARM7 uses a three-stage pipeline
(1) Fetch:
The task of the fetch stage is to read instructions from the program memory.
(2) Decode:
The decode stage completes the analysis of the instruction and prepares the control signals required by the data path for the next cycle. At this stage, the instruction occupies the decoding logic and does not occupy the data path.
(3) Execute:
The execute stage performs the operation required by the instruction and writes the result back to a register as needed. The instruction occupies the data path: the register file is read, an operand is shifted in the barrel shifter, and the ALU produces the result, which is written back to the destination register. The ALU also updates the condition flags of the status register according to the instruction and the result of the operation.

ARM9 uses a five-stage pipeline
(1) Fetch

Instructions are fetched from memory and placed into the instruction pipeline.

(2) Decode

The instruction is decoded and the register operands are read from the register file. There are three operand read ports in the register file, so most ARM instructions can read their operands in 1 cycle.

(3) Execute

One of the operands is shifted and the result is produced in the ALU. If the instruction is a load or store instruction, the address of the memory is calculated in the ALU.

(4) Buffer/data

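If necessary, the data memory is accessed; otherwise the ALU result is simply buffered for one clock cycle so that all instructions have the same pipeline flow.

(5) Write-back

The results produced by the instruction, including any data loaded from memory, are written back to the register file.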

--------------------------------------------------------

Note that in ARM7, fetch and execute are separated by a decode stage that does not read any registers. Operands, including the PC, are read in the execute stage, by which time the PC has been incremented twice, so the value read is the current instruction's address + 8.
In ARM9, the decode stage between fetch and execute is no longer idle in this respect: it already reads register operands from the register file. At that point the PC has been incremented only once, so
what would be read is the current instruction's address + 4, not + 8. The execute stage does not read it again, so the instruction would execute with PC = its own address + 4.
To maintain backward compatibility, the ARM9 five-stage pipeline feeds the incremented PC value from the fetch stage directly to the register file in the decode stage, bypassing the pipeline register between the two stages. The value the decode stage obtains is therefore the next instruction's address + 4, which equals the current instruction's address + 8.
When the instruction reaches the execute stage, the PC value it uses = the current instruction's address + 8.
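The reasoning above can be checked with a small model. This is only a sketch under simplifying assumptions (no stalls, no branches, ARM state with 4-byte instructions); the function names are made up for illustration and do not correspond to any real tool or to the actual hardware signals.

```python
INSTR_SIZE = 4  # ARM-state instructions are 4 bytes wide


def fetch_addr_when_in_stage(instr_addr, stage_index):
    """Address being fetched while the instruction at instr_addr sits in a stage.

    stage_index 0 is fetch, 1 is decode, 2 is execute. Assumes the fetch
    address advances by one instruction per cycle (no stalls or branches).
    """
    return instr_addr + stage_index * INSTR_SIZE


def pc_read_3stage(instr_addr):
    # ARM7 model: R15 is read as an operand in the execute stage (index 2).
    return fetch_addr_when_in_stage(instr_addr, 2)


def pc_read_5stage(instr_addr, bypass=True):
    # ARM9 model: operands are read in the decode stage (index 1).
    fetch_addr = fetch_addr_when_in_stage(instr_addr, 1)
    if bypass:
        # The already incremented fetch address is forwarded straight to
        # the decode stage instead of the value held in the pipeline register.
        return fetch_addr + INSTR_SIZE
    return fetch_addr


for addr in (0x0, 0x4, 0x8):
    print(hex(addr), hex(pc_read_3stage(addr)), hex(pc_read_5stage(addr)))
```

With the bypass, both models return the instruction address + 8; with `bypass=False` the five-stage model returns the instruction address + 4, which is the incompatibility the text describes.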

When the STR or STM instruction is used to store R15, the value stored may be the current instruction address plus 8 or the current instruction address plus 12.
Which one is used depends on the specific chip design. A given chip uses only one of the two: it stores either the current instruction address plus 8 or the current instruction address plus 12. Program developers should avoid using STR or STM on R15 where possible. When this is unavoidable, a short code sequence can first determine which behavior the chip implements. For example:
    SUB R1, PC, #4    ; Store the address of the STR instruction in R1
    STR PC, [R0]      ; Use STR to save PC to the address pointed to by R0;
                      ; value stored = STR instruction address + offset (offset is 8 or 12)
    LDR R0, [R0]      ; Read back the value STR instruction address + offset
    SUB R0, R0, R1    ; Subtract the STR instruction address from that value
                      ; to get the offset (8 or 12)
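To see why this sequence yields the offset, the four instructions can be mimicked step by step. The following is a rough model, not a real emulator: `chip_store_offset` stands in for the implementation-defined behavior of the chip, and the scratch address `0x1000` held in R0 is an arbitrary assumption.

```python
def detect_store_offset(str_addr, chip_store_offset):
    """Mimic the probe for a chip whose STR stores PC as str_addr + chip_store_offset."""
    memory = {}
    sub_addr = str_addr - 4       # the SUB sits immediately before the STR

    # SUB R1, PC, #4 : reading PC as an operand gives the SUB address + 8,
    # so R1 ends up holding the address of the STR instruction.
    r1 = (sub_addr + 8) - 4

    r0 = 0x1000                   # hypothetical scratch address in R0

    # STR PC, [R0] : the stored value is implementation defined (+8 or +12).
    memory[r0] = str_addr + chip_store_offset

    # LDR R0, [R0] : read the stored value back.
    r0 = memory[r0]

    # SUB R0, R0, R1 : what remains is the offset the chip applied.
    return r0 - r1


print(detect_store_offset(0x8004, 8))
print(detect_store_offset(0x8004, 12))
```

The result does not depend on where the sequence is placed in memory, because the STR address cancels out in the final subtraction.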

 

=============================

 

ARM7 uses a three-stage pipeline with a von Neumann architecture, and ARM9 uses a five-stage pipeline with a Harvard architecture.

The ARM7 pipeline consists of fetch, decode, and execute. ARM7 does not read operand registers in the decode stage; they are read in the execute stage, two cycles after the instruction was fetched. So the PC value an instruction sees in the execute stage is related to the address it was fetched from by: PC (execute) = PC (fetch) + 8.

 

The ARM9 pipeline consists of fetch, decode, execute, buffer/data, and write-back. ARM9 already reads operand registers in the decode stage, one cycle after the instruction was fetched, where PC(decode) = PC(fetch) + 4. One cycle later, in the execute stage, PC(execute) = PC(decode) + 4, but by then the operands have already been read, so without further measures an instruction reading the PC would get PC(fetch) + 4.

 

To keep the ARM9 pipeline compatible with the ARM7 pipeline, ARM9 sends the incremented PC value from the fetch stage directly to the decode stage, bypassing the pipeline register between fetch and decode. This ensures that the PC value an instruction reads still satisfies: PC (execute) = PC (fetch) + 8.
Take the following disassembly of start.S in U-Boot as an example:

 


00000000 <_start>:
   0: ea000014 b 58
   4: e59ff014 ldr pc, [pc, #20] ; 20 <_undefined_instruction>
   8: e59ff014 ldr pc, [pc, #20] ; 24 <_software_interrupt>
   c: e59ff014 ldr pc, [pc, #20] ; 28 <_prefetch_abort>
  10: e59ff014 ldr pc, [pc, #20] ; 2c <_data_abort>
  14: e59ff014 ldr pc, [pc, #20] ; 30 <_not_used>
  18: e59ff014 ldr pc, [pc, #20] ; 34 <_irq>
  1c: e59ff014 ldr pc, [pc, #20] ; 38 <_fiq>
00000020 <_undefined_instruction>:
  20: 00000120 .word 0x00000120
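The addresses in the disassembler comments (20, 24, ... 38) follow from the encoding e59ff014 and the PC + 8 rule. As a hedged sketch, the helper below (a made-up name, handling only the immediate-offset, PC-relative form of LDR/STR) extracts the 12-bit offset and adds it to the instruction address + 8.

```python
def ldr_literal_address(instr_addr, encoding):
    """Address accessed by an ARM 'LDR/STR Rd, [PC, #+/-imm12]' instruction."""
    is_single_transfer = ((encoding >> 26) & 0x3) == 0b01
    register_offset = (encoding >> 25) & 1      # I bit: 0 means immediate offset
    rn = (encoding >> 16) & 0xF                 # base register number
    if not is_single_transfer or register_offset or rn != 15:
        raise ValueError("not a PC-relative immediate-offset load/store")

    imm12 = encoding & 0xFFF                    # 0x014 = 20 for e59ff014
    up = (encoding >> 23) & 1                   # U bit: 1 adds, 0 subtracts
    pc = instr_addr + 8                         # PC as read by the instruction
    return pc + imm12 if up else pc - imm12


# Addresses of the seven 'ldr pc, [pc, #20]' vector entries in the listing.
VECTORS = [0x4, 0x8, 0xC, 0x10, 0x14, 0x18, 0x1C]
for addr in VECTORS:
    print(hex(addr), "->", hex(ldr_literal_address(addr, 0xE59FF014)))
```

For the entry at address 4 this gives 4 + 8 + 20 = 0x20, the location of the `_undefined_instruction` word in the listing.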

 

 

 

The following is a detailed description of what the CPU does in each instruction cycle:

Before looking at the specific explanation below, there is one sentence to keep in mind, that is:

The PC does not point to the instruction you are running;

The PC always points to the address of the instruction you want to fetch.

Once you have a clear understanding of this premise, the examples that follow will be easy to understand.

 

 

Instruction cycle Cycle1

(1) Instruction fetch:

PC always points to the address of the instruction to be fetched (what is often called the address of the next instruction), and the current PC = 4,

so the instruction at physical address 4 is "ldr pc, [pc, #20]", whose machine code is e59ff014.

After the instruction is fetched, the value of PC is updated automatically, that is, PC = PC + 4 (a single instruction occupies 4 bytes, so add 4) = 4 + 4 = 8

 

 

Instruction cycle Cycle2

(1) Decode: decode the instruction e59ff014 fetched in cycle 1;

(2) At the same time, fetch the instruction:

PC always points to the address of the instruction to be fetched (what is often called the address of the next instruction), and the current PC = 8,

so the instruction at physical address 8 is "ldr pc, [pc, #20]", whose machine code is e59ff014.

After the instruction is fetched, the value of PC is updated automatically, that is, PC = PC + 4 = 8 + 4 = 12 = 0xc

 

 

Instruction cycle Cycle3

(1) Execute: execute "e59ff014", that is, "ldr pc, [pc, #20]". The address to load from is

PC + 20

= 12 + 20

= 32

= 0x20

Here the execution unit has only calculated the load address 0x20 and holds it in an internal buffer. The value that will be written to PC is the word stored at 0x20, which is 0x00000120 in the listing above; PC has not been changed yet.

(2) Decode: decode e59ff014, the instruction fetched from address 8 in cycle 2.

(3) Instruction fetch:

Since this step is performed in parallel with the execution in (1) above, it is not affected and fetching continues. At the time of fetching, PC holds the value updated in the previous cycle, i.e. PC = 0xc, so the instruction fetched is "ldr pc, [pc, #20]" at physical address 0xc, whose machine code is e59ff014.
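The three cycles walked through above can be reproduced with a short trace. This is a simplified sketch of a three-stage pipeline that ignores the branch at address 0 and any pipeline flush; the function name and the dictionary layout are assumptions made for illustration.

```python
def trace_three_stage(first_fetch_addr, cycles):
    """Return, per cycle, the PC (fetch address) and the addresses in decode and execute.

    None marks a stage that holds no instruction yet.
    """
    pc = first_fetch_addr
    decode = execute = None
    rows = []
    for _ in range(cycles):
        # The PC value during the cycle is the address being fetched; this is
        # also the value an instruction in the execute stage reads as R15.
        rows.append({"pc": pc, "decode": decode, "execute": execute})
        # Advance the pipeline: decode moves to execute, fetch moves to decode.
        execute, decode = decode, pc
        pc += 4
    return rows


rows = trace_three_stage(0x4, 3)
for number, row in enumerate(rows, start=1):
    print("Cycle", number, row)

# In cycle 3 the instruction from address 4 executes while PC = 0xc,
# so its load address is PC + 20.
print(hex(rows[2]["pc"] + 20))
```

In cycle 3 the executing instruction came from address 4 while PC is 0xc, which is the "instruction address + 8" relationship the article sets out to explain.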
