1 Introduction
In fast-paced embedded system design, a popular approach is to integrate application software or a soft IP platform into an FPGA, which simplifies the design process and shortens time to market. To this end, many companies have released their own development platforms and associated CPU IP cores. These fall into two common types. The first is the general-purpose CPU, such as the 32-bit and 64-bit general-purpose CPU cores from Xilinx and Altera. The second is the dedicated type, most commonly CPU cores for the 51-series microcontrollers. At present, however, nearly all microcontroller soft cores are 8051 cores, and very few other varieties exist. Moreover, the 8051 is not very fast, and it falls short in fast control applications such as using a microcontroller as a USB 2.0 control component. The Core8051 from Actel, for example, has an operating frequency of only about 40 MHz. This article introduces the design of a high-speed DS80C320 microcontroller soft core.
The DS80C320 is a high-performance microcontroller based on the 51 architecture, launched by Dallas Semiconductor.
It has the following advantages:
ⅰ. It has the same instruction set as the 51 series and is fully compatible with all programs developed for the 51 series.
ⅱ. It has a more complete set of peripherals than the 8051. Compared with the 8051, the DS80C320 adds timer 2 and an enhanced serial port.
ⅲ. It is more efficient than the 8051. One instruction cycle of the DS80C320 takes 4 clocks, whereas one of the 8051 takes 12. The difference is most pronounced for simple instructions: a single-cycle instruction needs only 4 clocks on the DS80C320 but 12 on the 8051. According to figures from Dallas, at the same clock frequency each instruction executes 1.5 to 3 times faster on the DS80C320 than on the 8051, and for typical applications the overall execution speed is about 2.5 times that of the 8051.
ⅳ. Its instruction fetch method suits an IP core better than that of the 8051. The internal ROM of the microcontroller is removed and instructions are read entirely from outside. This is well suited to a soft core. First, the structure is simple, which eases instruction fetching. Second, it removes the limit imposed by the size of an internal ROM. Finally, in an FPGA implementation even an 8051 design places its internal ROM block in the ROM resources of the FPGA chip, so it is better to put the ROM outside the core directly and simplify both timing and structure.
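The clock counts in item ⅲ can be turned into a small calculation. The sketch below is only an illustration: the 24 MHz clock and the function name `instruction_time_ns` are assumptions, not values from the design.

```python
# Clocks per instruction (machine) cycle, as stated in the text above.
CLOCKS_PER_CYCLE = {"8051": 12, "DS80C320": 4}


def instruction_time_ns(core, cycles, clock_mhz):
    """Execution time in nanoseconds of an instruction that takes
    `cycles` instruction cycles on `core` at a clock of `clock_mhz` MHz."""
    clocks = CLOCKS_PER_CYCLE[core] * cycles
    return clocks * 1000.0 / clock_mhz


# Hypothetical 24 MHz clock, single-cycle instruction on both cores.
t_8051 = instruction_time_ns("8051", 1, 24.0)
t_ds = instruction_time_ns("DS80C320", 1, 24.0)
print(f"8051: {t_8051:.1f} ns, DS80C320: {t_ds:.1f} ns, "
      f"ratio: {t_8051 / t_ds:.2f}")
```

For a single-cycle instruction the ratio is simply 12/4 = 3, the upper end of the 1.5 to 3 range quoted above. The lower figures in that range suggest that some instructions do not keep the same cycle count on both cores.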
2 Overall structural division
Figure 1 shows the overall functional block diagram of the DS80C320 soft core:
Figure 1 DS80C320 functional block diagram
The functional blocks of this IP core are arranged mainly according to the instruction execution flow, and they exchange data over the data bus. The area inside the dotted line is the CPU core.
The first block is the ROM module. The DS80C320 has no internal ROM, so this module mainly reads instructions through the P ports and analyzes them. By looking up the length and cycle count of each instruction, it derives the relevant control signals and sends them to the CPU control module to control instruction fetching. If the instruction is LCALL or ACALL, it also extracts the subroutine entry address and passes it to the PC module so that the PC jumps correctly.
While the ROM module analyzes the instruction, the decoder (DECODER) decodes it. From the 8-bit instruction code it derives three key parameters: the ALU operation type, the source and read method of the operands, and the storage location and write method of the result. The first parameter is sent to the ALU module, and the other two are sent to the CPU control module.
The CPU control module (CPU_CON) is the core of the entire CPU. It performs two main functions: controlling data reads before the ALU executes, and controlling data write-back after the ALU finishes. It also controls the timing of the whole CPU and monitors the execution of the other modules. The ALU mainly performs the computation.
The INTER module controls the interrupt system. It validates and prioritizes the interrupt requests submitted by each interrupt source, generates the interrupt flags, and submits the arbitration result and the interrupt entry address code to the ROM module to direct the program jump. It is also responsible for clearing the interrupt flag after the interrupt completes and for restoring the interrupt level that was in effect before the interrupt.
The DS80C320 has three timers and two serial ports, of which timer 2 and the serial port can be trimmed out if they are not needed. The remaining modules and registers exchange data over the data bus under the control of the CPU control module.
In summary, the design uses CPU_CON to control the execution and timing of the whole CPU and INTER to control the whole interrupt system. The registers are distributed evenly on both sides of the data bus and use it to exchange data. The structure is clear and simple, and the regular layout helps increase speed and makes the core easier to trim.
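The arbitration role described for the INTER module can be illustrated with a short behavioral sketch. This is not the logic of the soft core. The function `arbitrate`, the numeric priority levels, and the sample request list are hypothetical, and the vector addresses are the conventional 8051 ones, used only as sample data.

```python
def arbitrate(pending, current_level):
    """Pick the request to service next.

    pending: list of (name, level, vector) tuples in natural polling order.
    current_level: level of the handler now running, or -1 if none.
    Returns the chosen tuple, or None if nothing may interrupt.
    """
    best = None
    for request in pending:
        level = request[1]
        if level <= current_level:
            continue  # cannot preempt a handler of the same or higher level
        if best is None or level > best[1]:
            best = request  # strict '>' keeps the earlier source on a tie
    return best


# Hypothetical pending requests: (source, priority level, entry address).
pending = [("INT0", 0, 0x0003), ("TIMER0", 1, 0x000B), ("SERIAL0", 1, 0x0023)]

level_stack = [-1]                      # -1: no interrupt in service
chosen = arbitrate(pending, level_stack[-1])
if chosen is not None:
    level_stack.append(chosen[1])       # entering the handler raises the level
    print(f"jump to 0x{chosen[2]:04X} for {chosen[0]}")
level_stack.pop()                       # returning restores the previous level
```

The level stack mirrors the requirement that the interrupt level in effect before the interrupt is restored once the handler finishes.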
3 Some design features
3.1 Timing design
The documentation for the DS80C320 describes only the timing of the external interface and does not explain how signals are handled internally, so the internal timing had to be planned from scratch. This soft core is based on a detailed analysis of the DS80C320 timing, treats the device as a black box, and adds pipelining techniques. The timing design is as follows.
For the execution of an ordinary instruction, the internal timing is divided as follows:
Figure 2 DS80C320 internal timing diagram
This is the execution of a single-byte, single-cycle instruction. Decoding of the instruction and lookup of its length and cycle tables begin at the rising edge of C1. At the same time, the data bus carries the result of the previous instruction, which is being written back.
At the rising edge of C2, control of the data bus and address bus passes to the current instruction. The address bus now carries the address of the data to be read, and the data bus switches from sending data to receiving data. The CPU control module performs this action.
At the rising edge of C3, the selected module reads the relevant data according to the address bus and control bus and drives it onto the data bus. Within the following clock period the ALU receives the data.
At the rising edge of C4, data processing begins. At the same time, the CPU control module changes the contents of the address bus and control bus again and issues a write signal. This prompts the module that was selected for reading to release the data bus, while the module selected to store the result decodes the write type and prepares to receive data. When the calculation is complete, the ALU places the result on the data bus and waits for C1 of the next cycle, when the result is written to its destination.
In short, this design makes full use of the data bus and of pipelining to reduce an operation that originally required 6 timing steps to 4, so the timing is compact and fast. It also adopts distributed processing, which greatly simplifies the CPU control module: it only issues control signals, and each module decides for itself, from those signals, what function it must perform. This helps avoid local overheating of the chip caused by concentrating too much functionality in one place.
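The overlap described above, where the write-back of one instruction shares C1 with the decode of the next, can be shown with a small behavioral model. It is a sketch that assumes every instruction is single-byte and single-cycle. The function `bus_schedule` and the activity strings are illustrative and are not signal names from the core.

```python
def bus_schedule(instructions):
    """Return (clock, phase, activity) tuples for a run of single-byte,
    single-cycle instructions, following the C1..C4 description above."""
    events = []
    previous = None
    for n, instr in enumerate(instructions):
        base = 4 * n  # each instruction cycle occupies four clocks
        if previous is None:
            write_back = "data bus idle"
        else:
            write_back = f"data bus writes back result of {previous}"
        events.append((base, "C1", f"decode {instr} and look up tables; {write_back}"))
        events.append((base + 1, "C2", f"{instr}: CPU_CON drives the read address"))
        events.append((base + 2, "C3", f"{instr}: selected module drives data bus, ALU takes operand"))
        events.append((base + 3, "C4", f"{instr}: ALU computes, CPU_CON issues write address"))
        previous = instr
    return events


for clock, phase, activity in bus_schedule(["INC A", "NOP"]):
    print(f"clk {clock} {phase}: {activity}")
```

The write-back of the last instruction in the list falls in C1 of the following cycle, so it does not appear in the output.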
3.2 Design of the instruction length and cycle tables
The instruction length table is mainly used to control instruction fetching and to distinguish instruction codes from instruction parameters, while the instruction cycle table is mainly used to control instruction execution time. Together the two tables simplify the control of instruction execution. In normal operation, the ROM module looks up both tables for the instruction it has read, combines the results with the current timing state, generates a series of control signals, and sends them to the CPU control module. The main advantage is that the CPU control module never handles instructions and data directly, which reduces the number of its input and output ports.
The design of the length and cycle tables is closely tied to the lookup method. This design uses its own separately built tables, one for length and one for cycles. The lookup index is index = {lsb_3, ir[7:4]}, where lsb_3 is a 3-bit code compressed from the low-order bits of the instruction (written ir(2 downto 0) in the original notation) by the rule: 8-F => 7, 6-7 => 6, 0-5 unchanged. Both tables use the same lookup method, which simplifies the structure, reduces the search space to 7 bits, and improves lookup speed.
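The index construction can be sketched as follows. It is a minimal illustration, under the assumption that the 8-F, 6-7, 0-5 rule is applied to the low nibble of the opcode, since values up to F need four bits. The function names mirror the text, and no table contents are reproduced because the source does not list them.

```python
def lsb_3(ir):
    """Compress the low nibble of opcode `ir` into a 3-bit code.

    Assumption: the rule is applied to ir[3:0].
    8..F -> 7, 6..7 -> 6, 0..5 unchanged.
    """
    low = ir & 0x0F
    if low >= 8:
        return 7
    if low >= 6:
        return 6
    return low


def table_index(ir):
    """Build the 7-bit index {lsb_3, ir[7:4]}: lsb_3 in the upper three
    bits, the high nibble of the opcode in the lower four bits."""
    return (lsb_3(ir) << 4) | ((ir >> 4) & 0x0F)


# Opcodes 0xE8..0xEF (MOV A,Rn) differ only in the register number,
# so they all collapse onto one table entry.
print({hex(op): table_index(op) for op in range(0xE8, 0xF0)})

# All 256 opcodes map into a 128-entry table.
print(len({table_index(op) for op in range(256)}))
```

Under this assumption, instructions that differ only in the register operand share one table entry, which is what lets a 7-bit index cover the whole 8-bit opcode space.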
3.3 The role of PC change encoding
Inside the microcontroller, the PC changes constantly. Every jump instruction has to change the contents of the PC, and interrupt handling also has to push the PC onto the stack and pop it off again. In some designs the handling of the PC is therefore extremely complicated, essentially specifying the PC change in detail for each individual instruction. This design instead encodes the PC changes to improve speed. The first step is to check whether such an encoding is feasible. Although many instructions can change the contents of the PC, from the point of view of the PC itself, apart from the normal increment by 1, there are only the following changes:
Here, pmem1 and pmem2 are instruction parameters supplied by the ROM module, and PC_OUT is the PC content saved on the stack.