In today's fast-moving embedded system design, a popular approach is to integrate application software or a soft IP platform into an FPGA, which simplifies the design process and shortens time to market. To this end, many companies have released their own development platforms and CPU IP cores. There are two common types: general-purpose CPUs, such as the 32-bit and 64-bit general-purpose CPU cores from Xilinx and Altera, and special-purpose CPUs, most commonly the CPU core of a 51-series microcontroller. However, microcontroller soft cores are almost all 8051 cores, with few other varieties, and the 8051 is not very fast. In fast control applications (for example, using a microcontroller as a USB 2.0 control component) it falls short. The well-known Core8051 from Actel, for instance, runs at only about 40 MHz. This article introduces the design of a high-speed DS80C320 microcontroller soft core.
The DS80C320 is a high-performance microcontroller from DALLAS based on the 51 architecture.
It has the following advantages:
i. Its instruction set is identical to that of the 51 series, so it is fully compatible with all programs developed for the 51 series.
ii. It has more complete peripherals than the 8051. Compared with the 8051, the DS80C320 adds Timer 2, an enhanced serial port, and more.
iii. It is more efficient than the 8051. One instruction cycle of the DS80C320 takes 4 CLKs, while one instruction cycle of the 8051 takes 12. The difference is most pronounced for simple instructions: a single-cycle instruction needs only 4 CLKs on the DS80C320 but 12 on the 8051. According to DALLAS, at the same clock frequency each instruction executes 1.5 to 3 times faster on the DS80C320 than on the 8051, and typical applications run about 2.5 times faster.
iv. Its instruction fetch method suits an IP core better than that of the 8051. The microcontroller has no internal ROM, and all instructions are fetched from outside. This is well suited to a soft core. First, the structure is simple, which favors a pipelined instruction fetch. Second, it removes the limit imposed by the size of an internal ROM. Finally, in an FPGA even an 8051 design has to place its internal ROM block in the ROM resources of the FPGA chip, so it is simpler to put the program memory outside the core directly, which simplifies both timing and structure.
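The clock counts in item iii can be checked with a short calculation. The sketch below is illustrative only: the function name and the instruction list are assumptions made for this example, and the machine-cycle counts in the table are assumed values that should be verified against the 8051 and DS80C320 datasheets.

```python
# Clocks per machine cycle, as stated in the text above.
CLKS_PER_CYCLE_8051 = 12
CLKS_PER_CYCLE_DS80C320 = 4

# (instruction, 8051 machine cycles, DS80C320 machine cycles)
# Assumed example values; verify against the datasheets.
EXAMPLE_INSTRUCTIONS = [
    ("NOP", 1, 1),
    ("MOV direct, direct", 2, 3),
    ("MOVX A, @DPTR", 2, 2),
]

def speedup(cycles_8051, cycles_ds80c320):
    """Return how many times faster the DS80C320 executes one instruction
    at the same oscillator frequency."""
    clocks_8051 = cycles_8051 * CLKS_PER_CYCLE_8051
    clocks_ds = cycles_ds80c320 * CLKS_PER_CYCLE_DS80C320
    return clocks_8051 / clocks_ds

for name, c51, cds in EXAMPLE_INSTRUCTIONS:
    print(f"{name}: {speedup(c51, cds):.1f}x")
```

With these assumed counts the per-instruction speed-up lies between 2x and 3x, inside the 1.5x to 3x range quoted above. An instruction that needs more machine cycles on the DS80C320 than on the 8051 gains less than the full factor of 3.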
2 Overall structural division
Figure 1 shows the overall functional block diagram of the DS80C320 soft core:
Figure 1 DS80C320 functional block diagram
The functional blocks of this IP core are arranged along the flow of instruction execution and exchange data over the data bus. The part enclosed by the dotted line is the CPU core.
The first block is the ROM module. The DS80C320 has no internal ROM, so this module mainly analyzes the instructions read from the P port. By looking up the length and cycle count of the instruction, it derives the relevant control signals and sends them to the CPU control module to control instruction fetching. If the instruction is LCALL or ACALL, it also extracts the subroutine entry address and reports it to the PC module so that the PC jumps correctly.
While the ROM module analyzes the instruction, the decoder (DECODER) decodes it. From the 8-bit instruction code it derives three parameters: the ALU operation type; the source of the operands and how they are read; and the storage location and storage method of the result. The first parameter is sent to the ALU module, and the other two are sent to the CPU control module.
The CPU control module (CPU_CON) is the core of the entire CPU. It performs two main functions: controlling data reads before the ALU executes, and controlling data write-back after the ALU has finished. It also controls the timing of the entire CPU and monitors the other modules. The ALU mainly performs the calculations.
The INTER module controls the interrupt system. It validates and prioritizes the interrupt requests submitted by the various interrupt sources, generates the interrupt flags, and submits the result together with the interrupt entry address code to the ROM module to direct the program jump. It also clears the interrupt flag after the interrupt completes and restores the interrupt level that was in effect before the interrupt.
The DS80C320 has three timers and two serial ports, of which Timer 2 and the serial port can be removed if they are not needed. The other modules and registers exchange data over the data bus under the control of the CPU control module. In summary, the design uses CPU_CON to control the execution and timing of the entire CPU and INTER to control the entire interrupt system, while the remaining registers exchange data over the data bus and are distributed evenly on both sides of it. The structure is clear and simple, and this regularity helps both to raise the speed and to make the core easy to trim.
3 Some design features
3.1 Timing Design
The DS80C320 documentation describes only the timing of the external interface, not the internal signal execution, so the internal timing had to be planned from scratch. This soft core analyzes the DS80C320 timing in detail, treats the chip as a black box, adds pipelining, and arranges its timing as follows.
For the execution of common instructions, the internal timing is divided as follows:
Figure 2 DS80C320 internal timing diagram
Figure 2 shows the execution of a single-byte, single-cycle instruction. At the rising edge of C1, the instruction is decoded and its entries in the length and cycle tables are looked up; at the same time, the result of the previous instruction is being written back over the data bus. At the rising edge of C2, control of the data bus and the address bus passes to the current instruction: the address bus carries the address of the data to be read, and the data bus is ready to transfer data. This step is performed by the CPU control module. At the rising edge of C3, the selected module reads the requested data according to the address bus and the control bus and drives it onto the data bus. During the following clock period the ALU receives the data, and at the rising edge of C4 it starts processing it. At the same time, the CPU control module changes the address bus and the control bus again: it issues a write signal so that the module selected for reading releases the data bus, while the module selected to store the result analyzes the write type and prepares to receive data. When the calculation is complete, the ALU places the result on the data bus and waits for C1 of the next cycle, when the result is written to its destination.
In short, this design makes full use of the data bus and of pipelining, reducing an operation that would otherwise need 6 time slots to 4, which gives compact timing and high speed. It also adopts distributed processing, which greatly simplifies the CPU control module: the control module only issues control signals, and each module decides for itself which function to perform based on those signals. This helps avoid local overheating of the chip caused by concentrating too much functionality in one place.
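The overlap between the write-back of one instruction and the decode of the next can be made concrete with a small schedule model. This is a sketch under stated assumptions, not the design's HDL: the stage descriptions paraphrase the text above, the function name `schedule` is hypothetical, and every instruction is assumed to be single-byte and single-cycle.

```python
# Actions of one instruction, in the order of the clock phases where they start.
# The fifth entry happens in C1 of the following machine cycle.
STAGES = [
    "decode + length/cycle table lookup",        # C1
    "CPU_CON drives read address",               # C2
    "selected module puts operand on data bus",  # C3
    "ALU executes; CPU_CON sets up the write",   # C4
    "result written back",                       # C1 of next cycle
]

def schedule(instructions):
    """Map each clock index to the (instruction, action) pairs active in it,
    issuing one instruction every 4 clocks."""
    timeline = {}
    for n, name in enumerate(instructions):
        start = 4 * n  # clock index of this instruction's C1
        for offset, action in enumerate(STAGES):
            timeline.setdefault(start + offset, []).append((name, action))
    return timeline

timeline = schedule(["I0", "I1", "I2"])
for clock in sorted(timeline):
    phase = f"C{clock % 4 + 1}"
    for name, action in timeline[clock]:
        print(f"clk {clock:2d} ({phase}): {name}: {action}")
```

In this model each instruction touches five clock slots, but because the write-back shares C1 with the next instruction's decode, a new instruction still starts every 4 clocks.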
3.2 Design of the instruction length and cycle tables
The instruction length table is mainly used to control instruction fetching and to distinguish instruction codes from instruction parameters, while the instruction cycle table is mainly used to control how long an instruction executes. Together the two tables simplify the control of instruction execution. In general, the ROM module looks up the tables for the instruction it has read, combines the lookup results with the timing conditions, generates a set of control signals, and sends them to the CPU control module. The main benefit is that the CPU control module never has to handle instructions and data directly, which reduces the number of its input and output ports.
The design of the length and cycle tables is closely tied to how they are read. This design builds them separately, as two tables. The lookup index is index = {lsb_3, ir[7:4]}, where lsb_3 is a 3-bit code obtained from the low-order bits of the instruction (written ir(2 downto 0) in the design, although the ranges that follow cover the full low nibble 0-F) by the rule: 8-F => 7, 6-7 => 6, 0-5 unchanged. Both tables use the same lookup method, which simplifies the structure, reduces the search space to 7 bits, and speeds up the lookup.
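The index computation can be sketched as follows. This is a minimal illustration, not the author's HDL: the function names are assumptions, and the compression rule is applied to the low nibble ir[3:0], on the assumption that the ranges 8-F and 6-7 refer to a 4-bit value.

```python
def compress_low_nibble(low):
    """Map the low nibble of an opcode to the 3-bit code lsb_3.
    Rule from the text: 8-F -> 7, 6-7 -> 6, 0-5 unchanged."""
    if low >= 0x8:
        return 7
    if low >= 0x6:
        return 6
    return low

def table_index(ir):
    """Build the 7-bit index {lsb_3, ir[7:4]} for an 8-bit opcode."""
    lsb_3 = compress_low_nibble(ir & 0x0F)
    high = (ir >> 4) & 0x0F
    return (lsb_3 << 4) | high

# Every one of the 256 opcodes lands in a 7-bit index space.
assert all(0 <= table_index(op) < 128 for op in range(256))
print(f"{table_index(0x32):07b}")  # opcode 32H: lsb_3 = 2, high nibble = 3
```

The compression fits the regular layout of the 8051 opcode map, where low nibbles 8-F select the register operands R0-R7 and low nibbles 6-7 select the indirect operands @R0 and @R1, so the opcodes within each group can share one table entry.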
3.3 The role of the PC change code
In a microcontroller the PC changes constantly. All jump instructions change the content of the PC, and interrupt handling also has to push and pop the PC. As a result, PC handling in some designs is extremely complicated, with the PC change essentially specified in detail for every instruction. This design uses an encoding technique to speed this up. The first step is to check whether encoding is feasible. Although many instructions can change the content of the PC, from the point of view of the PC itself, apart from the normal increment, there are only the following ways to change:
Among them, pmem1 and pmem2 are instruction parameters that come from the ROM module, and PC_OUT is the PC content saved on the stack.
The remaining question is which module issues this code. For jump and interrupt instructions, each instruction has its own jump condition, and these conditions have to be evaluated one by one. This design uses the ALU module to generate the code: the ALU already has to evaluate the operation while computing, so a small amount of extra logic is enough to issue the code. The PC coding method greatly simplifies the PC module and makes the design more regular.
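The idea of a PC change code can be sketched as a simple dispatch. The list of PC update modes is not reproduced in this text, so the code names and values below (PC_INC, PC_ABS16, PC_REL, PC_STACK) are hypothetical and serve only to illustrate the principle; only pmem1, pmem2 and PC_OUT come from the description above.

```python
from enum import Enum

class PcCode(Enum):
    # Hypothetical codes; the design's actual encoding is not given here.
    PC_INC = 0    # normal sequential increment
    PC_ABS16 = 1  # load a 16-bit address: pmem1 = high byte, pmem2 = low byte
    PC_REL = 2    # add a signed 8-bit offset taken from pmem1
    PC_STACK = 3  # reload the PC value saved on the stack (PC_OUT)

def next_pc(pc, code, pmem1=0, pmem2=0, pc_out=0, length=1):
    """Return the next 16-bit PC for one hypothetical PC change code."""
    if code is PcCode.PC_INC:
        result = pc + length
    elif code is PcCode.PC_ABS16:
        result = (pmem1 << 8) | pmem2
    elif code is PcCode.PC_REL:
        # Interpret pmem1 as a two's-complement 8-bit offset.
        offset = pmem1 - 256 if pmem1 & 0x80 else pmem1
        result = pc + offset
    elif code is PcCode.PC_STACK:
        result = pc_out
    else:
        raise ValueError(f"unknown PC change code: {code}")
    return result & 0xFFFF
```

With such a scheme the PC module only has to select one of a few sources, while the decision about which code applies, including the jump condition, is made elsewhere (in this design, by the ALU). For example, `next_pc(0x003F, PcCode.PC_STACK, pc_out=0x0025)` returns `0x0025`.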
3.4 Emulation of the bidirectional P port
This section deals mainly with the bidirectional ports P0 and P2. On a typical microcontroller the P ports are bidirectional, but with current FPGA architectures a truly bidirectional signal cannot be implemented inside the chip. A soft core therefore has to emulate the bidirectional behavior carefully. There are several common solutions. One is to split each bidirectional port into two unidirectional ports, which is more convenient for users of the soft core. This design offers that option, but it differs from a standard microcontroller, so the design also provides an emulated bidirectional port.
In an FPGA, changing the direction of a signal line always involves a switching process, so the instruction timing has to be analyzed carefully to see whether the switch can be fitted into the gaps between uses of the P port. The first step is to determine whether the instruction uses the P port at all. The key control signals are RD_LATCH, issued by the decoder, which indicates whether the instruction uses the P port, and the control bus information from CPU_CON, which tells the P port which specific function to perform. If the alternate (multiplexed) function of the P port is needed, the module that needs the port, such as the serial port module, sends a request. The P port then evaluates all requests and handles each one according to how the port is to be used. If a direction switch is required, it is scheduled according to the timing and the characteristics of the instruction, which completes the bidirectional switch.
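One way to picture the difference between the two options is a port modeled as separate output, output-enable and input signals, with the value on the pin derived from them. The sketch below is an assumption for illustration only: the class and attribute names are hypothetical, and it models plain tri-state behavior rather than the design's actual switching logic or its RD_LATCH timing.

```python
class EmulatedPort:
    """8-bit port split into out / output-enable / in signals."""

    def __init__(self):
        self.out = 0xFF   # value the core wants to drive
        self.oe = 0x00    # per-bit output enable: 1 = core drives the pin
        self.pin_in = 0   # value last sampled from the pins

    def drive(self, value, mask=0xFF):
        """Core write: drive the bits selected by mask with value."""
        self.out = value & 0xFF
        self.oe = mask & 0xFF

    def release(self):
        """Switch all bits to input, e.g. in the gap before the next read."""
        self.oe = 0x00

    def resolve(self, external):
        """Value seen on the pins: the core wins where oe is set,
        the external device is seen on all other bits."""
        pins = (self.out & self.oe) | (external & ~self.oe & 0xFF)
        self.pin_in = pins
        return pins
```

In the two-port option the user connects `out`, `oe` and `pin_in` directly; in the emulated bidirectional option the equivalent of `release()` has to be scheduled in a gap between uses of the port, which is why the instruction timing analysis above is needed.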
4 Synthesis and Verification
We used Altera's Quartus II 4.2 software for synthesis, and the Nios Development Board and Cyclone Edit development board for on-board verification. The synthesis results are as follows:
Among them, the former is the version without the internal serial port. Timing simulation shows that the system works stably at the above frequency. Converted to an equivalent 8051 clock frequency, this is theoretically 83 × 2.5 = 207.5 MHz, which is sufficient for most applications that call for microcontroller control.
Simulation testing mainly used ModelSim SE 5.8 and Quartus 4.2 VWF file tests, and on-board waveforms were observed mainly with Agilent's 1673G logic analyzer. The resources of the development boards were also used extensively for a large number of system-level tests. With the program downloaded to the chip, the execution waveforms of some instructions, as captured by the logic analyzer, are as follows:
Figure 3 Interrupt instruction waveform
This waveform shows an interrupt return instruction, whose instruction code is 32H. The main point of interest is the change of the PC: after this instruction, the PC changes from 3FH back to 25H, the address before the interrupt occurred.
5 Conclusion
This design offers high speed, scalability, good reusability and portability, full compatibility with the DS80C320 microcontroller interface, and ease of use. In particular, its specially constructed internal framework and timing allocation give it high-speed performance that is essentially among the best of current 51-series soft cores. It can therefore be readily applied to FPGA designs and embedded system designs that require a microcontroller soft core.