As the amount of data to be processed grows, the memory capacity required grows with it. Conventional first-in-first-out (FIFO) buffers can no longer meet the combined demand for high speed and large capacity, so many hardware engineers have begun to consider DRAM instead. DRAM offers fast access, large capacity, and an address space that the designer can partition as needed. However, its memory array must be refreshed periodically, and double data rate synchronous dynamic random access memory (DDR SDRAM) raises issues such as data phase synchronization that are difficult to control, so it is not as convenient to use as a FIFO. For FPGA designs, the current trend in data access control is therefore to combine the RAM controller IP provided by the FPGA vendor with control logic developed by the hardware engineer.
This article proposes adding a wrapper around the DRAM controller IP so that it presents a FIFO interface and provides multi-port memory access (MPMA) control. This keeps the large capacity and fast access of DRAM while adding the ease of use of a FIFO interface, and the DRAM space can still be partitioned flexibly as the designer defines it. As shown in Figure 1, the DRAM has two write ports and two read ports. Each write port writes data continuously from its start address to its end address and then wraps around to the start address, forming a circular write. Each read port reads data out in the same circular manner. As long as the amount of data written to a region stays greater than the amount read out, this is a valid FIFO-like access pattern.
Figure 1 DRAM control slot with two write ports and two read ports
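The circular write and read behavior described above can be sketched as a small behavioral model in Python. This is only an illustration, not the article's HDL: the class names `CircularPort` and `FifoLikeRegion` are hypothetical, the memory is a plain dictionary, and the overflow check on writes is an added assumption (the article only states the condition for reads).

```python
class CircularPort:
    """Address counter that wraps from end_addr back to start_addr."""

    def __init__(self, start_addr, end_addr):
        self.start_addr = start_addr
        self.end_addr = end_addr      # inclusive end of the region
        self.addr = start_addr
        self.count = 0                # total accesses made through this port

    def next_addr(self):
        current = self.addr
        # Wrap around to the start address after the end address is used.
        self.addr = self.start_addr if self.addr == self.end_addr else self.addr + 1
        self.count += 1
        return current


class FifoLikeRegion:
    """One write port and one read port sharing a DRAM region (hypothetical model)."""

    def __init__(self, start_addr, end_addr):
        self.memory = {}              # stands in for the DRAM array
        self.size = end_addr - start_addr + 1
        self.wr = CircularPort(start_addr, end_addr)
        self.rd = CircularPort(start_addr, end_addr)

    def write(self, value):
        # Assumption: refuse to overwrite data that has not been read yet.
        if self.wr.count - self.rd.count >= self.size:
            raise OverflowError("region full: write would overwrite unread data")
        self.memory[self.wr.next_addr()] = value

    def read(self):
        # The read count must stay below the write count.
        if self.rd.count >= self.wr.count:
            raise LookupError("no unread data: read would pass the write port")
        return self.memory[self.rd.next_addr()]


region = FifoLikeRegion(0x100, 0x103)
for v in (10, 20, 30, 40):
    region.write(v)
first_two = [region.read(), region.read()]
region.write(50)                      # wraps around and lands at 0x100
```

The caller never supplies an address, which is what makes the interface FIFO-like even though the storage is an addressable region.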
How MPMA is applied to data processing modules
Many applications that process large amounts of information need a large cache. Given the price of a 4 KB FIFO, buying a 32 Mb DRAM is the better value, but the complex access control of DRAM is a major obstacle. When writing the HDL code for the FPGA, the IP provided by the FPGA vendor can be used as the basis of a solution.
Some applications reuse the same data many times. Consider the raw image data shown in Figure 2: an image error detection algorithm that checks whether point P5 is faulty must compare it against the 8 surrounding points as reference data. With a FIFO, it may not be possible to access the data of all three lines at the same time, so DRAM is used to hold and access this large amount of data.
Figure 2 Array of raw image data points
DRAM control is relatively complex, and with random access the address must be recalculated for every access. Because the image data occupies consecutive addresses, however, once the raw image has been written it can be read out through three ports, each reading consecutive addresses. As shown in Figure 2, the first port reads P0, P1, and P2 in sequence, the second port reads P4, P5, and P6, and the third port reads P8, P9, and P10, which provides everything needed for the error detection of point P5. To check point P6, the first port only needs to read P3, the second port P7, and the third port P11, since the other six values are already on hand; this greatly improves data reuse. With this continuous reading mechanism, no address has to be calculated before each computation: each port simply keeps reading the next data item, which also reduces the complexity of DRAM control.
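The three-port continuous read can be illustrated with a short Python sketch. It assumes the image is stored in row-major order in a flat list and that Figure 2 is 4 points wide (P0 to P3 on the first line); the function name `three_port_windows` is hypothetical, and the sketch models only the access order, not DRAM timing.

```python
from collections import deque


def three_port_windows(dram, width, height):
    """Yield (center_address, 3x3 window) using three sequential read ports.

    dram is a flat list holding the image in row-major order.
    """
    for row in range(1, height - 1):
        # Each port starts at the beginning of its line and only increments.
        port_addr = [(row - 1) * width, row * width, (row + 1) * width]
        columns = deque(maxlen=3)     # the three most recent columns read
        for col in range(width):
            # One read per port brings in one new column of three points.
            columns.append(tuple(dram[a] for a in port_addr))
            port_addr = [a + 1 for a in port_addr]
            if len(columns) == 3:
                window = [[c[r] for c in columns] for r in range(3)]
                yield row * width + col - 1, window


image = [f"P{i}" for i in range(12)]  # 4 points per line, 3 lines
windows = list(three_port_windows(image, width=4, height=3))
# windows[0] is centered on address 5 (P5), windows[1] on address 6 (P6)
```

Moving from P5 to P6 costs exactly one new read per port (P3, P7, P11); the other six points of the window are reused from the previous step.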
Implementation of MPMA
The following example takes the DDR DRAM controller generated by the Altera MegaCore IP Generator and adds custom wrapper logic to build an MPMA access port with one input (32 bits wide) and one output (8 bits wide). Figure 3 shows its block diagram.
Figure 3 One-input and one-output MPMA access port
In this architecture, the data path between the Altera DDR DRAM controller and the write/read wrappers is 64 bits wide, while the input and output widths can be chosen freely through the wrapper logic. Inside the write/read wrappers, data addresses are generated by simple incrementing, and the access interface resembles that of a FIFO, which makes access to large volumes of data easier to implement.
Each wrapper contains a small FIFO, a packing/unpacking mechanism, and an incrementing address counter. The FIFO bridges the clock domains of the user interface and the DRAM. The packing/unpacking mechanism converts the input/output data bus width to the width of the DRAM controller IP interface, which improves the efficiency of DRAM writes and reads. The address counter is the DRAM address generator of each wrapper. As long as the count in the write wrapper is greater than the count in the read wrapper, the data read is guaranteed to be valid data that was previously written to the DRAM, and no data is fetched from a wrong address.
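The packing/unpacking step can be sketched in Python for the widths used in this example (32-bit input, 64-bit DRAM interface, 8-bit output). The function names and the lane order, with the first input placed in the low bits, are assumptions made for illustration; the article does not specify the ordering used in the actual wrapper.

```python
def pack_words(values, in_bits=32, bus_bits=64):
    """Pack narrow input words into bus-wide words (first input in the low bits)."""
    per_word = bus_bits // in_bits
    if len(values) % per_word:
        raise ValueError("input count must fill whole bus words")
    mask = (1 << in_bits) - 1
    packed = []
    for i in range(0, len(values), per_word):
        word = 0
        for lane, value in enumerate(values[i:i + per_word]):
            word |= (value & mask) << (lane * in_bits)
        packed.append(word)
    return packed


def unpack_words(words, out_bits=8, bus_bits=64):
    """Split bus-wide words into narrow output words (low bits first)."""
    mask = (1 << out_bits) - 1
    return [(word >> shift) & mask
            for word in words
            for shift in range(0, bus_bits, out_bits)]


bus_words = pack_words([0x11223344, 0x55667788])   # two 32-bit inputs, one 64-bit word
byte_stream = unpack_words(bus_words)              # eight 8-bit outputs
```

In this model two input words cost a single 64-bit DRAM write, and one 64-bit DRAM read feeds eight output transfers, which is where the efficiency gain on the DRAM side comes from.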
MPMA improves efficiency
Take point P5 in Figure 2 as an example. Without a wrapper, this point is written once and then read 1 + 8 times during the computation: once as the point under test and eight times as reference data for its neighbors. Error detection on an image of n points therefore requires n*(1+1+8) data accesses, not counting the delay caused by address calculation.
With a one-input, three-output MPMA wrapper, point P5 is written once and read 3 times during the computation (once by each of the 3 read wrappers), so the same n-point image needs only n*(1+3) data accesses to complete error detection. Because the DRAM addresses come from incrementing counters, no extra delay is spent on address calculation. The MPMA design thus improves data access efficiency by a factor of 2.5, from 10 accesses per point to 4.
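The two access counts can be checked with a few lines of Python. The function names and the one-million-point image size are hypothetical and serve only to put numbers on the formulas above.

```python
def accesses_without_wrapper(n, neighbors=8):
    """One write, one read as the point under test, one read per neighbor."""
    return n * (1 + 1 + neighbors)


def accesses_with_mpma(n, read_ports=3):
    """One write plus one read by each read wrapper."""
    return n * (1 + read_ports)


n_points = 1_000_000                  # hypothetical image size
plain = accesses_without_wrapper(n_points)
mpma = accesses_with_mpma(n_points)
ratio = plain / mpma                  # independent of n
```

For this image size the count drops from 10,000,000 accesses to 4,000,000, and the ratio of 2.5 holds for any n.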
Conclusion
This article proposes an architecture that adds write/read wrappers to the IP core provided by the FPGA vendor, giving a high data reuse rate and an easy-to-use FIFO-like interface. Designers can also define the number of MPMA wrapper input/output ports and the data bus width to improve data utilization.
ADLINK has applied this technology to large-scale image acquisition, processing, and transmission modules, especially line-scan image acquisition systems. During inspection, these systems run image processing algorithms on the FPGA, which requires a large image data cache and repeated reads of the same image data, so this technique is essential.