Memory access method suitable for image detection and compression system-EEWORLD

To combine scalability with data processing speed, applications such as image data error detection, video data compression, audio gain control, and motor control need programmable data processing modules.

As the amount of data to be processed grows, the required memory capacity grows with it. A conventional first-in-first-out (FIFO) buffer can no longer meet the demand for both high speed and large capacity, so many hardware engineers have begun to consider DRAM. DRAM offers fast access, large capacity, and an address space that can be partitioned according to the designer's plan. However, its memory array must be refreshed periodically, and double data rate synchronous dynamic random access memory (DDR SDRAM) adds problems such as data phase synchronization that is difficult to control, so it is not as convenient to use as a FIFO. For FPGA designs, the current trend in data access control is therefore to combine the RAM controller IP provided by the FPGA supplier with control logic developed by the hardware engineer.

The idea of this article is to add a wrapper to the DRAM controller IP so that it presents a FIFO interface and provides multi-port memory access (MPMA) control. The result keeps the large capacity and fast access of DRAM while adding the ease of use of a FIFO interface. The DRAM space also remains flexible, because the designer defines how it is divided. As shown in Figure 1, this DRAM has two write ports and two read ports. Each write port writes its data continuously from its start address to its end address, then wraps around and continues from the start address, forming a circular write. Each read port reads its data out in the same circular fashion. As long as the amount of data written to the memory is greater than the amount read out, this is a valid FIFO-like access method.

Figure 1 DRAM control slot with two write ports and two read ports
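
The circular access scheme can be sketched in a few lines of Python. This is a minimal behavioral model, not HDL and not the article's implementation: the class name `CircularPort`, the helper `can_read`, and the address range 0x100 to 0x103 are assumptions made for illustration, and the model pairs a single write port with a single read port.

```python
class CircularPort:
    """Address generator for one port; wraps from end_addr back to start_addr."""

    def __init__(self, start_addr, end_addr):
        if end_addr < start_addr:
            raise ValueError("end_addr must not be below start_addr")
        self.start_addr = start_addr
        self.end_addr = end_addr   # inclusive upper bound of the region
        self.addr = start_addr     # next address this port will access
        self.count = 0             # total number of accesses issued so far

    def next_addr(self):
        """Return the current address and advance, wrapping at end_addr."""
        current = self.addr
        self.addr = self.start_addr if current == self.end_addr else current + 1
        self.count += 1
        return current


def can_read(write_port, read_port):
    """FIFO-like rule from the text: more items written than read."""
    return write_port.count > read_port.count


# Hypothetical 4-word region shared by one write port and one read port.
dram = {}
wr = CircularPort(0x100, 0x103)
rd = CircularPort(0x100, 0x103)

write_addrs = []
read_back = []
for value in range(6):
    addr = wr.next_addr()
    dram[addr] = value
    write_addrs.append(addr)
    if can_read(wr, rd):
        read_back.append(dram[rd.next_addr()])

print([hex(a) for a in write_addrs])
print(read_back)
```

The fifth and sixth writes wrap back to the start address, and the data still comes out in the order it went in. The model does not guard against the write port overrunning unread data; a real design would need to handle that case.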

How MPMA is applied to data processing modules

Many applications that process large amounts of information need a large cache. Given the price of a 4KB FIFO, buying a 32Mb DRAM is the more economical choice, but the complex access control of DRAM is a major obstacle. When writing the HDL design for the FPGA, the IP provided by the FPGA supplier can therefore be used as the basis of a solution.

Consider an application that reads the same data repeatedly, such as the raw image data shown in Figure 2. To decide whether point P5 is faulty, the image error detection algorithm compares it with the 8 points around it as reference data. With a FIFO, it may not be possible to access the data of these three lines at the same time, so DRAM is used to hold and access the large amount of data.

Figure 2 Array of raw image data points

DRAM control is relatively complex, and with random access the data address must be recalculated on every access. The wrapper instead exploits the continuity of the data addresses: after the raw image data has been written, it is read out through three ports, each using consecutive addresses. As shown in Figure 2, the first port reads P0, P1, and P2 in sequence, the second port reads P4, P5, and P6, and the third port reads P8, P9, and P10, which supplies everything needed for the error detection of point P5. To check point P6, the first port only needs to read P3, the second port P7, and the third port P11, and all the data for that calculation is available. This greatly improves data utilization. With this continuous reading mechanism, no address has to be calculated before each operation, because each port simply keeps reading consecutive data, and the complexity of DRAM control is reduced as well.
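
The three-port read pattern can be illustrated with a small Python model. It is a sketch under assumptions: the row width of 4 is inferred from the P0 to P11 labels, and the names `SequentialReader`, `step`, and the three-entry shift registers are hypothetical, not part of the article's design.

```python
from collections import deque

WIDTH = 4  # assumed row length, inferred from P0..P11 laid out in three rows

# Model DRAM as a list: address i holds the label of pixel Pi.
dram = [f"P{i}" for i in range(3 * WIDTH)]


class SequentialReader:
    """Read port that only ever increments its address."""

    def __init__(self, memory, start):
        self.memory = memory
        self.addr = start

    def read(self):
        value = self.memory[self.addr]
        self.addr += 1
        return value


# One read port per image line, each starting at the first pixel of its line.
ports = [SequentialReader(dram, row * WIDTH) for row in range(3)]
# Each port feeds a 3-entry shift register holding its most recent pixels.
rows = [deque(maxlen=3) for _ in range(3)]


def step():
    """Every port reads one pixel; return the current 3x3 window."""
    for port, row in zip(ports, rows):
        row.append(port.read())
    return [list(row) for row in rows]


step()
step()
window_p5 = step()  # three reads per port: window centered on P5
window_p6 = step()  # one more read per port: window centered on P6

for line in window_p5:
    print(line)
```

After the initial three reads per port, moving the window by one pixel costs a single read per port, and no port ever computes anything beyond an address increment.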

Implementation of MPMA

The following example takes the DDR DRAM controller generated by the Altera MegaCore IP Generator and adds custom wrapper logic to build an MPMA access port with one input (32 bits wide) and one output (8 bits wide). Figure 3 shows its block diagram.

Figure 3 One-input and one-output MPMA access port

In this architecture, the data path between the Altera DDR DRAM controller and the write/read wrappers is 64 bits wide, while the input and output widths can be set freely through the wrapper logic. Inside the write/read wrappers, data addresses are generated by simple incrementing, and the access interface resembles that of a FIFO, which makes large-capacity data access easier to implement.

Each wrapper contains a small FIFO, a packing/unpacking mechanism, and an incrementing address counter. The FIFO bridges the clock domains of the user interface and the DRAM. The packing/unpacking mechanism converts the input/output data bus width to the width of the DRAM controller IP interface, which improves the efficiency of DRAM writes and reads. The address counter is the DRAM address generator of its wrapper. As long as the count in the write wrapper is greater than the count in the read wrapper, the data read is guaranteed to be valid data previously written to the DRAM, and no data at a wrong address is accessed.
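
A minimal Python sketch of the packing/unpacking step and the counter check for the 32-bit-in, 8-bit-out example follows. It is not the article's HDL: the function and class names are hypothetical, and the ordering (first 32-bit word in the low half, lowest byte out first) is an assumption, since the article does not specify it. The clock-domain FIFO is omitted.

```python
MASK32 = 0xFFFFFFFF


def pack_32_to_64(words32):
    """Combine pairs of 32-bit words into 64-bit words (first word in low half)."""
    if len(words32) % 2:
        raise ValueError("need an even number of 32-bit words")
    return [
        (words32[i] & MASK32) | ((words32[i + 1] & MASK32) << 32)
        for i in range(0, len(words32), 2)
    ]


def unpack_64_to_8(words64):
    """Split each 64-bit word into eight bytes, lowest byte first."""
    return [(word >> (8 * k)) & 0xFF for word in words64 for k in range(8)]


class WrapperModel:
    """Write and read address counters over a dictionary standing in for DRAM."""

    def __init__(self):
        self.dram = {}
        self.write_count = 0  # address counter of the write wrapper
        self.read_count = 0   # address counter of the read wrapper

    def write(self, word64):
        self.dram[self.write_count] = word64
        self.write_count += 1

    def read(self):
        # Legal only while the write counter is ahead of the read counter.
        if self.write_count <= self.read_count:
            raise RuntimeError("read counter would overtake write counter")
        word64 = self.dram[self.read_count]
        self.read_count += 1
        return word64


model = WrapperModel()
for word64 in pack_32_to_64([0x44332211, 0x88776655]):
    model.write(word64)

out_bytes = unpack_64_to_8([model.read()])
print([hex(b) for b in out_bytes])
```

Two 32-bit input words fill one 64-bit DRAM word, and one DRAM read supplies eight output bytes, so each side of the wrapper touches the DRAM controller only at its full bus width.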

MPMA improves efficiency

Take point P5 in Figure 2 as an example. Without a wrapper, the data for this point is written once and then read 1 time (as the point under test) plus 8 times (as a reference point for its neighbors) during the calculation. Checking an image of n data points for errors therefore requires n*(1+1+8) data accesses, not counting the delay caused by address calculation.

With a one-input, three-output MPMA wrapper, point P5 is written once and read only 3 times during the calculation (once by each of the 3 read wrappers). The same n data points then need only n*(1+3) data accesses to complete the error detection, and because the DRAM addresses are generated by incrementing, no extra address calculation delay is incurred. The access count drops from 10n to 4n, so the MPMA design improves data access efficiency by a factor of 2.5.
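
The access counts can be checked with a short calculation. The function names and the image size below are hypothetical and serve only to illustrate the two formulas given in the text.

```python
def accesses_without_wrapper(n, neighbours=8):
    """One write, one read as the point under test, one read per neighbour."""
    return n * (1 + 1 + neighbours)


def accesses_with_mpma(n, read_ports=3):
    """One write plus one read by each read wrapper."""
    return n * (1 + read_ports)


n = 1024 * 768  # hypothetical number of data points in the image
ratio = accesses_without_wrapper(n) / accesses_with_mpma(n)
print(accesses_without_wrapper(n), accesses_with_mpma(n), ratio)
```

The ratio is independent of n, since both counts scale linearly with the number of data points.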

Conclusion

This article proposes an architecture that adds a write/read wrapper to the IP core provided by the FPGA supplier, which has the advantages of high data reuse rate and easy-to-operate FIFO-like interface. Designers can also define the number of MPMA wrapper input/output ports and data bus width to improve data utilization.

ADLINK has applied this technology to large-scale image acquisition, processing, and transmission modules, especially line-scan image acquisition systems. During inspection, these systems run image data algorithms on FPGAs, which requires large-capacity image data caches and repeated reading of image data, so this technology is essential to them.
