In applications such as image communication, remote sensing image analysis, and medical imaging diagnosis, the original digital image often has to undergo feature extraction (for example edge detection and edge sharpening), noise smoothing, geometric correction, and similar operations before it can be conveniently displayed, inspected, or processed further. Processing of this kind is called image preprocessing. In practical applications, spatial domain filtering algorithms are widely used for image preprocessing.
The spatial domain filtering algorithm is a type of image enhancement technique that processes the pixels of an image directly, without first transforming the image into another domain. Common filtering operators such as sharpening operators, high-pass operators, and smoothing operators can perform edge extraction, noise removal, and other processing. Although these operators have different functions, they are implemented in a similar way: all of them use the template convolution method.
The rapid development of VLSI technology provides a hardware foundation for real-time digital image processing technology, and the characteristics of FPGA (Field Programmable Gate Array) make it very suitable for digital image processing. This paper studies the design of hardware circuits on the FPGA design platform to implement the spatial domain filtering algorithm of digital images.
1 Digital Image Spatial Domain Filtering Algorithm
The implementation steps of the digital image spatial domain filtering algorithm are shown in Figure 1. The left part is a part of the image to be processed, and the middle is the 3×3 template for processing the image.
The specific processing steps are:
Move the template on the image and make the center of the template coincide with a certain pixel position in the image;
Multiply the coefficient on the template with the corresponding pixel under the template;
Add all the products together;
Assign the sum (the template output response) to the pixel in the image that corresponds to the center position of the template.
Figure 1 shows part of an image, where S0~S8 are the grayscale values of the pixels and K0~K8 are the 3×3 template coefficients. To filter with this 3×3 template, the center of the template (the point whose coefficient is K0) is aligned with the pixel whose grayscale value is S0, and the template output response R is:
R=K0*S0+K1*S1+…+K8*S8 (1)
In this way, the gray value of the pixel at the original position (x, y) of the enhanced image changes from S0 to R. If the template operation is performed on each pixel in the image, the new gray value of the enhanced image at all positions can be obtained. If different values are given to the template coefficients when designing the filter, different high-pass and low-pass effects can be obtained.
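The four steps and formula (1) can be sketched in software as follows. This is a minimal Python illustration, not the hardware design. The function name apply_template, the list-of-rows image representation, the row-major coefficient layout with the center coefficient at template[1][1], and the choice to leave border pixels unchanged are all assumptions made for the example, because the layout in Figure 1 and the border handling are not given in the text. The Laplacian-style template in the usage lines is a common textbook high-pass kernel and is not taken from this paper.

```python
def apply_template(image, template):
    """Apply a 3x3 template to a grayscale image stored as a list of rows."""
    height, width = len(image), len(image[0])
    # Assumption: border pixels, where the template does not fit, are copied unchanged.
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            response = 0
            # Multiply each coefficient by the pixel under it and add the products.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    response += template[dy + 1][dx + 1] * image[y + dy][x + dx]
            # The response R replaces the pixel under the template center.
            out[y][x] = response
    return out


# Usage: a hypothetical 3x4 image with one bright pixel.
image = [
    [10, 10, 10, 10],
    [10, 50, 10, 10],
    [10, 10, 10, 10],
]
laplacian = [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]
result = apply_template(image, laplacian)
print(result[1][1])  # 4*50 - (10 + 10 + 10 + 10) = 160
print(result[1][2])  # 4*10 - (10 + 10 + 50 + 10) = -40
```

The responses are not clamped to the 0 to 255 range here. A real design would have to decide how to scale or saturate R before writing it back as a gray value.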
The image used in this paper is a 256×256 grayscale image, and the filter template size is 3×3. How should a hardware circuit be designed to carry out the spatial domain filtering algorithm described above? Analyzing the algorithm shows that it can be described by three third-order FIR filters plus delay units. Each row of the 3×3 template acts as one such filter along an image row, and the delay units line up three adjacent rows so that the three filter outputs can be summed.
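The decomposition into three FIR filters plus delay units can be checked with a small software model. The sketch below is only an illustration under stated assumptions: pixels arrive as a raster-order stream, each delay unit holds exactly one image row, and the names fir3 and filter_stream are hypothetical. It does not claim to reproduce the circuit of this paper.

```python
def fir3(coeffs, samples):
    """Three-tap FIR: y(n) = a*x(n) + b*x(n-1) + c*x(n-2), zero initial state."""
    a, b, c = coeffs
    x1 = x2 = 0
    out = []
    for x in samples:
        out.append(a * x + b * x1 + c * x2)
        x2, x1 = x1, x
    return out


def filter_stream(image, template):
    """3x3 template filtering built from three row FIR filters and two line delays."""
    height, width = len(image), len(image[0])
    stream = [p for row in image for p in row]            # raster-order pixel stream
    delayed1 = [0] * width + stream[:-width]              # stream delayed by one row
    delayed2 = [0] * (2 * width) + stream[:-2 * width]    # stream delayed by two rows
    # x(n) is the rightmost pixel of the window, so each template row is reversed.
    bottom = fir3(template[2][::-1], stream)
    middle = fir3(template[1][::-1], delayed1)
    top = fir3(template[0][::-1], delayed2)
    total = [t + m + b for t, m, b in zip(top, middle, bottom)]
    # total[y*width + x] is the response of the window whose bottom-right
    # corner is (y, x), that is, whose center is (y-1, x-1).
    out = [row[:] for row in image]   # assumption: border pixels stay unchanged
    for y in range(2, height):
        for x in range(2, width):
            out[y - 1][x - 1] = total[y * width + x]
    return out


# Usage with a hypothetical 3x4 image and a textbook Laplacian-style template.
image = [
    [10, 10, 10, 10],
    [10, 50, 10, 10],
    [10, 10, 10, 10],
]
laplacian = [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]
print(filter_stream(image, laplacian)[1][1])  # 160
```

Only outputs with x >= 2 and y >= 2 are used, so the samples where a row filter straddles two image rows never reach the result.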
2 FPGA Design of FIR Digital Filter
When designing the circuit of three third-order FIR filters + delay units to implement the spatial filtering algorithm, the main issues to be considered are: how to shorten the critical path in hardware circuit design and improve the data throughput of the system. In order to solve these key issues in actual FPGA design, the following aspects are considered when designing the specific circuit:
2.1 FIR Digital Filter and Pipeline Structure
Pipelining technology is widely used in modern microprocessors, digital signal processors, and high-speed digital system designs. Its core design idea is to divide the logic operations performed in one cycle into several smaller operations and complete them in multiple high-speed clock cycles. The result of each small logic operation is stored in a register, synchronized by a high-speed clock, and used in the next pipeline unit. Therefore, it is one of the most commonly used technologies in speed optimization and can greatly improve the overall operating speed of digital systems.
The following analyzes the basic structure of the third-order FIR filter, the FIR structure after adopting pipeline technology, and the data broadcast structure of the FIR filter.
A third-order (three-tap) finite impulse response (FIR) digital filter can be expressed as follows:
y(n)=ax(n)+bx(n-1)+cx(n-2) (2)
The structure of this third-order FIR filter is shown in Figure 2.
In Figure 2, the critical path (minimum time to process a new sample point) of the FIR filter of this structure is limited by the time of one multiplier and two adders. If the sampling period is less than this minimum time, then the FIR filter of this structure cannot meet the requirements. At this time, pipelining technology should be considered. The critical path can be shortened by using pipeline technology, as shown in Figure 3.
In a pipelined FIR filter, when the current iteration is started, the adder at node 2 is completing the calculation of the previous iteration result. Therefore, the critical path is shortened from the time of one multiplier and two adders to the time of one multiplier and one adder.
When pipeline latches are inserted to shorten the critical path, they cannot be placed arbitrarily. The data flow graph must be cut so that every edge crossing the cut runs in the same, forward direction, and a latch is placed on each of those edges. A pipeline added in this way does not change the function of the circuit. In Figure 3, latches are added to both the upper and lower paths along the forward direction of the data flow, so the logic of the FIR filter is preserved. The speed (clock period) of a structure is usually determined by the longest path between any two latches, between an input and a latch, between a latch and an output, or between an input and an output. Pipeline latches effectively shorten this longest path.
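The effect of the pipeline latches can be illustrated with a cycle-by-cycle software model of formula (2). The sketch below assumes one plausible latch placement, because Figure 3 is not reproduced here: one latch sits on the adder path between the first and second adder, and one on the delay-line path that feeds the third multiplier. The function and variable names are hypothetical. Under this assumption each stage contains at most one multiplier and one adder, and the output is the same sequence delayed by one clock cycle.

```python
def fir3_direct(a, b, c, samples):
    """Direct form: y(n) = a*x(n) + b*x(n-1) + c*x(n-2)."""
    x1 = x2 = 0
    out = []
    for x in samples:
        out.append(a * x + b * x1 + c * x2)
        x2, x1 = x1, x
    return out


def fir3_pipelined(a, b, c, samples):
    """Same filter with two pipeline latches on a forward cut (assumed placement)."""
    x1 = x2 = 0
    sum_latch = 0   # latch on the upper path: holds a*x(n-1) + b*x(n-2)
    x2_latch = 0    # latch on the lower path: holds x(n-3)
    out = []
    for x in samples:
        # Stage 2: one multiplier and one adder, using last cycle's latched values.
        out.append(sum_latch + c * x2_latch)
        # Stage 1: one multiplier (in parallel with another) and one adder.
        partial = a * x + b * x1
        # Clock edge: update the pipeline latches, then the delay line.
        sum_latch, x2_latch = partial, x2
        x2, x1 = x1, x
    return out


samples = [1, 2, 3, 4, 5]
print(fir3_direct(1, 2, 3, samples))     # [1, 4, 10, 16, 22]
print(fir3_pipelined(1, 2, 3, samples))  # [0, 1, 4, 10, 16]
```

The one-cycle latency is the price of the shorter critical path. The values themselves are unchanged, which is what cutting only forward edges guarantees.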
In addition to the two FIR filter structures above, there is also the data broadcast structure, which shortens the critical path by transposing the structure and does not need any pipeline latches. The transformation consists of interchanging the input and output, reversing the direction of the signal flow, and replacing each adder with a branch node and each branch node with an adder. The data broadcast structure FIR digital filter is shown in Figure 4.
In this structure, the input data is not stored in a delay line but is broadcast to all multipliers at the same time. Its critical path is the same as that of the FIR filter structure with pipeline latches shown in Figure 3. However, it needs no additional shift register for the input and no additional pipeline latches for summing the products in order to achieve a high throughput rate. This is the advantage of the FIR filter data broadcast structure.
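A software model of the data broadcast (transposed) form shows why it computes the same formula (2) without an input delay line. This is a minimal sketch with hypothetical names. The registers r1 and r2 stand for the delay elements that sit between the adders after transposition, which is the usual textbook arrangement and is assumed to match Figure 4.

```python
def fir3_broadcast(a, b, c, samples):
    """Transposed-form FIR: every input sample is broadcast to all three multipliers."""
    r1 = r2 = 0   # registers between the adders (partial sums, not input samples)
    out = []
    for x in samples:
        # All three products use the same, current input sample.
        y = a * x + r1            # output adder: one multiplier + one adder
        next_r1 = b * x + r2      # one multiplier + one adder
        next_r2 = c * x           # one multiplier
        out.append(y)
        r1, r2 = next_r1, next_r2  # clock edge
    return out


samples = [1, 2, 3, 4, 5]
print(fir3_broadcast(1, 2, 3, samples))  # [1, 4, 10, 16, 22]
```

Every register-to-register path contains at most one multiplier and one adder, and the output appears without the extra cycle of latency that the pipelined form adds.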
This article adopts the three different FIR digital filter structures introduced above when designing the spatial domain filtering algorithm circuit.
2.2 Hardware Design of Multiplier Module
From the template operation expression of formula (1) and the FIR filter expression of formula (2), it can be seen that there is another important link in completing the template operation and realizing FIR digital filtering, which is the multiplication operation. The multiplier module is one of the key modules that affects the operation speed of the spatial domain filtering algorithm.
Multiplication can be divided into two basic steps: generating all the partial products, and adding them together. A fast multiplier circuit therefore has to improve both steps, by reducing the number of partial products on the one hand and by speeding up the accumulation in the partial product summation array on the other. For this reason, the multiplier circuit uses the radix-4 Booth algorithm to reduce the number of partial products, and a Wallace tree to reduce the carry propagation delay of the partial product addition array in the array multiplier, which speeds up the whole addition array.
The basic principle of the radix-4 Booth algorithm is to encode the multiplier and generate the partial products according to an encoding table, examining only 3 bits at a time: the current bit, the adjacent higher bit, and the adjacent lower bit. The Wallace tree is relatively regular and easy to place and route. Instead of adding the partial products one after another, it adds and merges the bits of equal weight across the partial products. A full adder is normally used to add bits of equal weight, so each layer of the Wallace tree reduces the number of partial product vectors in a 3:2 ratio. Two full adders can also be combined to obtain a 4:2 reduction ratio. In this article, 3:2 counters (full adders) are used to reduce the partial products. When the number of partial products is large, the Wallace tree therefore reduces them very quickly.
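Both ideas can be illustrated with a short software model. The sketch below is not the multiplier circuit of this paper. It assumes a signed two's-complement multiplier of even width, uses the standard radix-4 Booth digit rule (digit = -2*higher bit + current bit + lower bit), and lets Python's unbounded integers stand in for sign-extended bit vectors. All function names and the default width of 8 bits are hypothetical.

```python
def booth_radix4_digits(multiplier, bits=8):
    """Recode a signed multiplier into bits/2 radix-4 Booth digits in {-2,...,2}."""
    if bits % 2 != 0:
        raise ValueError("bit width must be even")
    if not (-(1 << (bits - 1)) <= multiplier < (1 << (bits - 1))):
        raise ValueError("multiplier does not fit in the given width")
    pattern = multiplier & ((1 << bits) - 1)   # two's-complement bit pattern
    digits = []
    lower = 0                                  # implicit 0 to the right of the LSB
    for i in range(0, bits, 2):
        current = (pattern >> i) & 1
        higher = (pattern >> (i + 1)) & 1
        digits.append(-2 * higher + current + lower)
        lower = higher
    return digits


def carry_save_add(a, b, c):
    """Bitwise 3:2 counter (a row of full adders): a + b + c == partial_sum + carry."""
    partial_sum = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return partial_sum, carry


def wallace_reduce(vectors):
    """Reduce the vectors layer by layer with 3:2 counters until two remain."""
    vectors = list(vectors)
    layers = 0
    while len(vectors) > 2:
        full = len(vectors) - len(vectors) % 3
        next_layer = []
        for i in range(0, full, 3):
            next_layer.extend(carry_save_add(*vectors[i:i + 3]))
        next_layer.extend(vectors[full:])      # leftovers pass to the next layer
        vectors = next_layer
        layers += 1
    return vectors, layers


def booth_wallace_multiply(multiplicand, multiplier, bits=8):
    digits = booth_radix4_digits(multiplier, bits)
    # One partial product per Booth digit, weighted by 4**i.
    partial_products = [(d * multiplicand) << (2 * i) for i, d in enumerate(digits)]
    vectors, _ = wallace_reduce(partial_products)
    return sum(vectors)                        # final carry-propagate addition


print(booth_radix4_digits(7))           # [-1, 2, 0, 0]
print(booth_wallace_multiply(-37, 93))  # -3441
```

With an 8-bit multiplier the recoding produces four partial products instead of eight, and the 3:2 reduction brings them down to two vectors in two layers, so only one carry-propagate addition remains at the end.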
3 Simulation and Synthesis Results
Section 2 described how the high-speed FIR digital filters for the digital image spatial domain filtering algorithm were designed and implemented on the FPGA platform, with the main goals of shortening the critical path and improving data throughput. This section compares the simulation and synthesis results. The test image is a 256×256, 8-bit grayscale image. The design environment is the Xilinx ISE integrated development environment, the simulation tool is ModelSim SE 5.8b, the synthesis tool is XST (the synthesizer included with ISE), and the target device is the Xilinx XC2V1000. The simulation and synthesis results show that the designed circuit fully meets the requirements.
The following table compares the synthesis results of the three circuit structures that implement the spatial domain filtering algorithm, each designed according to one of the FIR filter structures described above.
Table 1 shows that, in terms of resource usage, structure 3 has the largest number of equivalent gates and structure 2 the smallest. In terms of delay and maximum frequency, structure 1 is the best.
The structural delay comparison data of the three structures is shown in Table 2.
4 Conclusion
This paper discusses the digital image spatial domain filtering algorithm and the basic design method of FIR filter. Based on the critical path analysis, pipeline design is introduced to improve the operation speed. Three design structures of the filter are proposed and the design process of the filter is given. It can be seen from the simulation and synthesis results that the hardware resources are effectively saved, the hardware volume is greatly reduced, and the reliability of the system is increased.