FPGA Design and Implementation of Digital Image Spatial Domain Filtering Algorithm-EEWORLD

In application fields such as image communication, remote sensing image analysis, and medical imaging diagnosis, the original digital image often needs feature extraction (such as edge detection or edge sharpening), noise smoothing, geometric correction, and similar operations before it can be displayed, inspected, or processed further. This type of image processing is called image preprocessing, and spatial domain filtering algorithms are widely used for it in practice.

The spatial domain filtering algorithm is a type of image enhancement technology that directly processes the pixels of an image without the need for transformation. Common filtering operators such as sharpening operators, high-pass operators, and smoothing operators can perform image edge extraction, noise removal, and other processing. Although these filtering operators have different functions, their implementation methods are similar, and they are all implemented through the template convolution method.

The rapid development of VLSI technology provides a hardware foundation for real-time digital image processing technology, and the characteristics of FPGA (Field Programmable Gate Array) make it very suitable for digital image processing. This paper studies the design of hardware circuits on the FPGA design platform to implement the spatial domain filtering algorithm of digital images.

1 Digital Image Spatial Domain Filtering Algorithm

The implementation steps of the digital image spatial domain filtering algorithm are shown in Figure 1. The left part is a part of the image to be processed, and the middle is the 3×3 template for processing the image.

Figure 1 Implementation steps of the digital image spatial domain filtering algorithm

The specific processing steps are:

Move the template on the image and make the center of the template coincide with a certain pixel position in the image;

Multiply the coefficient on the template with the corresponding pixel under the template;

Add all the products together;

Assign the sum (the template output response) to the pixel in the image corresponding to the center position of the template.

Figure 1 shows a part of the image, where S0~S8 are the grayscale values of the pixels and K0~K8 are the 3×3 template coefficients. To filter with this 3×3 template, the center point of the template, that is, the point whose coefficient is K0, is placed on the pixel with grayscale value S0, and the template output response R is:

R=K0*S0+K1*S1+…+K8*S8 (1)

In this way, the gray value of the pixel at the original position (x, y) of the enhanced image changes from S0 to R. If the template operation is performed on each pixel in the image, the new gray value of the enhanced image at all positions can be obtained. If different values are given to the template coefficients when designing the filter, different high-pass and low-pass effects can be obtained.
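The four steps above can be sketched in software as follows. This is a minimal illustration, not the hardware design of this paper: the function name `apply_template` is hypothetical, the template is assumed to be stored row by row as a 3×3 list (Figure 1 defines the actual placement of K0~K8), and border pixels are simply copied because the text does not say how borders are handled. No clipping or normalization of R is applied.

```python
def apply_template(image, template):
    """Apply a 3x3 template to every interior pixel of a grayscale image.

    image: list of rows of integer gray values.
    template: 3x3 list of coefficients (row-major layout is an assumption).
    Border pixels are copied unchanged (also an assumption).
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Steps 1-3: centre the template on (x, y), multiply each
            # coefficient by the pixel under it, and add the products.
            r = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    r += template[dy + 1][dx + 1] * image[y + dy][x + dx]
            # Step 4: the response R replaces the centre pixel.
            out[y][x] = r
    return out


image = [
    [10, 10, 10, 10],
    [10, 50, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # leaves the image unchanged
box_sum = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]    # unnormalized smoothing

unchanged = apply_template(image, identity)
smoothed = apply_template(image, box_sum)
```

Changing only the coefficients in `template` switches between smoothing (low-pass) and sharpening (high-pass) behavior, which is why a single hardware structure can serve all of these operators.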

The image used in this paper is a grayscale image of size 256×256, and the filter template size is 3×3. The question is how to design a hardware circuit that performs the spatial domain filtering algorithm above. Analyzing the algorithm shows that it can be described by three third-order FIR filters plus delay units: each row of the template acts as one FIR filter, and the delay units align the pixels of adjacent image rows.
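One way to read the "three FIR filters + delay units" description is sketched below. The sketch assumes a raster-scanned pixel stream, line delays of one image width, and the hypothetical helper names `fir3` and `filter_stream`; the actual circuit in this paper may arrange the delays differently.

```python
def fir3(coeffs, samples):
    """Three-coefficient FIR: y(n) = a*x(n) + b*x(n-1) + c*x(n-2)."""
    a, b, c = coeffs
    x1 = x2 = 0  # x(n-1) and x(n-2), zero before the stream starts
    out = []
    for x in samples:
        out.append(a * x + b * x1 + c * x2)
        x2, x1 = x1, x
    return out


def filter_stream(pixels, width, template):
    """Filter a raster-scanned image with two line delays and three FIRs.

    pixels: flat list of gray values, row after row.
    width: number of pixels per image line (the line-delay length).
    template: 3x3 list of coefficients, row-major (an assumption).
    """
    # Delay units: the same stream delayed by one and by two image lines.
    delayed1 = [0] * width + pixels[:-width]
    delayed2 = [0] * (2 * width) + pixels[:-2 * width]
    # The newest line meets the bottom template row, the oldest the top row.
    # x(n) is the rightmost pixel under the template, so each row of
    # coefficients is reversed before it is used as (a, b, c).
    bottom = fir3(template[2][::-1], pixels)
    middle = fir3(template[1][::-1], delayed1)
    top = fir3(template[0][::-1], delayed2)
    return [t + m + b for t, m, b in zip(top, middle, bottom)]
```

With this arrangement, the output at stream index `y * width + x` is the template response centered on pixel `(y - 1, x - 1)`, valid once two full lines and two pixels of the current line have arrived. For the 256×256 image used here, `width` would be 256.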

2 FPGA Design of FIR Digital Filter

When designing the circuit of three third-order FIR filters + delay units to implement the spatial filtering algorithm, the main issues to be considered are: how to shorten the critical path in hardware circuit design and improve the data throughput of the system. In order to solve these key issues in actual FPGA design, the following aspects are considered when designing the specific circuit:

2.1 FIR digital filter and pipeline structure

Pipelining technology is widely used in modern microprocessors, digital signal processors, and high-speed digital system designs. Its core design idea is to divide the logic operations performed in one cycle into several smaller operations and complete them in multiple high-speed clock cycles. The result of each small logic operation is stored in a register, synchronized by a high-speed clock, and used in the next pipeline unit. Therefore, it is one of the most commonly used technologies in speed optimization and can greatly improve the overall operating speed of digital systems.

The following analyzes the basic structure of the third-order FIR filter, the FIR structure after adopting pipeline technology, and the data broadcast structure of the FIR filter.

A third-order finite impulse response (FIR) digital filter can be expressed as follows:

y(n)=ax(n)+bx(n-1)+cx(n-2) (2)

The structure of this third-order FIR filter is shown in Figure 2.

Figure 2 Structure of the third-order FIR filter

In Figure 2, the critical path (minimum time to process a new sample point) of the FIR filter of this structure is limited by the time of one multiplier and two adders. If the sampling period is less than this minimum time, then the FIR filter of this structure cannot meet the requirements. At this time, pipelining technology should be considered. The critical path can be shortened by using pipeline technology, as shown in Figure 3.

Figure 3 Pipeline structure of the third-order FIR filter

In a pipelined FIR filter, when the current iteration is started, the adder at node 2 is completing the calculation of the previous iteration result. Therefore, the critical path is shortened from the time of one multiplier and two adders to the time of one multiplier and one adder.

When pipeline latches are inserted to shorten the critical path, they cannot be placed arbitrarily. The data flow graph must be cut so that all data crossing the cut flows in the same forward direction; only then does the added pipeline stage leave the function unchanged. In Figure 3, latches are added to both the upper and lower paths along the forward direction of the data flow, so the logic of the FIR filter is preserved. The speed (clock period) of a structure is usually defined by the longest path between any two latches, between an input and a latch, between a latch and an output, or between an input and an output. Pipeline latches effectively shorten this longest path.
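The effect of the pipeline latches can be checked with a small cycle-by-cycle model. The sketch below assumes one particular latch placement (one latch after the first adder and one on the path feeding the third multiplier); Figure 3 defines the placement actually used, and the function names are hypothetical.

```python
def fir_direct(coeffs, samples):
    """Unpipelined filter: y(n) = a*x(n) + b*x(n-1) + c*x(n-2)."""
    a, b, c = coeffs
    x1 = x2 = 0
    out = []
    for x in samples:
        # Longest combinational path: one multiplier and two adders.
        out.append(a * x + b * x1 + c * x2)
        x2, x1 = x1, x
    return out


def fir_pipelined(coeffs, samples):
    """Same filter with latches on both forward paths (assumed placement)."""
    a, b, c = coeffs
    x1 = x2 = 0      # delay line holding x(n-1) and x(n-2)
    sum_latch = 0    # latched a*x + b*x1 from the previous cycle
    x2_latch = 0     # latched x(n-2) from the previous cycle
    out = []
    for x in samples:
        # Second stage reads only latched values: one multiplier, one adder.
        out.append(sum_latch + c * x2_latch)
        # First stage: one multiplier level and one adder, then latch.
        sum_latch = a * x + b * x1
        x2_latch = x2
        x2, x1 = x1, x
    return out


samples = [1, 2, 3, 4, 5]
coeffs = (2, 3, 5)
direct = fir_direct(coeffs, samples)
pipelined = fir_pipelined(coeffs, samples)
```

The pipelined output is the direct output delayed by one clock cycle. The function is unchanged because both paths into the second adder were latched; latching only one of them would combine values from different iterations.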

In addition to the two FIR filter structures above, there is also the data broadcast structure, which shortens the critical path by transposing the filter structure and does not need any additional pipeline latches. The transformation is: interchange the input and output; reverse the direction of the signal flow; replace each adder with a branch node, and each branch node with an adder. The data broadcast structure FIR digital filter is shown in Figure 4.

Figure 4 FIR digital filter based on the data broadcast structure

In this structure, the input data is not stored in a delay line but broadcast to all multipliers at the same time. The critical path of this structure is the same as that of the FIR filter with pipeline latches in Figure 3, that is, one multiplier and one adder. However, no additional shift register is required for the input, and no additional pipeline stage is required for the sum of partial products to achieve a high throughput rate. This is the advantage of the data broadcast structure.
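A minimal software model of the data broadcast (transposed) structure is given below for comparison with the direct structure. The register names `r1` and `r2` and the function names are hypothetical; the sketch only illustrates that the two structures compute the same output sequence.

```python
def fir_direct(coeffs, samples):
    """Direct structure: the input is stored in a delay line."""
    a, b, c = coeffs
    x1 = x2 = 0
    out = []
    for x in samples:
        out.append(a * x + b * x1 + c * x2)
        x2, x1 = x1, x
    return out


def fir_broadcast(coeffs, samples):
    """Data broadcast structure: registers sit between the adders."""
    a, b, c = coeffs
    r1 = r2 = 0  # registers holding partial sums, not input samples
    out = []
    for x in samples:
        # The same x feeds all three multipliers in this cycle.
        out.append(a * x + r1)          # one multiplier and one adder
        # Both registers update together; b*x + r2 uses the old r2.
        r1, r2 = b * x + r2, c * x
    return out


samples = [4, -1, 7, 0, 3, 9]
coeffs = (2, 3, 5)
```

Every register-to-register path contains at most one multiplier and one adder, and unlike the pipelined structure the output is not delayed by an extra cycle.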

This article adopts the three different FIR digital filter structures introduced above when designing the spatial domain filtering algorithm circuit.

2.2 Hardware Design of Multiplier Module

From the template operation expression of formula (1) and the FIR filter expression of formula (2), it can be seen that there is another important link in completing the template operation and realizing FIR digital filtering, which is the multiplication operation. The multiplier module is one of the key modules that affects the operation speed of the spatial domain filtering algorithm.

The multiplication operation can be divided into two steps: generating all the partial products, and adding them together. A fast multiplier circuit must therefore improve both steps: the number of partial products should be reduced, and the accumulation speed of the partial product summation array should be increased. For this reason, the multiplier circuit in this design uses the radix-4 Booth algorithm to reduce the number of partial products, and a Wallace tree to reduce the carry propagation delay of the partial product addition array, which speeds up the entire addition array.

The basic principle of the radix-4 Booth algorithm is to encode the multiplier and generate partial products according to an encoding table, considering only 3 bits at a time: the current bit, the adjacent high bit, and the adjacent low bit. The Wallace tree is relatively regular and easy to lay out and route. It does not add the partial products one after another; instead, it adds and merges the bits of equal weight across the partial products. A full adder is usually used to add bits of the same weight. With full adders, the number of partial product vectors is reduced by a ratio of 3:2 at each layer of the Wallace tree. Two full adders can also be combined to obtain a reduction ratio of 4:2. In this article, a 3:2 counter (full adder) is used to reduce the partial products. When the number of partial products is large, the Wallace tree therefore reduces them very quickly.
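The two ideas can be illustrated with a short bit-level model. This is a sketch under stated assumptions, not the multiplier circuit of this paper: the function names are hypothetical, the operand width is assumed to be an even number of bits in two's complement, and Python integers stand in for the partial product vectors.

```python
def booth_radix4_digits(multiplier, bits):
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    Each digit is in {-2, -1, 0, 1, 2} and digit k has weight 4**k.
    bits must be even (an assumption of this sketch).
    """
    if bits % 2:
        raise ValueError("bits must be even")
    pattern = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    shifted = pattern << 1                    # implicit 0 below bit 0
    digits = []
    for i in range(0, bits, 2):
        low = (shifted >> i) & 1          # adjacent low bit
        cur = (shifted >> (i + 1)) & 1    # current bit
        high = (shifted >> (i + 2)) & 1   # adjacent high bit
        digits.append(-2 * high + cur + low)
    return digits


def compress_3_to_2(x, y, z):
    """One layer of full adders: three vectors in, sum and carry out."""
    partial_sum = x ^ y ^ z
    carry = ((x & y) | (x & z) | (y & z)) << 1
    return partial_sum, carry


def wallace_reduce(terms):
    """Apply 3:2 compression layer by layer until two vectors remain."""
    terms = list(terms)
    while len(terms) > 2:
        next_layer = []
        for i in range(0, len(terms) - 2, 3):
            next_layer.extend(compress_3_to_2(*terms[i:i + 3]))
        # Vectors that do not fill a group of three pass through unchanged.
        next_layer.extend(terms[len(terms) - len(terms) % 3:])
        terms = next_layer
    return terms


def booth_wallace_multiply(multiplicand, multiplier, bits=8):
    """Multiply using Booth partial products and carry-save reduction."""
    digits = booth_radix4_digits(multiplier, bits)
    partial_products = [(d * multiplicand) << (2 * k)
                        for k, d in enumerate(digits)]
    # A single carry-propagate addition finishes the product.
    return sum(wallace_reduce(partial_products))
```

For an 8-bit multiplier the recoding yields four partial products instead of eight, and the carry-save layers postpone carry propagation to one final addition.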

3 Simulation synthesis results

Section 2 described how the high-speed FIR digital filter for the digital image spatial domain filtering algorithm was designed and implemented on the FPGA platform, where the main considerations were shortening the critical path and improving the data throughput. This section compares the simulation and synthesis results. The test image is a 256×256, 8-bit grayscale image. The design software is XILINX's ISE integrated development environment, the simulation tool is Modelsim SE 5.8b, the synthesis tool is XST, the synthesis software included with ISE, and the target chip is XILINX's XC2V1000. The simulation and synthesis results show that the designed circuit fully meets the requirements.

The following table compares the synthesis results of the three circuit structures that implement the spatial domain filtering algorithm using the different FIR filter structures described above.

From Table 1, we can see that from the perspective of resource usage, structure 3 has the largest number of equivalent gates, while structure 2 has the least. From the perspective of delay/maximum frequency, we can see that structure 1 is the best.

The structural delay comparison data of the three structures is shown in Table 2.

It can be seen from Table 1

4 Conclusion

This paper discusses the digital image spatial domain filtering algorithm and the basic design method of FIR filter. Based on the critical path analysis, pipeline design is introduced to improve the operation speed. Three design structures of the filter are proposed and the design process of the filter is given. It can be seen from the simulation and synthesis results that the hardware resources are effectively saved, the hardware volume is greatly reduced, and the reliability of the system is increased.

Keywords: FPGA
