0 Introduction
Digital filters play an important role in speech and image processing, pattern recognition, radar signal processing, spectrum analysis, and other applications. They avoid problems such as temperature drift and component noise that are inherent to analog filters, and they offer higher accuracy, better stability, smaller size, and greater flexibility, so they are widely used. In acoustic well logging, the signal usually has to be filtered accurately and under strict real-time constraints. This paper uses Matlab as an auxiliary design tool to design a high-order, fast, FPGA-based digital filter that meets these logging requirements.
1 Linear Phase FIR Filter Structure
Digital filters can be classified in many ways. In terms of the unit impulse response, they are divided into finite impulse response (FIR) filters and infinite impulse response (IIR) filters. Compared with IIR filters, FIR filters can be designed with exactly linear phase, and their structure remains stable after the coefficients are quantized. For acoustic logging, where the acoustic signals must be processed with linear phase, FIR filters are therefore the first choice.
In the time domain, the FIR filter output is the linear convolution of the input signal with the unit impulse response. Its difference equation is:
$$y(n) = \sum_{k=0}^{N-1} h(k)\,x(n-k) \qquad (1)$$
Here, y(n) is the filter output, x(n) is the sampled input data, and h(n) are the filter tap coefficients. The structure is shown in Figure 1(a). An FIR filter of order N-1 is described by N coefficients and normally requires N multipliers and N-1 two-input adders. Because the multiplier coefficients are exactly the coefficients of the transfer function, this structure is called the direct structure.
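The direct structure can be sketched in a few lines of Python. This is only a behavioural model of the convolution sum, not the FPGA implementation; the function name `fir_direct` and the small example coefficients are illustrative assumptions, and samples before n = 0 are assumed to be zero.

```python
def fir_direct(h, x):
    """Direct-form FIR: y(n) = sum over k of h(k) * x(n - k).

    Samples before the start of x are treated as zero (an assumption).
    """
    n_taps = len(h)
    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(n_taps):          # one multiplication per tap
            if n - k >= 0:
                acc += h[k] * x[n - k]   # accumulate the tap products
        y.append(acc)
    return y


# Feeding a unit impulse returns the tap coefficients themselves.
h = [1, 2, 3, 2, 1]
impulse = [1, 0, 0, 0, 0, 0]
impulse_response = fir_direct(h, impulse)
```

Each inner-loop iteration corresponds to one multiplier in Figure 1(a), which is why the direct structure needs N multipliers for N coefficients.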
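For a linear-phase FIR filter with symmetric coefficients, h(k) = ±h(N-1-k), equation (1) can be rewritten as follows (shown for even N):
$$y(n) = \sum_{k=0}^{N/2-1} h(k)\,\bigl[x(n-k) \pm x(n-N+1+k)\bigr] \qquad (2)$$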
Figure 1(b) shows the improved structure for symmetric coefficients. It first adds (or, for coefficients of opposite sign, subtracts) each pair of samples whose coefficients are symmetric and then multiplies the result once. This halves the number of multipliers at the cost of an additional adder in front of each multiplier.
Figure 1 FIR filter structure
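The saving from the symmetric structure can be checked with a small behavioural model. The sketch below assumes an even number of taps and zero samples before n = 0; the names `fir_folded` and `fir_reference` are hypothetical, and the reference routine is included only so the two structures can be compared.

```python
def fir_reference(h, x):
    """Plain convolution sum, used only as a reference for comparison."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def fir_folded(h_half, x, antisymmetric=False):
    """Folded FIR for N = 2 * len(h_half) taps.

    Assumes h(k) = h(N-1-k), or h(k) = -h(N-1-k) when antisymmetric is True.
    """
    half = len(h_half)
    n_taps = 2 * half
    sign = -1 if antisymmetric else 1

    def sample(i):
        # Samples outside the record are treated as zero.
        return x[i] if 0 <= i < len(x) else 0

    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(half):                       # only N/2 multiplications
            pre_add = sample(n - k) + sign * sample(n - (n_taps - 1 - k))
            acc += h_half[k] * pre_add
        y.append(acc)
    return y


h_half = [1, -2, 3]
x = [4, -1, 0, 7, 2, 5, -3, 1]
folded = fir_folded(h_half, x)
direct = fir_reference(h_half + h_half[::-1], x)
```

Both lists hold the same output values, while the folded loop performs half as many multiplications per output sample.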
2 Design method and specifications
FDATool is the dedicated filter design and analysis tool in the Matlab Signal Processing Toolbox. Its main function is to derive filter coefficients from the design specifications. The key to designing a digital filter with FDATool is the choice of parameters such as filter type, window function, filter order, and cutoff frequency. The window function determines the stopband attenuation and the transition bandwidth. Commonly used windows include the rectangular, Hanning, Hamming, and Blackman windows. The rectangular and Hanning windows give too little stopband attenuation, while the Blackman window gives a wide transition band. The Hamming window is the best match for the design requirements here: its minimum stopband attenuation reaches 54.5 dB, and its normalized transition bandwidth is 3.11π/M (filter order N = 2M+1). For acoustic logging signals, the parameters listed in Table 1 are used in the design.
Table 1 Filter parameter selection
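For readers without access to FDATool, the window method itself can be sketched in plain Python. This is not the FDATool procedure: it is a textbook windowed-sinc band-pass design with a Hamming window. The 5 kHz to 18 kHz passband and the 128 taps come from the text, but the 64 kHz sampling rate is purely an assumed value, and the function names are hypothetical.

```python
import math


def hamming_bandpass(num_taps, f_low, f_high, fs):
    """Ideal band-pass impulse response multiplied by a Hamming window."""
    center = (num_taps - 1) / 2.0
    w_lo = 2.0 * math.pi * f_low / fs
    w_hi = 2.0 * math.pi * f_high / fs
    taps = []
    for n in range(num_taps):
        m = n - center
        if m == 0:
            ideal = (w_hi - w_lo) / math.pi
        else:
            ideal = (math.sin(w_hi * m) - math.sin(w_lo * m)) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def gain_db(taps, f, fs):
    """Magnitude response in dB at frequency f, evaluated directly from the taps."""
    w = 2.0 * math.pi * f / fs
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = -sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return 20.0 * math.log10(math.hypot(re, im) + 1e-300)


FS_ASSUMED = 64e3  # hypothetical sampling rate, not taken from Table 1
taps = hamming_bandpass(128, 5e3, 18e3, FS_ASSUMED)
mid_band_gain = gain_db(taps, 11e3, FS_ASSUMED)
low_stop_gain = gain_db(taps, 1e3, FS_ASSUMED)
```

The resulting taps are symmetric about the centre, which is what gives the filter its linear phase.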
Figure 2 shows the amplitude-frequency and phase-frequency responses of the filter. The phase is linear in the passband, the stopband attenuation exceeds 52 dB, and the transition bandwidth is 1.65 kHz. The toolbox can quantize the tap coefficients to fixed-point integers so that the 127th-order filter, which has 128 coefficients, can be implemented on the FPGA. For high-order filters, quantization has minimal effect on the stopband attenuation and the transition band.
Figure 2 Filter amplitude-frequency and phase-frequency response characteristic curves
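The coefficient quantization step can be illustrated as follows. The text does not state the word length or the scaling rule used in the toolbox, so the 16-bit width and the full-scale normalization below are assumptions, and `quantize_taps` is a hypothetical helper rather than a Matlab function.

```python
def quantize_taps(taps, bits):
    """Scale the taps so the largest magnitude fills the signed range, then round.

    Returns the integer taps and the scale factor that was applied.
    """
    full_scale = (1 << (bits - 1)) - 1            # e.g. 32767 for 16 bits
    scale = full_scale / max(abs(t) for t in taps)
    return [int(round(t * scale)) for t in taps], scale


# Small symmetric example; the 16-bit word length is an assumption.
example_taps = [0.5, -0.25, 0.125, -0.25, 0.5]
q_taps, scale = quantize_taps(example_taps, 16)
worst_error = max(abs(q / scale - t) for q, t in zip(q_taps, example_taps))
```

Rounding changes each tap by at most half a quantization step, and symmetric taps stay symmetric, so the linear-phase property survives quantization.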
3 FPGA-based filter design
The key to implementing an FIR filter on an FPGA is how to handle the multipliers, which consume a large share of the resources. Distributed arithmetic (DA) converts the multiplications into shift-and-add operations and thereby saves hardware resources. If H_k is the k-th filter coefficient, x_k(n) = x(n-k) is the k-th tap input at time n, and y(n) is the system response at time n, then equation (1) is equivalent to:
$$y(n) = \sum_{k=0}^{N-1} H_k\,x_k(n) \qquad (3)$$
If the input data are represented in two's complement form, then:
In the formula, x_kb(n) is a binary digit, either 0 or 1, and x_k0(n) is the sign bit: 1 means the sample is negative and 0 means it is non-negative. Substituting (4) into (3) yields:
Equation (5) is the distributed arithmetic form. The term in square brackets takes one bit position of every input sample, ANDs each of those bits with the corresponding filter tap coefficient H_0 to H_{N-1}, and sums the results. The power-of-two factor gives the bit weight of that partial sum. Multiplying an integer by 2^b is a left shift by b bits, which can be done in hardware wiring without using logic resources. The bracketed operation can therefore be implemented with a lookup table that is addressed by the same bit position of all input samples. This is lookup-table-based distributed arithmetic (LUT-DA).
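A small software model may make the lookup-table idea concrete. The sketch assumes integer samples in B-bit two's complement with bits numbered from the least significant end, so the sign bit is bit B-1 and carries weight -2^(B-1); this indexing is an assumption and may differ from the convention of equations (4) and (5). The names `build_da_lut` and `da_dot` are hypothetical.

```python
def build_da_lut(coeffs):
    """Entry at address a holds the sum of coeffs[k] for every k whose address bit is 1."""
    n = len(coeffs)
    return [sum(coeffs[k] for k in range(n) if (addr >> k) & 1)
            for addr in range(1 << n)]


def da_dot(coeffs, samples, bits):
    """Bit-serial distributed arithmetic for sum over k of coeffs[k] * samples[k]."""
    lut = build_da_lut(coeffs)
    mask = (1 << bits) - 1
    acc = 0
    for b in range(bits):
        # Address = bit b of every sample, one address line per tap.
        addr = 0
        for k, s in enumerate(samples):
            addr |= (((s & mask) >> b) & 1) << k
        partial = lut[addr] << b                     # weight 2^b is a plain shift
        # The sign bit (assumed to be bit B-1) has negative weight.
        acc += -partial if b == bits - 1 else partial
    return acc


coeffs = [3, -5, 7, 2]
samples = [-8, 7, -1, 4]          # each fits in 4-bit two's complement
result = da_dot(coeffs, samples, 4)
expected = sum(c * s for c, s in zip(coeffs, samples))
```

The loop uses only table lookups, shifts, and additions, and `result` equals the ordinary sum of products in `expected`. The table has 2^N entries for N taps, which is the growth discussed next.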
The lookup table size of the LUT-DA algorithm is B·2^N bits, where B is the bit width of the input data and N is the filter order. The table size doubles with every additional tap; for B = 16 and N = 128 it is 16·2^128 bits, far beyond any practical memory. Splitting the lookup table into multiple sub-tables solves this problem and leads to the serial and parallel LUT-DA algorithms, but both have shortcomings. The serial structure needs more than B clock cycles to produce one output. The parallel structure produces one output per clock cycle but requires B identical copies of the LUT, which increases the hardware overhead.
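The effect of partitioning can be put into numbers with a short calculation. The sketch simply applies the B·2^N figure quoted above to each sub-table; the choice of 4 taps per sub-table is an illustrative assumption, and the helper names are hypothetical. The adders needed to combine the sub-table outputs are not counted.

```python
def lut_size_bits(input_width, num_taps):
    """Single-table size using the B * 2^N figure quoted in the text."""
    return input_width * 2 ** num_taps


def partitioned_lut_size_bits(input_width, num_taps, taps_per_table):
    """Total size when the taps are split into sub-tables of at most taps_per_table taps."""
    full_tables, leftover = divmod(num_taps, taps_per_table)
    total = full_tables * lut_size_bits(input_width, taps_per_table)
    if leftover:
        total += lut_size_bits(input_width, leftover)   # one smaller final table
    return total


single = lut_size_bits(16, 128)                      # 16 * 2^128 bits
split = partitioned_lut_size_bits(16, 128, 4)        # 32 sub-tables of 4 taps each
```

Under these assumptions the partitioned tables total 8192 bits, compared with 16·2^128 bits for a single table.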
To balance speed and area, this paper designs a CSD-DA algorithm based on the DA principle. First, each fixed coefficient H_k in equation (3) is expanded in powers of 2:
$$H_k = S_k \sum_{b} H_{kb}\,2^{b} \qquad (6)$$
Substituting this into equation (3) and swapping the order of the shift and the accumulation gives:
$$y(n) = \sum_{b} 2^{b} \sum_{k=0}^{N-1} s'_{kb}\,x_k(n) \qquad (7)$$
Here, H_kb is a weight coefficient with value 0 or 1; S_k is 1 if H_k is positive and -1 if H_k is negative; and s'_kb takes the value 0, -1, or 1. After the expansion in equation (7), all multiplications become shift-and-add operations, and terms with zero weight can be dropped without any computation. To reduce the number of non-zero entries in the H_kb array further, H_k can be encoded in canonic signed digit (CSD) form: starting from the least significant bit of the binary code, every run of two or more consecutive 1s is replaced by 10···0(-1), where the digit -1 (usually written as a 1 with an overbar) carries a negative weight. Because no two adjacent digits of a CSD code are both non-zero, at most about half of its digits can be non-zero. On average, about 1/3 of the digits in a CSD code are non-zero, compared with about 1/2 of the bits in a two's complement code, a reduction of about 1/3. For example, let h = (15)_10 = (01111)_2, so that y = hx = x·(2^3 + 2^2 + 2^1 + 2^0). If (15)_10 is encoded as (1000(-1))_CSD, then y = x·(2^4 - 2^0). The binary encoding needs three adders, whereas the CSD encoding needs only one subtractor, so CSD encoding substantially reduces the hardware overhead. After CSD optimization, the number of non-zero s'_kb values is much smaller than the number of non-zero H_kb values.
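The CSD recoding rule and the h = 15 example can be reproduced with a short routine. This is a standard software recoding of a non-negative integer, offered as a sketch; the paper does not say how its coefficients were actually encoded, and the names `to_csd` and `csd_multiply` are hypothetical. The sign of a coefficient is assumed to be handled separately, as S_k is in the text.

```python
def to_csd(value):
    """Return the CSD digits (each -1, 0 or 1) of a non-negative integer, LSB first."""
    digits = []
    while value != 0:
        if value & 1:
            # value % 4 == 1 -> digit +1; value % 4 == 3 -> digit -1 (start of a run of 1s)
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def csd_multiply(x, digits):
    """Multiply x by the CSD-coded constant using only shifts, adds and subtracts."""
    acc = 0
    for b, d in enumerate(digits):
        if d == 1:
            acc += x << b
        elif d == -1:
            acc -= x << b
    return acc


digits_15 = to_csd(15)              # LSB first: -1, 0, 0, 0, 1  i.e. 2^4 - 2^0
product = csd_multiply(7, digits_15)
nonzero_count = sum(1 for d in digits_15 if d != 0)
```

For 15 the recoding leaves two non-zero digits instead of four, so the constant multiplication needs a single subtraction, matching the example above.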
For linear-phase FIR filters with symmetric coefficients, the structure shown in Figure 3 can be used to reduce the number of multiplication units. Because every multiplication is converted into a chain of additions and subtractions, the critical path becomes long and limits the system clock rate. Inserting pipeline registers shortens the critical path and raises the maximum operating frequency. For a given bit position b, the number of non-zero s'_kb values varies, so the pipeline stages can be partitioned flexibly according to s'_kb: the longer the path, the more pipeline registers are inserted. To prevent intermediate results from overflowing, the registers must be given extra bits. For signed numbers, the bit width is M + log2(N) - 1, where M is the bit width of the upper accumulator and N is the filter order.
Figure 3 Local structure of pipeline CSD-DA algorithm
As the pipelined CSD-DA structure in Figure 3 shows, all multiplications are converted into shift-and-add operations, the shifts are implemented by hardware wiring, and the whole structure is divided into reasonable pipeline stages.
Table 2 lists the synthesis results of filters with different structures. The parallel structure is the worst: it occupies more resources and is slow. The serial LUT-DA structure occupies fewer resources and reaches a high maximum operating frequency, but because it is serial it cannot complete the filtering of one sample in one clock cycle. The pipelined CSD-DA structure has clear advantages in both speed and area. With a 75 MHz working clock it produces one output per clock cycle, so a single-channel signal of 330 samples is processed in only 4.4 μs, which meets the real-time requirements of well logging.
Table 2 Synthesis results of filters
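The 4.4 μs processing time follows from one output per clock cycle at 75 MHz. A worked version, assuming the pipeline fill latency is negligible compared with 330 cycles:

```latex
t = \frac{330\ \text{samples}}{75 \times 10^{6}\ \text{samples/s}}
  = 4.4 \times 10^{-6}\ \text{s}
  = 4.4\ \mu\text{s}
```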
4 Results Analysis
To verify that the filter functions correctly, the design was simulated in ModelSim. With a noisy acoustic signal as the original waveform, the filtering result is shown in Figure 4.
Figure 4 Simulation results of the filter in Modelsim
Figure 5 shows the simulation results of the filter in Matlab, which are consistent with the ModelSim results. In the frequency domain, comparing Figure 5(a) with Figure 5(b) shows that the filtered waveform retains only the 5 kHz to 18 kHz part of the spectrum, which confirms that the pipelined CSD-DA digital filter design is correct.
Figure 5 Simulation results of the filter in Matlab
5 Conclusion
This article describes in detail how to design linear-phase FIR filters with Matlab tools and presents a pipelined CSD-DA structure for acoustic signals that outperforms traditional structures in both speed and area. The soundness and correctness of the design are verified by simulation. Note, however, that this structure is only suitable when the filter coefficients are fixed: changing the coefficients requires repeating the CSD encoding and the pipeline partitioning.