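LTE [1] (Long Term Evolution) is the long-term evolution program for UMTS technology launched by 3GPP. LTE offers high data rates, low latency, packet-based transmission, wide-area coverage, and backward compatibility [2], which have made it stand out among the various "quasi-4G" standards as the most competitive and commercially promising. Operators have widely chosen LTE, pointing the way for technical development in the global mobile communications industry. Equipment manufacturers, including leading vendors such as Huawei, Nortel, NEC, and Datang, have also increased their investment in LTE, which pushes LTE forward and makes its commercial deployment more eagerly anticipated than that of competing technologies.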
Turbo codes [3] were selected as one of the channel coding schemes of the LTE standard because their error-correction performance approaches the Shannon limit [4]. An integrated FPGA design of the Turbo codec can speed up the commercialization of LTE and has broad application prospects. Different channel environments impose different requirements on information reliability and real-time data delivery, and practical systems must trade one off against the other. A hardware Turbo codec whose error-correction performance and decoding delay can be configured flexibly therefore has greater commercial value.
The power-optimized, performance-enhanced Stratix III series from Altera uses the same FPGA architecture as the Stratix II series, including high-performance adaptive logic modules (ALMs) and support for more than 40 I/O interface standards, with a focus on flexibility and signal integrity. Together, Stratix III FPGAs and the Quartus II software give engineers design methods that further improve performance and efficiency [5]. Stratix III L devices provide more logic cells, which makes them well suited to implementing a Turbo codec with configurable frame length.
The error performance of a Turbo code depends largely on the information frame length: the longer the frame, the better the decoding performance, at the cost of greater decoding delay. On this basis, this design proposes an FPGA implementation of a Turbo codec with configurable frame length, describes the working principle of the interleaver in detail, and analyzes the timing simulation results and the functional implementation. It provides a reference for the development of dedicated Turbo codec chips for the LTE standard.
1 System Architecture of Turbo Codec with Configurable Frame Length
In the LTE standard, channel coding mainly adopts two schemes: tail-biting convolutional coding and Turbo coding [4]. The Turbo code has a code rate of 1/3 and consists of two recursive systematic convolutional (RSC) constituent encoders with generator polynomials (13, 15) in octal and a QPP (quadratic permutation polynomial) pseudo-random interleaver, arranged in a typical PCCC (parallel concatenated convolutional code) structure.
In the Turbo encoding and decoding structure, the information frame length is determined by the interleaving depth. If the interleaver can load a different interleaving pattern for each frame length parameter and set the corresponding parameters of the other modules, the frame length becomes configurable. This leads to the following design for the configurable Turbo codec. Before encoding and decoding, the information frame length is entered through the keyboard circuit, and the system initializes the codec accordingly: it sets the depth of the memories in the circuit, calculates and stores the interleaving pattern, and displays the frame length on the LCD at the same time. At the end of initialization a status flag is output and the codec enters the ready state. Encoding and decoding start as soon as data arrives. The structure of the Turbo codec system is shown in Figure 1.
In the Turbo codec of Figure 1, all parameters that depend on the message length, including memory depth and counter period, are set as input variables so that they are easy to configure.
2 Design and Implementation of FPGA Functional Modules
2.1 Design of Interleaving Module
The interleaver is one of the main components of the Turbo codec, and its ability to generate the interleaving pattern that matches the frame length parameter is the key to this design. The LTE standard specifies a QPP pseudo-random interleaver with interleaving lengths from 40 to 6144. The scheme generates a different interleaving pattern for each frame length, which improves the Hamming distance and weight distribution of the codewords. Assume that the bit sequence input to the interleaver is d0, d1, ..., dK-1, where K is the frame length of the information sequence, and that the interleaver outputs the sequence d′0, d′1, ..., d′K-1. Then:
$$d'_i = d_{\Pi(i)}, \qquad \Pi(i) = (f_1 i + f_2 i^2) \bmod K, \qquad i = 0, 1, \ldots, K-1$$
The parameters f1 and f2 depend on the interleaving length K. The specific values can be found in reference [4].
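As an illustration, the address mapping can be modeled in a few lines of Python. This is a behavioral sketch rather than the FPGA implementation. It assumes the QPP form Π(i) = (f1·i + f2·i²) mod K defined in 3GPP TS 36.212, and the pair f1 = 3, f2 = 10 for K = 40 is an assumption taken from that specification's parameter table. The function name qpp_addresses is illustrative.

```python
def qpp_addresses(K, f1, f2):
    """Return the QPP interleaving pattern [Pi(0), ..., Pi(K-1)]."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]

# K = 40 with f1 = 3, f2 = 10 (assumed from the TS 36.212 parameter table).
pattern = qpp_addresses(40, 3, 10)

# A valid interleaving pattern visits every address exactly once.
is_permutation = sorted(pattern) == list(range(40))
```

The permutation check is a useful sanity test whenever a new (K, f1, f2) entry is added to the lookup table.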
Traditional FPGA interleaver designs generally rely on offline software computation. The interleaving pattern for a fixed frame length is pre-calculated according to the communication protocol, and a memory initialization file (.mif or .hex format) is generated and loaded into ROM [6]. This reduces hardware complexity, but the encoding frame length cannot be reconfigured in the field, so the design lacks flexibility and generality. In this design the interleaving algorithm is therefore integrated into the FPGA. When the information frame length needs to change, the interleaver is started to recalculate the interleaving addresses and store them in RAM. The hardware block diagram of the QPP interleaver is shown in Figure 2.
In Figure 2, during system initialization the keyboard circuit captures the information frame length K. After debouncing, K is sent along one path to the LCD display module and along another to the f1 and f2 unit. The values of f1 and f2 are obtained by table lookup and passed to the interleaving algorithm module.
The interleaving algorithm unit is the core of the interleaver design. Its main function is to calculate the interleaving addresses from the parameters K, f1 and f2 according to the LTE standard, under the control of the timing control module. Because a remainder operation with an arbitrary integer divisor cannot be synthesized directly on the FPGA, it is converted into a fixed number of iterated additions and subtractions. Under the control of the clock management module, the addresses are calculated on a short-period clock and output on a long-period clock, which ensures that the interleaved data is read correctly.
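One common way to avoid a general remainder operation is to compute the QPP addresses recursively, so that each step needs only an addition, a comparison and at most one subtraction. The Python sketch below models that idea. It is not necessarily the exact scheme used in this design, and the names reduce_mod and qpp_addresses_addsub are hypothetical. It relies on the identity Π(i+1) = Π(i) + g(i) mod K with g(i+1) = g(i) + 2·f2 mod K and g(0) = f1 + f2 mod K.

```python
def reduce_mod(x, K):
    """Remainder by repeated subtraction (used once per constant, x >= 0)."""
    while x >= K:
        x -= K
    return x

def qpp_addresses_addsub(K, f1, f2):
    """QPP pattern computed with additions, comparisons and subtractions only."""
    step = reduce_mod(f2 + f2, K)   # 2*f2 mod K, the constant increment of g
    g = reduce_mod(f1 + f2, K)      # g(0) = Pi(1) - Pi(0) mod K
    addr = 0                        # Pi(0) = 0
    pattern = []
    for _ in range(K):
        pattern.append(addr)
        addr += g                   # Pi(i+1) = Pi(i) + g(i)
        if addr >= K:               # both operands are < K, so one subtraction suffices
            addr -= K
        g += step                   # g(i+1) = g(i) + 2*f2
        if g >= K:
            g -= K
    return pattern

# Cross-check against the direct formula for K = 40, f1 = 3, f2 = 10 (assumed values).
direct = [(3 * i + 10 * i * i) % 40 for i in range(40)]
recursive = qpp_addresses_addsub(40, 3, 10)
```

Because every intermediate value stays below 2K, the accumulators need only one bit more than the address width.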
The write address is generated while the interleaving address is calculated, and the interleaving addresses are stored sequentially in the dual-port RAM, which completes the main part of the interleaver. A handshake signal is then sent to start Turbo encoding and decoding.
The interleaving algorithm module does not need to run for every frame. The interleaving addresses are loaded only during initialization, so the interleaving algorithm and the encoder work in a time-shared manner. To use the interleaver, a sequential address is applied to the read address port of the dual-port RAM, which returns the QPP pseudo-random interleaving address for the configured frame length without adding decoding delay. Once the interleaving pattern is available, interleaving and deinterleaving can be carried out [7].
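The table-lookup use of the stored pattern can be sketched as follows. The list address_ram stands in for the dual-port RAM, and the values K = 40, f1 = 3, f2 = 10 are assumed for illustration. Interleaving reads the data at the looked-up address, and deinterleaving writes to it. All names are illustrative.

```python
K, F1, F2 = 40, 3, 10  # assumed example parameters

# Filled once during initialization, then only read.
address_ram = [(F1 * i + F2 * i * i) % K for i in range(K)]

def interleave(frame, ram):
    """Output position i takes the input stored at the interleaved address ram[i]."""
    return [frame[ram[i]] for i in range(len(ram))]

def deinterleave(frame, ram):
    """Inverse mapping: the value at position i is written back to address ram[i]."""
    out = [None] * len(ram)
    for i, addr in enumerate(ram):
        out[addr] = frame[i]
    return out

frame = list(range(100, 100 + K))
restored = deinterleave(interleave(frame, address_ram), address_ram)
```

Both directions use the same stored table, so no separate deinterleaving pattern has to be computed.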
2.2 Design of Turbo Code Encoder
With the interleaving module complete, the Turbo encoder is designed next. The encoder consists of RSC (recursive systematic convolutional) sub-encoders, the interleaver, a multiplexing circuit and supporting logic. The hardware block diagram is shown in Figure 3.
After system initialization, the interleaver holds the interleaving pattern for the configured frame length. The encoder first receives a frame of information and stores it in RAM, and a start signal then launches encoding. Under the control of the clock management and timing control modules, a counter generates sequential addresses, which are used to look up the interleaving addresses in the interleaver. The data is read from the information RAM with the sequential address and with the interleaving address, and each stream enters its RSC sub-encoder. At the same time, the multiplexing circuit performs parallel-to-serial conversion of the information bits and parity bits. After a frame has been encoded, the sub-encoders are reset to zero.
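The encoding flow can be modeled behaviorally as below. This is a sketch under stated assumptions, not the Verilog design: the feedback polynomial is taken as 1 + D² + D³ (13 octal) and the parity polynomial as 1 + D + D³ (15 octal), the QPP parameters f1 = 3, f2 = 10 for K = 40 are assumed, and the 12 tail bits are simply appended as (systematic, parity) pairs rather than in the exact multiplexing order of the standard. All function names are illustrative.

```python
def qpp_addresses(K, f1, f2):
    return [(f1 * i + f2 * i * i) % K for i in range(K)]

def rsc_encode(bits):
    """Return (parity bits, tail pairs) of one (13, 15) RSC sub-encoder."""
    s = [0, 0, 0]                       # shift register, s[0] is the newest bit
    parity = []
    for u in bits:
        a = u ^ s[1] ^ s[2]             # feedback: 1 + D^2 + D^3
        parity.append(a ^ s[0] ^ s[2])  # parity:   1 + D + D^3
        s = [a, s[0], s[1]]
    tail = []
    for _ in range(3):                  # trellis termination
        u = s[1] ^ s[2]                 # input that forces the feedback bit to 0
        tail.append((u, s[0] ^ s[2]))
        s = [0, s[0], s[1]]             # register drains to all zeros in 3 steps
    return parity, tail

def turbo_encode(bits, f1, f2):
    """Rate-1/3 PCCC encoding: 3K code bits plus 12 tail bits."""
    K = len(bits)
    table = qpp_addresses(K, f1, f2)
    p1, t1 = rsc_encode(bits)                                # sequential addresses
    p2, t2 = rsc_encode([bits[table[i]] for i in range(K)])  # interleaved addresses
    code = []
    for i in range(K):                  # parallel-to-serial multiplexing
        code += [bits[i], p1[i], p2[i]]
    for x, z in t1 + t2:
        code += [x, z]
    return code

message = [1, 0, 1, 1, 0] * 8           # K = 40
codeword = turbo_encode(message, 3, 10)
```

The codeword length is 3K + 12, which gives 396 and 1548 bits for frame lengths of 128 and 512.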
2.3 Design of Turbo Code Decoder
The hardware structure of the Turbo decoder is more complex than that of the encoder. The decoder structure, derived from the decoding principle and the interleaver implementation described above, is shown in Figure 4.
To save hardware resources, the Turbo decoder in this design reuses a single sub-decoder core for both decoding stages. When the core acts as sub-decoder 1, the information bits (ys) are written into memory sequentially and read out sequentially to the sub-decoder. L_a2 is written into memory with interleaved addresses and read out with sequential addresses as the a priori information of sub-decoder 1, and the parity input selects yp1. Sub-decoder 1 then performs SISO (soft-input soft-output) decoding on these three inputs to obtain new L_a2 and L_e. The core then acts as sub-decoder 2: ys is read out of memory with interleaved addresses, L_a2 is written into memory with sequential addresses and read out with interleaved addresses as the a priori information of sub-decoder 2, and the parity input selects yp2. Sub-decoder 2 performs SISO decoding on these three inputs to obtain new L_a2 and L_e, which completes one iteration. Once the iteration stopping criterion is met, L_e is deinterleaved and a hard decision is made to obtain the decoded sequence.
The sub-decoder uses the Max-Log-MAP decoding algorithm, which offers a compromise between complexity and performance. From the input information bits, parity bits and a priori information, and under the control of the timing control module, it calculates the branch transition metrics, forward state metrics, backward state metrics and log-likelihood ratios and stores each of them for use in the next decoding step.
Starting from the initial branch transition metrics, the relationship between the current and previous forward state metrics is read from the trellis diagram of the (13, 15) RSC code [7], and the current forward state metric is calculated recursively. To keep the values from overflowing, the metrics are normalized at every recursion step. The implementation block diagram is shown in Figure 5. The backward state metric uses a similar structure, with the recursion running in the reverse direction.
Following the definition of the log-likelihood ratio, the branch transition metric, forward state metric and backward state metric are substituted into the calculation formula [8] and combined. The minimum over the eight states of the "1" branches and the minimum over the eight states of the "0" branches are then taken, and their difference gives the log-likelihood ratio of the Max-Log-MAP algorithm. After several iterations a hard decision is made, and the deinterleaved output is the decoded sequence delivered to the destination.
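The metric calculations of one SISO pass can be sketched in Python as follows. This is a floating-point behavioral model under stated assumptions, not the fixed-point hardware. It uses the common maximum formulation of Max-Log-MAP, whereas a design that stores metrics with the opposite sign takes minima as described above. Soft inputs are treated as log-likelihood values in which a positive sign means bit 1, the trellis is left unterminated, and all function names are illustrative.

```python
NEG_INF = -1e9  # stands in for "state not reachable"

def build_trellis():
    """List (state, input bit, parity bit, next state) for the (13, 15) RSC."""
    branches = []
    for state in range(8):
        s0, s1, s2 = (state >> 2) & 1, (state >> 1) & 1, state & 1
        for u in (0, 1):
            a = u ^ s1 ^ s2                 # feedback 1 + D^2 + D^3 (13 octal)
            p = a ^ s0 ^ s2                 # parity   1 + D + D^3   (15 octal)
            branches.append((state, u, p, (a << 2) | (s0 << 1) | s1))
    return branches

TRELLIS = build_trellis()

def rsc_parity(bits):
    """Parity sequence of the RSC encoder, starting from state 0 (no tail)."""
    state, parity = 0, []
    for u in bits:
        _, _, p, state = TRELLIS[2 * state + u]
        parity.append(p)
    return parity

def max_log_map(ys, yp, la):
    """Return (a posteriori LLRs, extrinsic values L_e) for one SISO pass."""
    n = len(ys)

    def gamma(k, u, p):                     # branch transition metric
        x = 1 if u else -1
        z = 1 if p else -1
        return 0.5 * x * (ys[k] + la[k]) + 0.5 * z * yp[k]

    alpha = [[NEG_INF] * 8 for _ in range(n + 1)]
    alpha[0][0] = 0.0                       # encoder starts in state 0
    for k in range(n):                      # forward recursion
        row = [NEG_INF] * 8
        for s, u, p, ns in TRELLIS:
            row[ns] = max(row[ns], alpha[k][s] + gamma(k, u, p))
        top = max(row)
        alpha[k + 1] = [m - top for m in row]   # normalize to prevent overflow

    beta = [[0.0] * 8 for _ in range(n + 1)]    # all end states equally likely
    for k in range(n - 1, -1, -1):          # backward recursion
        row = [NEG_INF] * 8
        for s, u, p, ns in TRELLIS:
            row[s] = max(row[s], beta[k + 1][ns] + gamma(k, u, p))
        top = max(row)
        beta[k] = [m - top for m in row]

    llr, le = [], []
    for k in range(n):
        best = [NEG_INF, NEG_INF]           # best path metric for u = 0 and u = 1
        for s, u, p, ns in TRELLIS:
            m = alpha[k][s] + gamma(k, u, p) + beta[k + 1][ns]
            best[u] = max(best[u], m)
        llr.append(best[1] - best[0])
        le.append(llr[k] - ys[k] - la[k])   # remove systematic and a priori parts
    return llr, le

# Noise-free check: the hard decisions should reproduce the transmitted bits.
bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
ys = [4.0 if b else -4.0 for b in bits]
yp = [4.0 if p else -4.0 for p in rsc_parity(bits)]
llr, le = max_log_map(ys, yp, [0.0] * len(bits))
decoded = [1 if v > 0 else 0 for v in llr]
```

In an iterative decoder the extrinsic values would be interleaved or deinterleaved and fed back as the a priori input of the next pass.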
3 System Implementation and Simulation Results Analysis
The design was developed in Quartus II, with the Stratix III device EP3SL150F1152C2 as the target platform. The functional modules described above were modeled in Verilog HDL. After integrated debugging and compilation, the main hardware resource usage of the codec is as shown in Table 1.
Waveform files were created and timing simulation of the Turbo codec was performed. The simulation waveforms of the Turbo encoder for information frame lengths of 128 and 512 are shown in Figure 6(a) and Figure 6(b) respectively.
In Figure 6, each frame of the codeword sequence ends with 12 tail bits, which return the encoder registers to the all-zero state. Repeated verification and comparison with Matlab simulation data confirmed that the results are correct.
The encoded codewords are quantized, stored in ROM and supplied to the decoder for timing simulation. For information frame lengths of 128 and 512 (codeword lengths of 396 and 1548 respectively), the simulation waveforms of the Turbo decoder are shown in Figure 7(a) and Figure 7(b).
In Figure 7, the decoder first initializes the interleaving pattern according to the frame length setting. It then demultiplexes the received codeword into the systematic information sequence (ys), parity sequence 1 (yp1) and parity sequence 2 (yp2), which are fed into the sub-decoder together with the extrinsic information (L_all) for SISO decoding. After 6 iterations, a hard decision yields the decoding result (decoderout).
Repeated simulation with different information frame lengths confirmed that encoding and decoding work correctly. The design was then downloaded to the EP3SL150F1152C2, and a test window written in VC was used for testing. The results show that the design accepts the frame length from the external keyboard circuit, performs the interleaving calculation to obtain the interleaving pattern, and implements Turbo encoding and decoding correctly, which meets the design requirements.
This design takes LTE as its application background and implements a hardware Turbo codec whose frame length can be configured in the field according to the channel environment. The QPP interleaving algorithm is integrated into the FPGA, which exploits the device's high clock frequency and speed and reduces the need for peripheral interface circuits. The interleaving calculation runs during system initialization, before Turbo encoding and decoding start. The two operate in a coordinated, time-shared manner and cause no additional delay. The resulting Turbo codec is a general-purpose solution and provides a reference for the development and adoption of dedicated Turbo codec chips for the LTE standard.