Integrated Design of Turbo Codec under LTE Standard-EEWORLD

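LTE [1] (Long Term Evolution) is the long-term evolution program for UMTS technology launched by 3GPP. LTE offers notable advantages such as high data rates, low latency, packet-based transmission, wide-area coverage, and backward compatibility [2]. It has stood out among the various "quasi-4G" standards as the most competitive, with the greatest operational potential. Operators have widely chosen LTE, which has set the direction of technical development for the global mobile communications industry. Equipment manufacturers, including first-tier vendors such as Huawei, Nortel, NEC, and Datang, have also increased their investment in LTE. This strongly drives LTE forward and makes its commercial deployment more eagerly anticipated than that of competing technologies.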

Turbo codes [3] were selected as one of the channel coding schemes for the LTE standard because of their excellent error correction performance, which approaches the Shannon limit [4]. An integrated FPGA design of the Turbo codec can accelerate the commercialization of LTE and has broad application prospects. Different channel environments place different demands on information reliability and real-time performance, and practical systems must trade one off against the other. A hardware Turbo codec whose error correction performance and decoding delay can be configured flexibly therefore has greater commercial value.

The power-optimized, performance-enhanced Stratix III series from Altera uses the same FPGA architecture as the Stratix II series, including high-performance adaptive logic modules (ALMs) and support for more than 40 I/O interface standards, with good flexibility and signal integrity. Combined with the Quartus II software, Stratix III FPGAs give engineers design methods that further improve performance and efficiency [5]. Stratix III L devices provide more logic elements, which suits the FPGA design of a Turbo codec with configurable frame length.

The error performance of a Turbo code depends largely on the information frame length. The longer the frame, the better the decoding performance, but at the cost of a longer decoding delay. This design therefore proposes an FPGA implementation of a Turbo codec with configurable frame length. It describes the working principle of the interleaver in detail and analyzes the timing simulation results and the functional implementation, providing a reference for the development of dedicated Turbo codec chips under the LTE standard.

1 System Architecture of Turbo Codec with Configurable Frame Length

In the LTE standard, channel coding mainly adopts two schemes: tail-biting convolutional coding and Turbo coding [4]. The Turbo code has a code rate of 1/3 and uses a typical PCCC (parallel concatenated convolutional code) structure: two recursive systematic convolutional (RSC) component codes with generator polynomials (13, 15) in octal, joined by a QPP (quadratic permutation polynomial) pseudo-random interleaver.

In the Turbo encoding and decoding structure, the information frame length is determined by the interleaving depth. If the interleaver can load a different interleaving pattern for each frame length parameter and set the corresponding parameters of the other modules, the configurable design can be realized. This leads to the design idea of the configurable Turbo codec. Before encoding and decoding, the information frame length is entered through the keyboard circuit, and the system initializes the codec accordingly. Initialization mainly consists of setting the depth of the memories in the circuit, calculating and storing the interleaving pattern, and displaying the frame length on the LCD. At the end of initialization a status flag is output and the codec enters the ready state. As soon as data is input, the encoding and decoding process starts. The structure of the Turbo codec system is shown in Figure 1.

In the Turbo codec of Figure 1, all parameters related to the message length, including memory depth and counter period, are set as input variables for easy configuration.

2 Design and Implementation of FPGA Functional Modules

2.1 Design of Interleaving Module

The interleaver is one of the main components of the Turbo codec, and its ability to generate the interleaving pattern that matches the frame length parameter is the key to this design. The LTE standard specifies a QPP pseudo-random interleaver with an interleaving length of 40 to 6144. This scheme generates a different interleaving pattern for each frame length, which effectively improves the Hamming distance and weight distribution of the codewords. Assume that the bit sequence input to the interleaver is d0, d1, ..., dK-1, where K is the information frame length, and that the interleaver outputs the sequence d′0, d′1, ..., d′K-1. Then:

$$d'_i = d_{\Pi(i)}, \qquad \Pi(i) = (f_1 \cdot i + f_2 \cdot i^2) \bmod K, \qquad i = 0, 1, \ldots, K-1$$

The parameters f1 and f2 depend on the interleaving length K. The specific values can be found in reference [4].
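As a software reference for the address pattern, the QPP rule from the LTE specification, Π(i) = (f1·i + f2·i²) mod K, can be sketched in a few lines of Python. The function name `qpp_interleaver` is illustrative and not part of the original design. The parameters K = 40, f1 = 3, f2 = 10 are the entry for the smallest block size in the LTE parameter table.

```python
def qpp_interleaver(K, f1, f2):
    """Return the QPP address table: entry i holds Pi(i) = (f1*i + f2*i*i) mod K."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


# Smallest LTE block size: K = 40 with f1 = 3, f2 = 10.
table = qpp_interleaver(40, 3, 10)
print(table[:5])                         # first few interleaved addresses
print(sorted(table) == list(range(40)))  # a valid pattern visits every address once
```

A table like this is also a convenient golden reference when checking the addresses produced by the hardware interleaver in simulation.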

Traditional FPGA interleaver designs generally rely on offline software computation. The interleaving pattern for a fixed frame length is pre-calculated according to the communication protocol, and a memory initialization file (.mif or .hex format) is generated and loaded into ROM [6]. Although this reduces hardware complexity, the encoding frame length cannot be reconfigured in the field, so the approach lacks flexibility and generality. In this design the interleaving algorithm is therefore integrated into the FPGA. When the information frame length needs to change, the interleaver is started to recalculate the interleaving addresses and store them in RAM. The hardware block diagram of the QPP interleaver is shown in Figure 2.

In Figure 2, during the system initialization phase, the keyboard circuit captures the input information frame length K. After debouncing, K is sent along one path to the LCD display module and along another path to the f1 and f2 operation unit. The values of f1 and f2 are obtained by table lookup and passed to the interleaving algorithm module.

The interleaving algorithm unit is the core of the interleaver design. Its main function is to calculate the interleaving addresses from the parameters K, f1, and f2 according to the LTE standard, under the constraints of the timing control module. The remainder operation with an arbitrary integer divisor, which cannot be synthesized directly on the FPGA, is converted into a fixed number of add and subtract steps. Under the control of the clock management module, addresses are calculated on a short-period clock and output on a long-period clock, which ensures that the interleaved data is read correctly.
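One common way to remove the multiplier and the general remainder operation is to generate the addresses incrementally, since Π(i+1) − Π(i) = f1 + f2 + 2·f2·i (mod K). The sketch below illustrates that idea in Python. It is not necessarily the exact circuit used in this design, and the names `qpp_addresses_add_only` and `wrap` are hypothetical. It assumes 0 ≤ f1 < K and 0 ≤ f2 < K, so every sum stays below 2K and a single compare-and-subtract replaces each "mod K".

```python
def qpp_addresses_add_only(K, f1, f2):
    """Generate Pi(0..K-1) with additions, comparisons and subtractions only.

    Assumes 0 <= f1 < K and 0 <= f2 < K (an assumption of this sketch).
    """
    def wrap(x):
        # x is always below 2*K here, so one conditional subtraction is enough
        return x - K if x >= K else x

    addr = 0                # Pi(0) = 0
    delta = wrap(f1 + f2)   # Pi(1) - Pi(0) = f1 + f2
    step = wrap(f2 + f2)    # the difference itself grows by 2*f2 per address
    table = []
    for _ in range(K):
        table.append(addr)
        addr = wrap(addr + delta)
        delta = wrap(delta + step)
    return table


addresses = qpp_addresses_add_only(40, 3, 10)
print(addresses[:5])
```

Each loop iteration maps to two adders and two comparators in hardware, so one address can be produced per clock cycle.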

As each interleaving address is calculated, the matching write address is generated, and the interleaving addresses are stored sequentially in the dual-port RAM. This completes the main part of the interleaver. A handshake signal is then sent to start the Turbo encoding and decoding process.

The interleaving algorithm module does not need to run for every frame. The interleaving addresses are loaded only during the initialization phase, so the interleaving algorithm and the codec work in a time-shared manner. To use the interleaver, a sequential address is applied to the read address port of the dual-port RAM, which returns the QPP pseudo-random interleaving address for the configured frame length. This adds no decoding delay. Once the interleaving pattern is available, interleaving and deinterleaving can be carried out [7].
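The table-lookup use of the stored addresses can be modeled in software as follows. This is a minimal sketch in which a Python list stands in for the dual-port RAM, and the helper names `interleave` and `deinterleave` are illustrative. Interleaving reads the data at the stored addresses, and deinterleaving writes each element back to the address it came from.

```python
def interleave(data, table):
    """Read data at the interleaved addresses: out[i] = data[table[i]]."""
    return [data[addr] for addr in table]


def deinterleave(data, table):
    """Write each element back to its original address: out[table[i]] = data[i]."""
    out = [None] * len(data)
    for i, addr in enumerate(table):
        out[addr] = data[i]
    return out


# Address table filled once at initialization (K = 40, f1 = 3, f2 = 10).
K, f1, f2 = 40, 3, 10
table = [(f1 * i + f2 * i * i) % K for i in range(K)]

frame = list(range(100, 100 + K))
restored = deinterleave(interleave(frame, table), table)
print(restored == frame)
```

Because both directions use the same table, one RAM serves the interleaver and the deinterleaver, and only the roles of the read and write addresses are swapped.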

2.2 Design of Turbo Code Encoder

With the interleaving module complete, the FPGA design of the Turbo encoder is carried out. The Turbo encoder consists of RSC (recursive systematic convolutional) sub-encoders, an interleaver, a multiplexing circuit, and related logic. The hardware block diagram is shown in Figure 3.

After system initialization, the interleaver holds the interleaving pattern for the configured frame length. The encoder first receives one frame of information and stores it in RAM, then begins encoding when the start signal arrives. Under the control of the clock management module and the timing control module, a counter generates sequential addresses, which are used to access the interleaver and obtain the interleaving addresses. The information sequence is read from RAM with the sequential addresses and with the interleaving addresses, and each stream enters its RSC sub-encoder. At the same time, the multiplexing circuit performs parallel-to-serial conversion of the information bits and parity bits. After a frame has been encoded, the sub-encoders are reset to zero.
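The encoding flow can be illustrated with a bit-level software model. The sketch below assumes the usual reading of the (13, 15) octal generators, with feedback polynomial 1 + D² + D³ and feedforward polynomial 1 + D + D³, and three termination steps per sub-encoder. The function names and the order in which the tail bits are collected are assumptions of this sketch. The exact multiplexing order defined by the LTE specification is not reproduced here.

```python
def rsc_encode(bits):
    """Encode one frame with the 8-state RSC code (13, 15) in octal.

    Returns (parity, tail_systematic, tail_parity).
    """
    s1 = s2 = s3 = 0
    parity = []
    for c in bits:
        a = c ^ s2 ^ s3              # feedback taps: 1 + D^2 + D^3
        parity.append(a ^ s1 ^ s3)   # feedforward taps: 1 + D + D^3
        s1, s2, s3 = a, s1, s2
    tail_sys, tail_par = [], []
    for _ in range(3):
        c = s2 ^ s3                  # feeding back the register sum forces a = 0
        tail_sys.append(c)
        tail_par.append(s1 ^ s3)
        s1, s2, s3 = 0, s1, s2       # registers drain to the all-zero state
    return parity, tail_sys, tail_par


def turbo_encode(bits, table):
    """Rate-1/3 PCCC: systematic bits plus one parity stream per sub-encoder."""
    p1, t1s, t1p = rsc_encode(bits)
    p2, t2s, t2p = rsc_encode([bits[addr] for addr in table])
    return {
        "systematic": list(bits),
        "parity1": p1,
        "parity2": p2,
        "tail": t1s + t1p + t2s + t2p,   # 12 tail bits, order chosen for this sketch
    }


K, f1, f2 = 40, 3, 10
table = [(f1 * i + f2 * i * i) % K for i in range(K)]
frame = [i % 2 for i in range(K)]
code = turbo_encode(frame, table)
total_bits = sum(len(stream) for stream in code.values())
print(total_bits)   # 3*K + 12 coded bits for one frame
```

A model of this kind plays the same role as the Matlab reference data used to check the simulation waveforms.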

2.3 Design of Turbo Code Decoder

The hardware structure of the Turbo decoder is more complex than that of the encoder. Based on the decoding principle and the way the interleaver is implemented, the decoder structure is shown in Figure 4.

To save hardware resources, the Turbo decoder in this design uses a single sub-decoder core that is time-multiplexed. When the core acts as sub-decoder 1, the information bits are written into memory sequentially and read out sequentially to the sub-decoder. L_a2 is written to memory at interleaved addresses and read out at sequential addresses as the a priori information of sub-decoder 1, and the parity input selects yp1. Sub-decoder 1 performs SISO (soft-input soft-output) decoding on these three inputs to obtain new L_a2 and L_e. The core then acts as sub-decoder 2, and ys is read from memory at interleaved addresses. L_a2 is written to memory at sequential addresses and read out at interleaved addresses as the a priori information of sub-decoder 2, and the parity input selects yp2. Sub-decoder 2 performs SISO decoding on these three inputs to obtain new L_a2 and L_e, which completes one iteration. Once the iteration stopping criterion is met, L_e is deinterleaved and a hard decision is made to obtain the decoded sequence.

The sub-decoder uses the Max-Log-MAP decoding algorithm, which trades off complexity against performance. From the input information bits, parity bits, and a priori information, and under the management of the timing control module, the branch transition metrics, forward state metrics, backward state metrics, and log-likelihood ratios are calculated and stored for use in the next decoding pass.

Starting from the initialized branch transition metrics, the relationship between the current forward state metric and the previous forward state metric is read from the trellis diagram of the (13, 15) RSC code [7], and the current forward state metric is calculated recursively. To keep the values from overflowing, the metrics are normalized at every recursion step. The implementation block diagram is shown in Figure 5. The backward state metric has a similar computation structure, except that the recursion runs in the reverse direction.

Following the definition of the log-likelihood ratio, the branch transition metrics, forward state metrics, and backward state metrics are substituted into the calculation formula [8] and combined. The minimum over the eight states of the "1" paths and the minimum over the eight states of the "0" paths are then taken, and their difference gives the log-likelihood ratio of the Max-Log-MAP algorithm. After several iterations, a hard decision is made, and the deinterleaved output is the decoded sequence delivered to the destination.
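The effect of normalization on the metric range can be checked with a small integer model of the forward recursion. This is a sketch under stated assumptions: the state numbering, the feedback and feedforward taps assumed for (13, 15), the starting value of -128 for the non-zero states, and the convention that larger metrics are better are all choices of this example rather than details taken from Figure 5. Subtracting the largest metric at each step keeps every value within a fixed window, which is what allows a fixed word length in hardware.

```python
import random


def build_trellis():
    """Return edges (prev_state, input_bit, parity_bit, next_state) of the (13, 15) RSC."""
    edges = []
    for state in range(8):
        s1, s2, s3 = (state >> 2) & 1, (state >> 1) & 1, state & 1
        for u in (0, 1):
            a = u ^ s2 ^ s3                      # feedback 1 + D^2 + D^3
            p = a ^ s1 ^ s3                      # feedforward 1 + D + D^3
            edges.append((state, u, p, (a << 2) | (s1 << 1) | s2))
    return edges


def forward_metrics(gammas):
    """Normalized forward recursion; gammas[k][(u, p)] is an integer branch metric."""
    edges = build_trellis()
    alpha = [0] + [-128] * 7                     # encoder starts in state 0
    history = []
    for g in gammas:
        nxt = [None] * 8
        for s, u, p, t in edges:
            cand = alpha[s] + g[(u, p)]
            if nxt[t] is None or cand > nxt[t]:
                nxt[t] = cand                    # keep the better of the two incoming paths
        top = max(nxt)
        alpha = [v - top for v in nxt]           # normalize: best state metric becomes 0
        history.append(alpha)
    return history


G = 16                                           # largest branch metric magnitude in this demo
rng = random.Random(0)
gammas = [{(u, p): rng.randint(-G, G) for u in (0, 1) for p in (0, 1)}
          for _ in range(1000)]
history = forward_metrics(gammas)
print(min(min(a) for a in history[2:]))          # stays above -6 * G
```

Any state of this memory-3 code can be reached from any other in three steps, so after the first three steps the spread of the metrics is bounded by six times the largest branch metric, however long the frame is.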
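The three metrics and the final difference fit together as in the following floating-point sketch of one Max-Log-MAP pass over a single RSC code. It is a reference model under assumptions, not the fixed-point circuit of this design. The inputs are taken to be channel log-likelihood ratios in which a positive value favors bit 1, the trellis is left unterminated, and metrics are correlations, so the best path has the maximum metric. With distance-style metrics of opposite sign, the same step selects the minimum instead. All function and variable names are illustrative.

```python
NEG = -1e9   # stands in for "state not reachable"


def build_trellis():
    """Edges (prev_state, input_bit, parity_bit, next_state) of the (13, 15) RSC."""
    edges = []
    for state in range(8):
        s1, s2, s3 = (state >> 2) & 1, (state >> 1) & 1, state & 1
        for u in (0, 1):
            a = u ^ s2 ^ s3
            edges.append((state, u, a ^ s1 ^ s3, (a << 2) | (s1 << 1) | s2))
    return edges


def max_log_map(ys, yp, la):
    """Return (llr, extrinsic) for systematic LLRs ys, parity LLRs yp, a priori la."""
    edges, n = build_trellis(), len(ys)

    def gamma(k, u, p):                          # branch transition metric
        return 0.5 * ((2 * u - 1) * (ys[k] + la[k]) + (2 * p - 1) * yp[k])

    alpha = [[NEG] * 8 for _ in range(n + 1)]    # forward state metrics
    alpha[0][0] = 0.0
    for k in range(n):
        for s, u, p, t in edges:
            alpha[k + 1][t] = max(alpha[k + 1][t], alpha[k][s] + gamma(k, u, p))
        top = max(alpha[k + 1])
        alpha[k + 1] = [v - top for v in alpha[k + 1]]

    beta = [[NEG] * 8 for _ in range(n + 1)]     # backward state metrics
    beta[n] = [0.0] * 8                          # end state unknown in this sketch
    for k in range(n - 1, -1, -1):
        for s, u, p, t in edges:
            beta[k][s] = max(beta[k][s], gamma(k, u, p) + beta[k + 1][t])
        top = max(beta[k])
        beta[k] = [v - top for v in beta[k]]

    llr, ext = [], []
    for k in range(n):
        best = [NEG, NEG]                        # best path metric for u = 0 and u = 1
        for s, u, p, t in edges:
            best[u] = max(best[u], alpha[k][s] + gamma(k, u, p) + beta[k + 1][t])
        llr.append(best[1] - best[0])
        ext.append(llr[-1] - ys[k] - la[k])      # remove what the decoder was given
    return llr, ext


# Noiseless check: encode a short frame, map bits to +/-2.0, decode once.
bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
lookup = {(s, u): (p, t) for s, u, p, t in build_trellis()}
state, parity = 0, []
for b in bits:
    p, state = lookup[(state, b)]
    parity.append(p)
llr, ext = max_log_map([2.0 * (2 * b - 1) for b in bits],
                       [2.0 * (2 * p - 1) for p in parity],
                       [0.0] * len(bits))
decoded = [1 if v > 0 else 0 for v in llr]
print(decoded == bits)
```

In a full Turbo decoder the extrinsic output of one pass, after interleaving or deinterleaving, becomes the a priori input `la` of the next pass.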

3 System Implementation and Simulation Results Analysis

Using the Quartus II development tool, the Stratix III device EP3SL150F1152C2 is chosen as the target platform, and the functional modules above are described and modeled in Verilog HDL. After integrated debugging and compilation, the main hardware resource usage of the codec is obtained, as shown in Table 1.

Waveform files are created and timing simulation of the Turbo codec is performed. The simulation waveforms of the Turbo encoder with the information frame length configured as 128 and 512 are shown in Figure 6(a) and Figure 6(b), respectively.

In Figure 6, each frame of the codeword sequence ends with 12 tail bits that return the encoder registers to the all-zero state. After repeated verification and comparison with Matlab simulation data, the results are correct.

The encoded codewords are quantized, stored in ROM, and supplied to the decoder for timing simulation. With the information frame length configured as 128 and 512 (codeword sequence lengths of 396 and 1548, respectively), the simulation waveforms of the Turbo decoder are shown in Figure 7(a) and Figure 7(b).
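The codeword lengths quoted above follow from the rate-1/3 structure plus the 12 tail bits: each information bit produces three coded bits, so a frame of K bits yields 3K + 12 coded bits. A one-line check in Python (the function name is illustrative):

```python
def codeword_length(K, tail_bits=12):
    """Coded bits per frame: systematic + two parity streams + tail bits."""
    return 3 * K + tail_bits


for K in (128, 512):
    print(K, codeword_length(K))   # 128 -> 396, 512 -> 1548
```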

In Figure 7, the decoder first initializes the interleaving pattern according to the frame length setting. It then demultiplexes the received codeword into the information sequence (ys), parity bits 1 (yp1), and parity bits 2 (yp2), which are fed into the sub-decoder together with the extrinsic information (L_all) for SISO decoding. After 6 iterations, the decoding result (decoderout) is produced by hard decision.

Repeated simulation and verification with different information frame lengths showed that the encoding and decoding functions are implemented correctly. The design was then downloaded into the EP3SL150F1152C2, and a test window written with VC software was used for testing. The results show that the design can accept the frame length from the peripheral keyboard circuit, perform the interleaving calculation, obtain the interleaving pattern, and correctly carry out Turbo encoding and decoding, meeting the design requirements.

With LTE as the application background, this design implements a hardware Turbo codec whose frame length can be configured in the field according to the channel environment. Integrating the QPP interleaving algorithm into the FPGA exploits the device's high clock frequency and speed and reduces the need for peripheral interface circuits. The interleaving calculation runs during system initialization, before Turbo encoding and decoding start, so the two operate in a coordinated, time-shared manner without adding delay. The resulting Turbo codec is a general-purpose solution that provides a reference for the development and adoption of dedicated Turbo codec chips under the LTE standard.
