Design and Implementation of a Multi-Rate QC-LDPC Decoder

Abstract: Low-density parity-check (LDPC) codes are among the most effective error-control codes, and quasi-cyclic LDPC (QC-LDPC) codes are the most widely used variety. A general design method for a multi-rate QC-LDPC decoder is proposed, and the decoder is implemented and tested on an FPGA. The test results show that the multi-rate decoder supports at least three code rates while using no more resources than two single-rate decoders combined; with a 110 MHz working clock and the number of iterations fixed at 16, the decoder throughput stays at or above 110 Mb/s.

0 Introduction

LDPC codes were first proposed by Gallager in 1962 and can be regarded as linear block codes with a sparse check matrix. Since MacKay and Neal found that the performance of LDPC codes is very close to the Shannon limit, LDPC codes have attracted more and more attention. Based on the structural characteristics of quasi-cyclic LDPC (QC-LDPC) codes, this paper proposes a design method for a QC-LDPC decoder that supports multiple code rates, and designs and implements a universal QC-LDPC decoder that adaptively supports three different H matrices in real time.

1 Introduction to QC-LDPC Codes

The check matrix Hqc of a QC-LDPC code is composed of c × t blocks, each a circulant permutation matrix or a zero matrix, where c and t are integers and c < t. Replacing each circulant permutation matrix in the check matrix with its shift value yields a new matrix, called the basic matrix (more commonly, the base matrix). The basic matrix and the H matrix correspond one-to-one. The regular structure of QC-LDPC codes makes encoding easy to implement in hardware, so many standards that use LDPC codes adopt QC-LDPC codes.
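The one-to-one correspondence between the basic matrix and H can be illustrated with a short Python sketch. The conventions here are assumptions rather than something the text specifies: -1 marks a zero block, and a shift value s moves the 1 in each row of the identity matrix s columns to the right, modulo the block size. The function names and the toy matrix are hypothetical.

```python
def circulant(z, shift):
    """z-by-z identity matrix with the 1 in each row moved right by `shift` (mod z)."""
    return [[1 if col == (row + shift) % z else 0 for col in range(z)]
            for row in range(z)]


def expand_base_matrix(base, z):
    """Expand a matrix of shift values into a binary check matrix.

    Assumed convention: an entry of -1 stands for a z-by-z zero block.
    """
    h = []
    for base_row in base:
        blocks = [circulant(z, s) if s >= 0 else [[0] * z for _ in range(z)]
                  for s in base_row]
        for i in range(z):
            # Concatenate row i of every block in this block row.
            h.append([bit for block in blocks for bit in block[i]])
    return h


# Toy example: a 2 x 4 basic matrix with block size 3 (not one of the paper's codes).
toy_base = [[0, 1, -1, 2],
            [2, -1, 0, 1]]
H = expand_base_matrix(toy_base, z=3)
for row in H:
    print("".join(str(bit) for bit in row))
```

Each row of the expanded matrix contains exactly one 1 per non-zero block, which is what lets a decoder store only the shift values instead of the full H matrix.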

2 Introduction to the Decoding Algorithm

The decoder designed here uses the soft-decision offset min-sum algorithm. Offset min-sum refines the sum-product and min-sum algorithms, and combines low decoding complexity with excellent performance. To describe the algorithm, some symbols are defined first.

L(ci) denotes the original soft information of variable node i at the decoder input, L(rji) denotes the message passed from check node j to variable node i, and L(qij) denotes the message passed from variable node i to check node j. The meanings of αij and βi'j are given in formula (1):
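In the standard offset min-sum formulation, which the definitions here appear to follow, these two symbols are the sign and the magnitude of a variable-to-check message. The form below is that textbook definition, given as an assumption about what formula (1) states:

```latex
\alpha_{ij} = \operatorname{sign}\bigl(L(q_{ij})\bigr), \qquad
\beta_{ij} = \bigl|L(q_{ij})\bigr|, \qquad
\text{so that } L(q_{ij}) = \alpha_{ij}\,\beta_{ij}.
```

The primed index i' denotes a variable node other than i that is connected to the same check node j.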

The specific algorithm steps are as follows:

Initialize the original probability information of the codeword.

Step 1: Update the probability information of the check nodes (CNU, Check Node Update).

Step 2: Update the probability information of the variable nodes (VNU, Variable Node Update).

Also calculate L(Qi), the total soft information of variable node i.

A hard decision is made on L(Qi): if L(Qi) > 0, the bit is decided as 0; otherwise it is decided as 1. Check whether cH^T = 0 or the maximum number of iterations has been reached. If so, go to step 3; otherwise return to step 1.

Step 3: Output the judgment result.
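The steps above can be sketched in a few lines of Python. This is a minimal floating-point sketch of textbook offset min-sum, not the fixed-point hardware described in this paper. The function name, the dictionary-based message storage, and the (7,4) Hamming check matrix used for the demonstration are all assumptions chosen for illustration; every check node is assumed to have at least two neighbours.

```python
def offset_min_sum_decode(h, llr, offset, max_iters=16):
    """Offset min-sum decoding; a positive soft value means bit 0.

    h: check matrix as a list of 0/1 rows; llr: channel soft values L(c_i).
    Returns (hard_bits, iterations_used).
    """
    checks = [[i for i, bit in enumerate(row) if bit] for row in h]
    # Initialisation: every variable-to-check message starts as the channel value.
    q = {(i, j): llr[i] for j, cols in enumerate(checks) for i in cols}
    r = {}
    bits = [0 if v > 0 else 1 for v in llr]
    for iteration in range(1, max_iters + 1):
        # Step 1 (CNU): sign product and offset-corrected minimum of the other inputs.
        for j, cols in enumerate(checks):
            for i in cols:
                others = [q[(k, j)] for k in cols if k != i]
                sign = -1 if sum(v < 0 for v in others) % 2 else 1
                magnitude = max(min(abs(v) for v in others) - offset, 0.0)
                r[(j, i)] = sign * magnitude
        # Step 2 (VNU): total soft value, then extrinsic messages back to the checks.
        total = list(llr)
        for (j, i), value in r.items():
            total[i] += value
        for (i, j) in q:
            q[(i, j)] = total[i] - r[(j, i)]
        # Hard decision and syndrome check (c * H^T == 0).
        bits = [0 if v > 0 else 1 for v in total]
        if all(sum(bits[i] for i in cols) % 2 == 0 for cols in checks):
            return bits, iteration
    # Step 3: output the decision after the last allowed iteration.
    return bits, max_iters


# Demonstration on a (7,4) Hamming check matrix: all-zero codeword, bit 0 received wrong.
H_DEMO = [[1, 1, 0, 1, 1, 0, 0],
          [1, 0, 1, 1, 0, 1, 0],
          [0, 1, 1, 1, 0, 0, 1]]
decoded, used = offset_min_sum_decode(H_DEMO, [-1.0, 2, 2, 2, 2, 2, 2], offset=0.5)
print(decoded, used)
```

In this demonstration the two checks that involve bit 0 each send it a correction of +1.5, which outweighs the wrong channel value of -1, so the error is corrected in the first iteration.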

Through simulation, the fixed-point scheme chosen for the decoder input in this paper is as follows: the quantization width is 6 bits, of which 3 are integer bits and 2 are fractional bits (the remaining bit is presumably the sign bit).
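A quantizer of this shape might look like the following sketch. The 6-bit width and the 2 fractional bits come from the text; the two's-complement sign bit, round-to-nearest, and saturation behaviour are assumptions, and the function names are hypothetical.

```python
import math

TOTAL_BITS = 6                      # from the text
FRAC_BITS = 2                       # from the text
SCALE = 1 << FRAC_BITS              # one step is 1/4
Q_MAX = (1 << (TOTAL_BITS - 1)) - 1 # +31, i.e. +7.75 (assumes a two's-complement sign bit)
Q_MIN = -(1 << (TOTAL_BITS - 1))    # -32, i.e. -8.00


def quantize_llr(x):
    """Round a soft value to the nearest step and saturate; returns the integer code."""
    code = int(math.floor(x * SCALE + 0.5))
    return max(Q_MIN, min(Q_MAX, code))


def to_real(code):
    """Convert an integer code back to the real value it represents."""
    return code / SCALE


for value in (1.3, -0.1, 100.0, -100.0):
    code = quantize_llr(value)
    print(value, "->", code, "->", to_real(code))
```

Under these assumptions the representable range is -8.00 to +7.75 in steps of 0.25, and larger channel values simply saturate.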

3 Multi-rate LDPC decoder design

Three QC-LDPC codes are used as references, each with a code length of 8064 bits, with code rates of 7/8, 3/4, and 1/2. The optimal offset values (the offset in equation (3)) required by the offset min-sum decoding algorithm were obtained by simulation for each code rate: 1, 0.7, and 0.5, respectively. The expansion factor of all three QC-LDPC codes is 112.
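These parameters fix the size of each basic matrix. The sketch below works through the arithmetic, assuming a full-rank check matrix so that the number of block rows equals the number of block columns times (1 - rate); the function name is hypothetical.

```python
from fractions import Fraction

CODE_LENGTH = 8064        # bits, from the text
EXPANSION_FACTOR = 112    # from the text


def base_matrix_shape(rate):
    """Return (block rows, block columns), assuming a full-rank check matrix."""
    cols = CODE_LENGTH // EXPANSION_FACTOR
    rows = int(cols * (1 - rate))
    return rows, cols


rates = (Fraction(7, 8), Fraction(3, 4), Fraction(1, 2))
shapes = {str(rate): base_matrix_shape(rate) for rate in rates}
info_bits = {str(rate): int(CODE_LENGTH * rate) for rate in rates}
print(shapes)
print(info_bits)
```

All three codes share 72 block columns, which is what allows one set of input buffers and variable node circuits to serve every rate; only the number of block rows (9, 18, or 36) changes.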

The LDPC decoder implemented in this paper is based on a partially parallel decoding structure. The decoder performs double buffering of input and output to support continuous data processing. The overall structure is shown in Figure 1.

Figure 1 Overall structure of the decoder

Since LDPC codes with three different H matrices must be supported, a mode port tells the decoder which code the current data block belongs to. The input mode register drives a selector that picks the corresponding H matrix and uses it to configure the control and addressing module, which in turn selects the node RAMs to update and the sets of check node unit (CNU) and variable node unit (VNU) circuits to use.

Input data first enters the input buffer RAM group, which is divided into N blocks, one per column of the basic matrix; N is configurable and is 72 in this paper. Once a full frame (one code block) has been buffered, it is transferred to the node RAM group, which stores the intermediate messages during the iterative updates. Because many entries of the basic matrix are zero matrices, the number of node RAMs actually generated is much smaller than M × N, where M is the number of rows of the basic matrix.
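The saving can be made concrete with a small count. The sketch below assumes one node RAM per non-zero block of the basic matrix and uses -1 to mark a zero block; the toy matrix is hypothetical and far smaller than the paper's.

```python
def node_ram_count(base):
    """Number of non-zero blocks; assumes -1 marks a zero block."""
    return sum(1 for row in base for shift in row if shift >= 0)


# Toy 3 x 6 basic matrix (not one of the paper's codes).
toy_base = [[0, 5, -1, -1, 3, -1],
            [-1, 2, 7, -1, -1, 0],
            [4, -1, -1, 1, -1, 6]]
used = node_ram_count(toy_base)
full = len(toy_base) * len(toy_base[0])
print(f"{used} node RAMs instead of {full}")
```

Only half of the block positions need storage in this toy case; the sparser the basic matrix, the larger the saving.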

The CNU circuit updates the check node information, that is, it computes equation (3); its structure is shown in Figure 2(a). The VNU circuit updates the variable node information and computes the hard decision at the same time, that is, it computes equations (4) and (5); its structure is shown in Figure 2(b).

Figure 2 Structure of CNU and VNU circuits

The output buffer RAM group stores and outputs the decoding results. Like the input buffer, it uses ping-pong operation to support continuous input and output of data blocks. The control and addressing module is the core of the decoder: it provides the control signals and the RAM read and write addresses. The addressing module has two parts, a CNU address generation module and a VNU address generation module. The CNU address generator starts from the offset value of the block being processed (its cyclic shift in the basic matrix), while the VNU address generator counts from 0 to Z.
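The two addressing patterns can be sketched as follows. This assumes each node RAM has depth Z with addresses 0 to Z - 1, that messages are stored in variable node order, and that a shift value s connects check k of a block row to variable (k + s) mod Z of the block column; the function names are hypothetical.

```python
def cnu_addresses(shift, z):
    """Read order for a check node pass: start at the shift value, wrap modulo z."""
    return [(shift + k) % z for k in range(z)]


def vnu_addresses(z):
    """Read order for a variable node pass: plain linear order."""
    return list(range(z))


print(cnu_addresses(3, 5))
print(vnu_addresses(5))
```

Under these assumptions the cyclic shift never has to be applied to the data itself: starting the counter at the shift value delivers the messages to the check node circuits in the right order.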

Due to the use of input and output double buffering, there can be at most three data blocks in the decoder, and these three data blocks can be data blocks with different code rates, thus realizing the function of adaptive decoding of continuously input data blocks with different code rates.
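A simplified model of this behaviour treats the decoder as three stages (input buffer, decoding core, output buffer) and tags each frame with its mode. The sketch assumes every stage takes one time slot per frame, which the real hardware need not satisfy; the names and the mode labels are hypothetical.

```python
from collections import namedtuple

Frame = namedtuple("Frame", ["frame_id", "mode"])  # mode labels are hypothetical


def run_pipeline(frames):
    """Advance frames through input buffer -> decoding core -> output buffer.

    Returns one snapshot (input, core, output) per time slot.
    """
    stages = [None, None, None]
    snapshots = []
    for incoming in list(frames) + [None, None]:  # two extra slots drain the pipeline
        stages = [incoming, stages[0], stages[1]]
        snapshots.append(tuple(stages))
    return snapshots


frames = [Frame(0, "7/8"), Frame(1, "3/4"), Frame(2, "1/2")]
snapshots = run_pipeline(frames)
for slot, snapshot in enumerate(snapshots):
    print(slot, [f.mode if f else "-" for f in snapshot])
```

In the third time slot all three stages hold a frame, each with a different code rate, so the mode has to travel with the frame rather than being a single global setting.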

4 FPGA Implementation and Performance Testing

Following the design scheme above, the decoder was written in Verilog HDL, verified by simulation in ModelSim 6.1b, and finally tested on a Stratix II EP2S180F1020I4 device. Resource usage is listed in Table 1.

Table 1 Resource usage

Table 1 also lists the resource usage of a single-rate decoder (code rate 7/8). The multi-rate decoder supports three code rates while using no more resources than two single-rate decoders combined.

The throughput and maximum operating clock were also tested for each code rate. For all three code rates (1/2, 3/4, 7/8) the maximum operating clock is 110 MHz, and the maximum throughput is 110 Mb/s, 165 Mb/s, and 192.5 Mb/s, respectively. The throughput of the multi-rate decoder is therefore at least 110 Mb/s at every code rate, showing that it keeps a high decoding throughput while meeting the needs of adaptive multi-rate applications.
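A quick consistency check on these figures: dividing each throughput by its code rate gives the same value for all three modes, which suggests the reported numbers are information-bit throughputs from a decoder that processes coded bits at a constant rate. The derived cycle counts below are inferences from the reported numbers, not figures stated in the paper.

```python
from fractions import Fraction

CLOCK_MHZ = 110     # from the text
CODE_LENGTH = 8064  # bits, from the text
ITERATIONS = 16     # from the abstract

# Reported throughput in Mb/s for each code rate.
measured = {Fraction(1, 2): Fraction(110),
            Fraction(3, 4): Fraction(165),
            Fraction(7, 8): Fraction(385, 2)}

# Coded-bit rate implied by each measurement.
coded_rate = {str(rate): tput / rate for rate, tput in measured.items()}
bits_per_clock = coded_rate["1/2"] / CLOCK_MHZ
cycles_per_frame = CODE_LENGTH / bits_per_clock
cycles_per_iteration = cycles_per_frame / ITERATIONS
print(coded_rate, bits_per_clock, cycles_per_frame, cycles_per_iteration)
```

All three modes work out to 220 Mb/s of coded bits, or 2 coded bits per clock cycle, so the time per frame is the same at every rate and the information throughput scales directly with the code rate.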

5 Conclusion

Based on the characteristics of QC-LDPC codes, an implementation method for a multi-rate QC-LDPC decoder is proposed, and this universal multi-rate decoder is implemented on an FPGA, where it supports at least three different QC-LDPC codes. The input and output parameters of the decoder can be configured flexibly for the code types to be supported, and the decoding throughput is at least 110 Mb/s at every code rate, so the design delivers both the flexibility and the high throughput required of a multi-rate decoder.
