Acceleration Method for MPEG Audio Layer III Compression Using DSP

1 Overview

Digital audio compression makes audio storage and transmission more efficient. Many compression techniques exist, and they differ widely in complexity, reconstructed audio quality, and compression ratio. The μ-law algorithm, for example, is simple but offers a low compression ratio and only average sound quality. Under CCITT Recommendation G.711, logarithmic quantization gives relatively fine quantization when the input amplitude is small, while large-amplitude signals, which occur with lower probability, incur larger quantization noise. As a result, an 8-bit logarithmically quantized signal is equivalent to 14-bit linear quantization in terms of quantization noise. ADPCM coding exploits the small amplitude change between adjacent samples: the encoder outputs the difference between the current sample and its predicted value. Although the fidelity of ADPCM coding is high, its compression ratio is low, reaching only about 4:1. Improved ADPCM schemes include the algorithm proposed by the IMA (Interactive Multimedia Association) and CCITT Recommendations G.721 and G.723 [1].
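The companding behaviour described above can be illustrated with a minimal sketch. It uses the continuous μ-law formula with μ = 255 followed by a uniform 8-bit quantizer, which is an assumption made for illustration; it is not the bit-exact segmented encoding defined in G.711, and the function names are hypothetical.

```python
import math

MU = 255  # assumed value of mu; illustrative, not taken from the article


def mulaw_compress(x: float, mu: int = MU) -> float:
    """Map a sample in [-1, 1] into the companded domain [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mulaw_expand(y: float, mu: int = MU) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize_8bit(y: float) -> int:
    """Uniformly quantize a companded value to a signed 8-bit style code."""
    return max(-127, min(127, round(y * 127)))


def roundtrip(x: float) -> float:
    """Compress, quantize to 8 bits, and expand again."""
    return mulaw_expand(quantize_8bit(mulaw_compress(x)) / 127)


# Absolute error is small for a quiet sample and larger for a loud one.
error_small = abs(roundtrip(0.01) - 0.01)
error_large = abs(roundtrip(0.9) - 0.9)
print(error_small, error_large)
```

In this sketch the quiet sample is reconstructed with a much smaller absolute error than the loud one, which is the trade-off the paragraph describes.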

The MPEG (Moving Picture Experts Group) audio compression standard provides high-fidelity compression at a high compression ratio. ISO/IEC 11172-3 describes sub-band audio coding schemes of differing complexity and performance to suit a range of high-quality digital audio applications. According to computational complexity and coding efficiency, it is divided into three layers: Layer I, Layer II, and Layer III.

The MPEG audio standard grew out of four candidate algorithms, including Adaptive Spectral Perceptual Entropy Coding (ASPEC), Masking-pattern adapted Universal Sub-band Integrated Coding and Multiplexing (MUSICAM), and Sub-Band Adaptive Differential PCM (SB/ADPCM). The candidates went through a series of objective and subjective sound quality tests that considered sound quality at different bit rates, sensitivity to transmission bit errors, encoding/decoding complexity, and encoding/decoding delay. ASPEC and MUSICAM showed the best sound quality at bit rates of about 100 kbit/s. At a lower bit rate (64 kbit/s), ASPEC showed better sound quality, while MUSICAM was slightly better in terms of encoding/decoding complexity and delay. MUSICAM was then enhanced with several techniques from ASPEC, which raised the computational complexity but improved the compression ratio and sound quality. The result is ISO/IEC 11172-3 Audio Layer III.

Layer I is the simplest algorithm. For example, the Philips DCC (Digital Compact Cassette) recorder uses Layer I compression at a bit rate of 192 kbit/s per channel.

Layer II has a medium coding complexity and is applicable to bit rates of about 128 kbit/s per channel. It is widely used in audio coding for DAB (Digital Audio Broadcasting) and video CDs.

Layer III is the most complex coding algorithm, but it provides the best sound quality at the same bit rate. The typical bit rate is 64 kbit/s, which is most suitable for audio transmission over ISDN.

On April 22, 1998, APT (Audio Processing Technology) used its Apt-X100 system to broadcast the "International Earth Day" radio concert from Beijing to Tokyo and Shanghai over ISDN lines. The broadcast occupied three ISDN lines (six B channels) to carry 22 kHz stereo, because the Apt-X100 system uses SB/ADPCM audio compression [2]. With MPEG Layer III audio compression, a single ISDN line would be enough for 22 kHz stereo transmission. However, MPEG Layer III coding is complex and computationally heavy, so it is difficult to implement on a single general-purpose DSP (Digital Signal Processor) chip, and the algorithm is rarely used in current audio equipment. To implement the efficient MPEG Layer III audio compression algorithm at lower cost, we analyzed the algorithm comprehensively and propose a coding acceleration scheme suited to DSP implementation.
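The line-count comparison can be checked with simple arithmetic. The sketch below assumes basic-rate ISDN lines carrying two 64 kbit/s B channels each, and assumes that the typical Layer III rate of 64 kbit/s applies per channel; the names are hypothetical.

```python
B_CHANNEL_KBPS = 64   # capacity of one ISDN B channel
B_PER_LINE = 2        # assumed: basic-rate line with two B channels


def lines_needed(total_kbps: int) -> int:
    """Number of ISDN lines needed to carry the given bit rate."""
    line_kbps = B_CHANNEL_KBPS * B_PER_LINE
    return -(-total_kbps // line_kbps)  # ceiling division


# Six B channels, as stated in the text for the SB/ADPCM broadcast.
sb_adpcm_kbps = 6 * B_CHANNEL_KBPS
# Assumed: 64 kbit/s per channel, two channels for stereo.
layer3_stereo_kbps = 2 * 64

print(lines_needed(sb_adpcm_kbps), lines_needed(layer3_stereo_kbps))
```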

2 MPEG Audio Layer III Compression Coding Process and Characteristics

The MPEG audio Layer III encoding process is shown in Figure 1. Compared with Layer I and Layer II, it has the following characteristics:

Figure 1 MPEG audio layer III coding flow chart (mono model)

(1) The masking threshold of human hearing is calculated using the cochlea spreading function, a modified rounded spreading function that is independent of signal frequency and sound pressure level.

(2) An MDCT module is added to improve frequency resolution.

(3) A control loop iteratively allocates the non-uniform quantization step to keep the signal-to-noise ratio roughly constant. In addition, variable-length entropy coding (Huffman coding) is applied to the quantized sub-band signals to obtain a better compression ratio.

The Layer III coding algorithm is divided into three major functional blocks: (1) time-frequency mapping, (2) the psychoacoustic model, and (3) quantization coding. In time-frequency mapping, the hybrid polyphase/MDCT filter bank is a regular computation whose cost can be calculated in advance, and various fast algorithms exist to reduce that cost. The psychoacoustic model mainly involves 1024-point and 256-point FFTs. This too is a regular computation, and its cost can be estimated accurately whichever FFT algorithm is used. Quantization coding, by contrast, is completed through iterative loops whose iteration counts are not known in advance, and the Huffman code table lookups make its cost difficult to predict. We therefore regard making the quantization coding part more regular as the key point of attack for optimizing MPEG audio Layer III coding.
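To illustrate why the cost of the time-frequency mapping is fixed and known in advance, here is a minimal direct-form MDCT sketch. It assumes a 36-sample long block with a sine window producing 18 coefficients; the function names are hypothetical, and a real encoder would use a fast algorithm instead of this direct sum.

```python
import math


def mdct(block):
    """Direct-form MDCT: 2N windowed input samples give N coefficients."""
    length = len(block)
    if length % 2:
        raise ValueError("block length must be even")
    n = length // 2
    return [
        sum(
            block[k] * math.cos(math.pi / n * (k + 0.5 + n / 2) * (i + 0.5))
            for k in range(length)
        )
        for i in range(n)
    ]


def sine_window(length):
    """Sine window of the given length."""
    return [math.sin(math.pi / length * (k + 0.5)) for k in range(length)]


LONG_BLOCK = 36  # assumed long-block size: 36 inputs, 18 outputs
# The direct form always costs N * 2N multiply-accumulates per block.
MACS_PER_BLOCK = (LONG_BLOCK // 2) * LONG_BLOCK

window = sine_window(LONG_BLOCK)
samples = [math.cos(2 * math.pi * 3 * k / LONG_BLOCK) for k in range(LONG_BLOCK)]
coeffs = mdct([w * s for w, s in zip(window, samples)])
print(len(coeffs), MACS_PER_BLOCK)
```

The operation count depends only on the block size, never on the signal, which is what separates this stage from the data-dependent quantization loops.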

The iterative loop and quantization coding part of the Layer III encoder quantizes the sample values produced by sub-band filtering and the MDCT, and controls the quantization noise according to the output of the psychoacoustic model, so that the frequency-domain signal can be Huffman coded within the required bit rate. The iterative loop of the Layer III quantization coding part is divided into an inner loop and an outer loop. Figures C.9.a, C.9.b, and C.9.c in reference [1] give the flow diagrams of the quantization coding loops.
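The noise check that drives the outer loop can be sketched as follows. The power-law quantizer (exponent 0.75, rounding offset 0.0946, step scaling 2^(step/4)) follows the usual description of Layer III quantization and should be verified against the standard; the band layout, thresholds, and function names are hypothetical.

```python
import math


def quantize(xr, step):
    """Non-uniform quantizer: int(v + 0.4054) equals nint(v - 0.0946) for v >= 0."""
    scale = 2.0 ** (step / 4.0)
    return [int((abs(x) / scale) ** 0.75 + 0.4054) for x in xr]


def dequantize(ix, xr, step):
    """Reconstruct magnitudes with the 4/3 power law; signs come from xr."""
    scale = 2.0 ** (step / 4.0)
    return [math.copysign(v ** (4.0 / 3.0) * scale, x) for v, x in zip(ix, xr)]


def band_noise(xr, step):
    """Mean squared quantization error of one band."""
    rec = dequantize(quantize(xr, step), xr, step)
    return sum((a - b) ** 2 for a, b in zip(xr, rec)) / len(xr)


def bands_over_threshold(bands, allowed, step):
    """Indices of bands whose noise exceeds the allowed (masking) level."""
    return [
        i
        for i, (xr, limit) in enumerate(zip(bands, allowed))
        if band_noise(xr, step) > limit
    ]


bands = [[100.0, -50.0, 25.0, 3.0], [1.0, 1.0, 1.0, 1.0]]
print(bands_over_threshold(bands, [2.0, 2.0], 0))
print(bands_over_threshold(bands, [2.0, 2.0], 40))
```

In an encoder, the bands reported here would be amplified before the inner loop is run again; that amplification step is omitted from this sketch.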

3 Main Problems and Solutions of Audio Layer III Compression Using DSP

DSP programming does not offer the flexible pointer and array addressing of the C language. Implementing the iterative quantization coding loop of Layer III on a DSP involves a large number of irregular array accesses, which consume many instructions, lower DSP utilization, and hinder real-time encoding. The irregular table lookups therefore need to be organized carefully so that the program structure stays clear, concise, and efficient.

3.1 Huffman Coding Multiple Address Indexes

Huffman coding in Layer III is an exhaustive table-lookup process. Table B.7 in reference [1] lists 32 Huffman code tables for Layer III coding. The tables differ in the maximum value they can encode and in the signal statistics they suit. During coding, the encoder first finds the maximum sample value in the region to be coded, then checks the Huffman code tables in turn until it finds one that can encode this maximum, and calculates the number of bits needed with that table. It then tries the other code tables with the same value range and selects the table that needs the fewest bits for the final coding.
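The selection procedure above can be sketched with toy data. The tables below are hypothetical: they code single values and store only code lengths, whereas the real tables in Table B.7 code value pairs; the table numbers and lengths are illustrative only.

```python
# Hypothetical toy tables: number -> (largest encodable value, code length per value).
TOY_TABLES = {
    1: (1, [1, 3]),
    2: (2, [1, 3, 5]),
    3: (2, [2, 2, 3]),
    5: (3, [1, 3, 5, 6]),
    6: (3, [2, 2, 3, 4]),
}


def count_bits(values, lengths):
    """Bits needed for the values, plus one sign bit per nonzero value."""
    return sum(lengths[abs(v)] + (1 if v else 0) for v in values)


def select_table(values):
    """Return (table number, bits) for the cheapest table that fits the peak."""
    peak = max(abs(v) for v in values)
    ranges = sorted({limit for limit, _ in TOY_TABLES.values() if limit >= peak})
    if not ranges:
        raise ValueError("no table can encode the peak value")
    smallest = ranges[0]  # smallest value range that still covers the peak
    best = None
    for number, (limit, lengths) in TOY_TABLES.items():
        if limit != smallest:
            continue  # only compare tables with the same value range
        bits = count_bits(values, lengths)
        if best is None or bits < best[1]:
            best = (number, bits)
    return best


print(select_table([0, 0, 1, 0, 2, 0]))  # mostly zeros
print(select_table([2, 2, 1, 2]))        # mostly large values
```

The two calls pick different tables for the same value range, showing why every candidate table has to be tried before the final choice.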

Not all of the 32 tables provided in the standard are actually used, and many code tables differ only in linbits. How to store these code tables so that they are easy to query and use for encoding is therefore one of the key issues in the encoding process. The proposed "multi-level indexing" method solves this problem well. The process is shown in Figure 2. The different kinds of table are handled as follows:

Figure 2 Huffman coding multiple address index

A normal table, such as Table 15: each index level corresponds directly to an item of information in Table 15.
An invalid table, such as Table 14: its index ultimately points to code table zero, so it is treated as invalid.
Similar tables, such as Table 16 and Table 17: they differ only in the linbits held at the level II index and share the same final Huffman data. A multi-level code table address index of this kind lends itself well to a modular program structure.
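A minimal sketch of such a multi-level index follows. Figure 2 is not reproduced here, so the split into three levels, the key names, and the placeholder code data are assumptions; the linbits values for Tables 16 and 17 are illustrative and should be checked against Table B.7.

```python
from typing import NamedTuple


class Descriptor(NamedTuple):
    linbits: int   # per-table parameter held at level II
    data_key: str  # pointer to the shared code data at level III


# Level I: table number -> descriptor key. Table 14 is routed to table zero.
LEVEL1 = {0: "d0", 14: "d0", 15: "d15", 16: "d16", 17: "d17"}

# Level II: descriptors. Tables 16 and 17 differ only in linbits.
LEVEL2 = {
    "d0": Descriptor(0, "empty"),
    "d15": Descriptor(0, "data15"),
    "d16": Descriptor(1, "data16"),
    "d17": Descriptor(2, "data16"),
}

# Level III: code data, stored once. Strings are placeholders, not real code words.
LEVEL3 = {
    "empty": [],
    "data15": ["<code words of table 15>"],
    "data16": ["<code words shared by tables 16 and 17>"],
}


def resolve(table_number):
    """Follow the index levels and return (linbits, code data)."""
    descriptor = LEVEL2[LEVEL1[table_number]]
    return descriptor.linbits, LEVEL3[descriptor.data_key]


print(resolve(16)[0], resolve(17)[0], resolve(16)[1] is resolve(17)[1])
```

Because the shared data is stored once, adding another table of the same family costs only one extra descriptor.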

3.2 Acceleration of Layer III Coding Iteration Loop

In the iterative loop, quantizing and coding with the initial quantization constant recommended in the standard typically requires far more bits than the allowed upper limit. If the step size is then simply increased by one for each re-quantization, system efficiency suffers badly.

In our experiments, the number of bits initially available is generally about 700 bits per granule, while the initial quantization coding result generally exceeds 5,000 bits. Increasing the quantization step size in increments of 20 brings the bit count close to the required number quickly. Table 1 lists one possible accelerated approximation method that we use.

Table 1 A possible method to accelerate the iterative loop

Measurements on the ADSP-2181 fixed-point DSP chip from Analog Devices show that this acceleration method reduces the instruction cycles of the original algorithm by about two thirds.
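The idea behind the acceleration in Section 3.2 can be sketched as a coarse search followed by a fine search. The contents of Table 1 are not available here, so the schedule below (a fixed stride of 20, then single steps) is an assumption based only on the text. The bit counter is a hypothetical stand-in for Huffman bit counting, and the test signal is synthetic.

```python
def quantize(xr, step):
    """Power-law quantizer; a larger step gives smaller quantized values."""
    scale = 2.0 ** (step / 4.0)
    return [int((abs(x) / scale) ** 0.75 + 0.4054) for x in xr]


def toy_bit_count(ix):
    """Hypothetical stand-in for Huffman bit counting."""
    return sum(2 * v.bit_length() + 1 for v in ix if v)


def search_step(xr, max_bits, coarse=1):
    """Return (smallest step that fits max_bits, quantize-and-count passes)."""
    step, passes = 0, 0
    while True:  # coarse phase: advance in strides of `coarse`
        passes += 1
        if toy_bit_count(quantize(xr, step)) <= max_bits:
            break
        step += coarse
    if coarse > 1 and step > 0:
        step -= coarse - 1  # restart just after the last failing coarse point
        while True:  # fine phase: single steps
            passes += 1
            if toy_bit_count(quantize(xr, step)) <= max_bits:
                break
            step += 1
    return step, passes


# Synthetic granule of 576 decaying spectral lines; 700 bits available.
granule = [1.0e6 / (1 + i) for i in range(576)]
slow_step, slow_passes = search_step(granule, 700, coarse=1)
fast_step, fast_passes = search_step(granule, 700, coarse=20)
print(slow_step, slow_passes, fast_step, fast_passes)
```

Both searches return the same step because the toy bit count never increases as the step grows; the coarse search simply reaches it in fewer quantize-and-count passes. The saving depends on how far the initial step is from the target, so the pass counts here say nothing about the measured ADSP-2181 figures.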

4 Conclusion

The MPEG audio Layer III compression standard is an efficient, high-fidelity compression coding algorithm, but its algorithmic and computational complexity make it difficult to implement in real time on a general-purpose DSP. Based on a comprehensive analysis of the algorithm, we conclude that the key to reducing complexity and improving DSP efficiency is to optimize the iterative quantization coding loop. The proposed "multiple address indexing of Huffman coding" gives a concise and clear structure for addressing a large number of irregular arrays, saves addressing instructions, and improves DSP utilization. The proposed "iterative loop acceleration" scheme, run on the ADSP-2181 fixed-point chip, reduces the instruction cycles by about two thirds.
