Modular joint rate control technology-EEWORLD


Abstract: This paper proposes dividing the joint code rate control algorithm into several control modules. Adjusting the algorithm inside one module does not affect the overall control strategy, which makes the approach more versatile. The main modules of the system are discussed: code rate prediction, bandwidth allocation, quantization parameter selection, and buffer control. Finally, a method for evaluating the performance of the joint rate control system is given.

Keywords: joint rate control, modularization, code rate prediction, bandwidth allocation, quantization parameter selection

With the formulation of the video and audio compression coding standards MPEG-1 and MPEG-2, digital video systems based on them are increasingly widely used. As video services grow, the need to transmit as many MPEG-compressed video programs as possible simultaneously within a traditional fixed-bandwidth channel becomes increasingly urgent.

1 Overview of joint rate control technology

When multiple variable bit rate (VBR) encoded video programs are transmitted in the same fixed-bandwidth channel, statistical multiplexing lets the code rates of the programs compensate for one another, allocates the fixed channel dynamically, and makes full use of channel resources. However, it has the following shortcomings: (1) Statistical multiplexing follows the "law of large numbers". Only when the number of multiplexed services N is large enough (N>10) do the code rates of the channels compensate for each other and produce a high statistical multiplexing gain (see Section 3). If the channel bandwidth is limited and few services are transmitted at the same time, the total code rate still fluctuates greatly after multiplexing, and data is easily lost in a fixed-bandwidth channel. (2) Although statistical multiplexing avoids the direct accumulation of the peak bit rates of the services, changes in image content cannot be predicted, so the total output bit rate after multiplexing may still exceed the channel bandwidth for some period of time, resulting in data loss during transmission. When important information (such as headers, DCT DC coefficients, and low-frequency coefficients) is lost, the quality of the image, or even of the entire group of pictures (GOP) containing it, is seriously affected. Therefore, statistical multiplexing alone is not suitable for digital video broadcasting, where relatively few video services are transmitted simultaneously and image quality requirements are high.
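The "law of large numbers" argument in shortcoming (1) can be illustrated numerically. The sketch below assumes N statistically independent programs with the same mean rate and the same standard deviation, which is an idealization not stated in the text; the function name and the sample numbers are hypothetical. Under that assumption the relative fluctuation of the total rate shrinks in proportion to 1/sqrt(N).

```python
import math


def aggregate_relative_fluctuation(mean_rate, std_rate, n_programs):
    """Return std/mean of the summed rate of n independent, identical programs."""
    if n_programs < 1:
        raise ValueError("n_programs must be at least 1")
    total_mean = n_programs * mean_rate
    # Variances of independent sources add, so the std grows only as sqrt(n).
    total_std = math.sqrt(n_programs) * std_rate
    return total_std / total_mean


# Hypothetical programs: mean 4.0 and standard deviation 2.0 (arbitrary units).
for n in (1, 4, 16, 64):
    print(n, aggregate_relative_fluctuation(4.0, 2.0, n))
```

With only a few programs the total rate still fluctuates strongly relative to its mean, which is the situation the paragraph above describes for channels carrying few services.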

The joint bit rate control system (see Figure 1) combines the statistics of all video programs and allocates the total available bandwidth centrally, so that the output bit rate after multiplexing does not exceed the bandwidth, no data is lost, and the quality of each channel is optimized. According to the information currently available, joint code rate control technology is still at the research stage, and companies such as IBM, PHILIPS and DIVICOM are conducting this research.

Unlike MPEG fixed bit rate encoding, in which the bit rate of each program is controlled independently, the joint rate control system controls all encoders jointly. When the system starts working, each encoder can be set to the same quantization parameter; when the system predicts that the code rate after multiplexing will exceed the channel bandwidth, the available bandwidth is reallocated according to image complexity, and the quantization parameters are then changed so that the output of each encoder meets its target bit rate.

Several papers introduce joint code rate control algorithms, but each has shortcomings. One algorithm adjusts the code rate by monitoring the status of the channel buffer [1]; since the buffer status does not directly reflect image changes, this adjustment leads to quality differences between images of the same complexity. An algorithm that relies on a dedicated chip cannot be applied universally [2]. Using the encoding result of the previous GOP to predict the bit rate [3] makes bandwidth allocation lag behind image changes. Allocating bandwidth according to custom Super GOP (the combination of the corresponding GOPs of each program) and Super Frame (the combination of the corresponding frames of each program within the Super GOP) structures [4] does not take image changes within the Super GOP into account. Although each algorithm improves program quality to some extent, none considers image changes, bandwidth allocation, bit rate control, and buffer status together, which makes it difficult to achieve optimal program quality.

To this end, this paper proposes a modular joint code rate control algorithm for the first time. It divides code rate control into several control modules so that each module's algorithm is relatively independent and adjusting it does not affect the system control strategy. The algorithm is therefore more versatile and can be applied to different encoding chips.

2 Modular control algorithm

The system can be divided into several modules: code rate prediction, bandwidth allocation, quantization parameter selection and buffer control. Figure 2 only shows the relationship between program n and each control module. The relationship between other programs and control modules is the same.

2.1 Code rate prediction module

The code rate prediction module extracts statistical information about each video program over the selected time period (frame or GOP). There are two code rate prediction methods: forward prediction [1][2] and feedback prediction [3]. Forward prediction preprocesses the image to extract statistics before encoding it. Many statistics can be extracted, and the one chosen should be closely related to the bit rate the encoder needs to output an image of a given quality. For example, a statistic of 10 should mean that the encoder needs twice the code rate to output an image of the same quality as when the statistic is 5. Forward prediction responds quickly to changes in image complexity and to scene switching. However, program content varies widely in activity and complexity, and finding a statistic that adapts to any image content and is closely related to the output bit rate is challenging. In addition, real-time forward prediction requires preprocessing chips, which increases system cost.

Feedback prediction collects the statistics generated during encoding (dashed arrows in Figure 2) after an image has been encoded and uses them to guide the encoding of subsequent images. Compared with forward prediction, the feedback method requires no image preprocessing and has a smaller computational load. However, the statistics are limited to information generated during encoding, and the statistics of the previous image are used to predict the subsequent image. Feedback prediction therefore does not reflect changes in image complexity and scene switching as quickly as forward prediction. Nevertheless, since image content usually persists for some time, feedback prediction can still predict the bit rate.
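To make the lag of feedback prediction concrete, here is a minimal sketch in which the bits produced in the previous period (frame or GOP) are used directly as the prediction for the next one. The function name, the initial guess, and the sample numbers are hypothetical; a real system may feed back other encoder statistics.

```python
def feedback_predict(actual_bits, initial_guess):
    """Predict each period's bits from the bits produced in the period before it."""
    predictions = []
    last = initial_guess
    for actual in actual_bits:
        predictions.append(last)  # the prediction is made before 'actual' is known
        last = actual             # the statistic is fed back after encoding
    return predictions


# Hypothetical sequence: a scene switch at index 3 raises the rate (arbitrary units).
actual = [2.0, 2.1, 1.9, 5.0, 5.1, 4.9]
predicted = feedback_predict(actual, initial_guess=2.0)
errors = [abs(a - p) for a, p in zip(actual, predicted)]
print(predicted)
print(errors)
```

The largest error falls on the period in which the scene switches; one period later the prediction has caught up, because the new content persists.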

2.2 Bandwidth allocation module

The bandwidth allocation module can choose from a variety of algorithms. A relatively simple algorithm divides the available channel capacity into two parts, Cp and C0. Cp is allocated according to the predicted code rate of each program so that each program maintains acceptable image quality; C0 is allocated according to the variance of the predicted code rates of the programs. This allocation ensures that complex programs receive more bit rate and makes the image quality of the programs uniform [1].
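One possible reading of the two-part split is sketched below. The text does not say how large Cp and C0 are or exactly how the variance enters, so the fraction `base_fraction`, the proportional sharing rules, and all names are assumptions made for illustration only.

```python
def two_part_allocation(capacity, predicted_rates, rate_variances, base_fraction=0.8):
    """Share Cp by predicted rate and C0 by rate variance (assumed interpretation)."""
    if not 0.0 <= base_fraction <= 1.0:
        raise ValueError("base_fraction must lie in [0, 1]")
    if len(predicted_rates) != len(rate_variances) or not predicted_rates:
        raise ValueError("need one rate and one variance per program")
    c_p = capacity * base_fraction  # part shared according to predicted rate
    c_0 = capacity - c_p            # part shared according to variance
    rate_sum = sum(predicted_rates)
    var_sum = sum(rate_variances)
    n = len(predicted_rates)
    allocations = []
    for rate, var in zip(predicted_rates, rate_variances):
        # Fall back to an equal split if a sum is zero.
        share_p = rate / rate_sum if rate_sum > 0 else 1.0 / n
        share_0 = var / var_sum if var_sum > 0 else 1.0 / n
        allocations.append(c_p * share_p + c_0 * share_0)
    return allocations


# Hypothetical example: three programs sharing a capacity of 20 units.
alloc = two_part_allocation(20.0, [4.0, 4.0, 2.0], [1.0, 3.0, 0.0])
print(alloc)
```

In this example the first two programs have the same predicted rate, but the second has the larger variance and therefore receives the larger share.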

The bandwidth allocation module can also calculate the complexity of each program from the statistics output by the code rate prediction module and allocate bandwidth in proportion to that complexity [3]. The complexity can be calculated with the corresponding formula in MPEG TM5 [5]:

C = R × Q

In the formula, R is the code rate of image encoding, and Q is the average quantization factor of the image (the average of the quantization factors of each macroblock in the image).
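A minimal sketch of complexity-proportional allocation using the formula above. The function names and sample values are hypothetical, and R and Q are taken from the most recently encoded image of each program.

```python
def complexity(rate, avg_q):
    """Complexity measure C = R x Q from the formula above."""
    return rate * avg_q


def allocate_by_complexity(total_bandwidth, rates, avg_qs):
    """Give each program a share of the bandwidth proportional to its complexity."""
    complexities = [complexity(r, q) for r, q in zip(rates, avg_qs)]
    total = sum(complexities)
    if total <= 0:
        raise ValueError("total complexity must be positive")
    return [total_bandwidth * c / total for c in complexities]


# Hypothetical example: two programs encoded with the same average Q of 10,
# the second producing twice the rate of the first.
targets = allocate_by_complexity(12.0, [2.0, 4.0], [10.0, 10.0])
print(targets)
```

The second program is twice as complex as the first under this measure, so it receives twice the target bit rate.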

Another algorithm first defines the Super GOP and Super Frame structures [4] and assigns the same bit rate to each Super GOP. It then distributes the Super GOP bit rate among the Super Frames using the bit rate allocation method in TM5; the same method is applied again to allocate the code rate to each frame within a Super Frame.

Uniform quality across programs means that their images have the same degree of distortion. According to rate-distortion theory [6], a complex image must be allocated more code rate than a simple image for the two to have the same degree of distortion. Therefore, whichever algorithm is used to allocate bandwidth, the target bit rate allocated to each program should be proportional to its complexity.

The time period over which the bandwidth allocation module allocates bandwidth is also worth considering. There are two options: the image frame or the GOP as the time unit. To allocate bit rates by frame, the image type of each program at every instant must first be determined (the GOP structures of the programs differ, and their image type changes are not synchronized) before the bit rate can be allocated reasonably. For stable image quality, code rate allocation should optimize the overall quality of the entire image sequence rather than the quality of a single frame. In most video programs the probability of a scene switch within a GOP is very small, and the arrangement of the three image types I, P, and B within the GOP is repetitive. As a result, the code rates of the images within a GOP can be allocated according to a predetermined ratio, and fluctuations in the code rates of individual frames can compensate for each other within the GOP. If a scene switch occurs in a GOP, the remaining images in that GOP and the next GOP can be combined into one large GOP so that the scene switch does not affect the code rate allocation strategy. Therefore, it is more reasonable to allocate bandwidth in units of GOPs.
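Allocating the code rate inside a GOP "according to a predetermined ratio" can be sketched as a weighted split. The I:P:B weights of 4:2:1 and the GOP pattern below are hypothetical values chosen for illustration, not figures from the text or from TM5.

```python
def split_gop_budget(gop_bits, gop_pattern, weights=None):
    """Divide a GOP's bit budget among its frames by fixed per-type weights."""
    if weights is None:
        weights = {"I": 4.0, "P": 2.0, "B": 1.0}  # hypothetical predetermined ratio
    if not gop_pattern:
        raise ValueError("gop_pattern must contain at least one frame")
    total_weight = sum(weights[t] for t in gop_pattern)
    return [gop_bits * weights[t] / total_weight for t in gop_pattern]


# Hypothetical GOP of six frames with a budget of 120 units.
frame_bits = split_gop_budget(120.0, "IBBPBB")
print(frame_bits)
```

Because the budget is fixed per GOP, a frame that overshoots its share can be balanced by later frames in the same GOP, which is the compensation the paragraph above describes.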

2.3 Quantization parameter selection

The quantization parameter selection module makes the output of each encoder meet the target code rate assigned by the bandwidth allocation module. The quantization parameters comprise the quantization factor Q and the quantization matrix. The quantization matrix can be adjusted at the picture level, and the quantization factor Q at the slice or macroblock level. The quantization matrix is chosen according to the spatial-frequency characteristics of human vision and is relatively stable, so code rate control is generally achieved by changing the quantization factor. Figure 3 shows the relationship between the quantization factor and the output code rate.

To keep the subjective quality of the images relatively consistent, the programs should use the same quantization factor as far as possible [3]. The quantization parameter selection module can search the quantization factor range (1 to 31) and select the Q that brings the encoder output closest to the target bit rate. Figure 3 shows that when the quantization factor is small, increasing or decreasing it by 1 changes the code rate greatly. The Q that exactly meets the target code rate may therefore not be an integer. If an integer Q is selected, that is, the same Q is used for every macroblock in the image, the encoder output may deviate from the target bit rate, but the deviations can compensate for each other in the buffer.
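The search over the quantization factor range can be sketched as follows. Because the curve of Figure 3 is not reproduced here, the sketch assumes a simple hyperbolic rate model R = C / Q, obtained by rearranging the TM5 complexity formula; the model, the function names, and the numbers are assumptions for illustration.

```python
def predicted_rate(complexity, q):
    """Assumed rate model R = C / Q (a rearrangement of C = R x Q)."""
    return complexity / q


def select_q(complexity, target_rate, q_min=1, q_max=31):
    """Return the integer Q in [q_min, q_max] whose predicted rate is closest to the target."""
    return min(
        range(q_min, q_max + 1),
        key=lambda q: abs(predicted_rate(complexity, q) - target_rate),
    )


# Hypothetical numbers: complexity 120 and target rate 32 (arbitrary units).
q = select_q(120.0, 32.0)
deviation = predicted_rate(120.0, q) - 32.0
print(q, deviation)

# Under this model a step of 1 changes the rate far more at small Q than at large Q.
step_small = predicted_rate(120.0, 1) - predicted_rate(120.0, 2)
step_large = predicted_rate(120.0, 30) - predicted_rate(120.0, 31)
print(step_small, step_large)
```

No integer Q hits the target exactly in this example, so the chosen Q leaves a residual deviation that the buffer has to absorb.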

The average quantization factor of an image can also be fractional, which means that different Q values are selected for different slices or macroblocks in the image. Taking the characteristics of the human eye into account, the quantization parameter selection module can predefine several Q selection templates and choose among them according to the activity, complexity, and content of the image to obtain the best subjective quality. For example, if an image with more detail in its middle part needs an average Q of 3.75, the macroblocks at the edge of the image, accounting for three-quarters of the total, can take Q = 4, and the remaining quarter in the middle part can take Q = 3. The edges of the image, which viewers notice less, are then quantized more coarsely and the center more finely, giving the best subjective quality for the image as a whole.
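The arithmetic behind the 3.75 example generalizes to any fractional average: mix the two neighbouring integer Q values so that their weighted mean equals the target. The sketch below shows only that calculation; the function name is hypothetical, and deciding which macroblocks receive which value (edge or centre) is left to the template.

```python
import math


def q_mix(avg_q):
    """Return (q_low, fraction_low, q_high, fraction_high) with weighted mean avg_q."""
    q_low = math.floor(avg_q)
    q_high = q_low + 1
    fraction_high = avg_q - q_low        # share of macroblocks using the coarser Q
    fraction_low = 1.0 - fraction_high   # share of macroblocks using the finer Q
    return q_low, fraction_low, q_high, fraction_high


q_low, fraction_low, q_high, fraction_high = q_mix(3.75)
print(q_low, fraction_low, q_high, fraction_high)
```

For an average of 3.75 this gives Q = 3 for one quarter of the macroblocks and Q = 4 for three quarters, matching the example in the text.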

2.4 Buffer control

The buffer control module adds limits to the code rate so that the buffer does not overflow or underflow [3]. A buffer threshold coefficient α can be set so that the total output code rate Bf satisfies:

αBs≤Bf≤(1-α)Bs

In the formula, Bs is the buffer capacity. If the code rate falls outside these thresholds, the buffer control module instructs the bandwidth allocation module to reallocate the bandwidth. α determines the buffer utilization and should be chosen flexibly according to the actual situation.
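The threshold test can be sketched as a single comparison. Here Bf is treated as the amount of data that the output places in the buffer, measured in the same units as Bs; that interpretation, the function name, and the value of α are assumptions for illustration.

```python
def buffer_within_limits(b_f, b_s, alpha):
    """Return True when alpha * Bs <= Bf <= (1 - alpha) * Bs."""
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must lie in [0, 0.5) so that the interval is not empty")
    return alpha * b_s <= b_f <= (1.0 - alpha) * b_s


# Hypothetical buffer of 100 units with a 10% guard band at each end.
for b_f in (5.0, 50.0, 95.0):
    if buffer_within_limits(b_f, 100.0, 0.1):
        print(b_f, "within limits")
    else:
        print(b_f, "outside limits: request bandwidth reallocation")
```

A larger α leaves more headroom against overflow and underflow but narrows the usable part of the buffer, which is the trade-off behind choosing α flexibly.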

Code rate prediction, bandwidth allocation, quantization parameter selection and buffer control are the most important modules in the joint code rate control system. These modules do not operate in isolation; they influence and constrain one another. Therefore, the algorithms of these modules should be selected from the perspective of optimal performance of the entire system.

3 Joint rate control performance evaluation

Two indicators are used to evaluate the performance of the joint rate control system: statistical multiplexing gain (G) and peak signal-to-noise ratio (PSNR). The statistical multiplexing gain G of multi-channel MPEG VBR video programs is defined as the ratio of the number of multiplexed VBR video services that can be transmitted with equal or better image quality to the number of CBR video services that can be transmitted within the same fixed-bandwidth channel [1]. Generally, the larger G is, the better the multiplexing performance and the greater the number of VBR video services that can be multiplexed at the same time.

To evaluate the joint code rate control system with PSNR, the peak signal-to-noise ratio of each VBR video program after multiplexing is compared with the peak signal-to-noise ratio obtained when the same number of CBR video programs is transmitted. The resulting increase in PSNR represents the improvement in image quality. In its standard form for 8-bit samples (peak value 255), the peak signal-to-noise ratio is calculated as:

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^{2}}{\frac{1}{M} \sum_{x,y,z} n^{2}(x,y,z)}$$

In the formula, n(x,y,z) is the noise superimposed on the pixel (x,y,z), and M is the total number of pixels.
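A minimal sketch of a PSNR computation in the standard form, where the noise is the difference between original and reconstructed samples. It assumes 8-bit samples (peak value 255) and images flattened into plain sequences; both are assumptions, and the function name is hypothetical.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Return the PSNR in dB between two equally sized sequences of sample values."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared noise over all M samples.
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no noise
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical four-sample "images" whose noise is +1 or -1 at every sample.
value = psnr([100, 100, 100, 100], [101, 99, 101, 99])
print(value)
```

Comparing this value for a program under joint rate control with the value for the same program under CBR encoding gives the PSNR increase used as the quality indicator.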

In short, joint rate control technology can eliminate the information loss of statistical multiplexing, keep the code rate of the multiplexed video services within the channel capacity, and maintain the same image quality for each program, which makes it suitable for digital video broadcasting. This paper proposes modular joint code rate control for the first time, dividing the system into several control modules so that adjusting a module's algorithm does not affect the overall control strategy. The aim is to make the algorithm more versatile, so that it suits different encoding chips and can be applied more widely.

Many topics in modular joint rate control remain to be studied. These include choosing appropriate algorithms to improve the computing speed and performance of the system; methods that not only allocate bit rates according to image complexity but also guarantee key programs sufficient bit rate by setting priorities; and an objective measure that reflects the subjective quality of the reconstructed image more accurately than the peak signal-to-noise ratio.
