The difference between AVS and international standard MPEG

Publisher: 素心轻语. Latest update time: 2011-04-20

This article compares three video standards, MPEG-2, MPEG-4 AVC/H.264 and AVS video (GB/T 20090.2), from a technical perspective, covering technical solutions, subjective tests, objective tests, and complexity.
I. Technical Comparison
Both AVS video and the MPEG standards use a hybrid coding framework (see Figure 1) built from transform, quantization, entropy coding, intra-frame prediction, inter-frame prediction, loop filtering and other modules, which is the current mainstream technical route. The main innovation of AVS is a set of specific optimizations that achieve performance comparable to the international standards at lower complexity, without relying on the large number of complex patents behind those standards. The characteristic core technologies of AVS video include: 8×8 integer transform, quantization, intra-frame prediction, 1/4-pixel-precision interpolation, special inter-frame prediction and motion compensation, two-dimensional entropy coding, and deblocking loop filtering.


Figure 1 Typical video coding framework

The block diagram of the AVS video encoder is shown in the figure below.

Figure 2 AVS video encoder block diagram

The AVS video standard defines three picture types: I-frames, P-frames and B-frames. Macroblocks in an I-frame use only intra-frame prediction, while macroblocks in P-frames and B-frames use either intra-frame or inter-frame prediction. In Figure 2, S0 is the prediction mode selection switch. The prediction residual undergoes an 8×8 integer transform (ICT) and quantization; the quantized coefficients are then zig-zag scanned (interlaced-coded blocks use a different scan) into a one-dimensional sequence, and finally entropy coded. The transform and quantization in the AVS video standard require only addition, subtraction and shift operations and can be carried out with 16-bit precision.
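The transform-then-scan step described above can be sketched as follows. This is only an illustration under stated assumptions: the article does not give the AVS 8×8 transform matrix, so the sketch uses the well-known 4×4 integer core transform of H.264 as a stand-in to show how a transform can be computed with additions, subtractions and shifts alone, and it uses the conventional zig-zag order rather than the exact AVS scan table. The function names are hypothetical.

```python
def forward_1d(x0, x1, x2, x3):
    """1-D butterfly of the H.264 4x4 core transform (adds, subtracts, shifts only)."""
    s03, s12 = x0 + x3, x1 + x2
    d03, d12 = x0 - x3, x1 - x2
    return (s03 + s12, (d03 << 1) + d12, s03 - s12, d03 - (d12 << 1))


def forward_4x4(block):
    """Apply the 1-D butterfly to every row, then to every column."""
    rows = [forward_1d(*row) for row in block]
    cols = [forward_1d(*col) for col in zip(*rows)]
    # cols[j] holds transformed column j; transpose back to row-major layout.
    return [list(r) for r in zip(*cols)]


def zigzag_order(n):
    """Conventional zig-zag visiting order for an n x n block, as (row, col) pairs."""
    order = []
    for s in range(2 * n - 1):
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diag.reverse()  # even anti-diagonals run from bottom-left to top-right
        order.extend(diag)
    return order


def zigzag_scan(block):
    """Flatten a square block into a 1-D list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


if __name__ == "__main__":
    residual = [[1, 1, 1, 1] for _ in range(4)]  # flat residual: energy only in DC
    coeffs = forward_4x4(residual)
    print(zigzag_scan(coeffs))
```

For a flat residual all of the energy lands in the first (DC) coefficient, so the scanned sequence starts with one non-zero value followed by zeros, which is the pattern entropy coding exploits.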
The AVS video standard applies a loop filter to the reconstructed image. This removes blocking artifacts and improves the subjective quality of the reconstructed image, and it also improves coding efficiency. The filtering strength is adjusted adaptively.

The AVS standard supports a variety of video services. To allow interoperability between different services, the AVS standard defines profiles and levels. A profile is a subset of the syntax, semantics, and algorithms defined by AVS; a level is a set of limits on the syntax elements and their parameter values within a given profile. To meet the needs of services such as high-definition/standard-definition digital TV broadcasting and digital storage media, the AVS video standard defines a baseline profile and four levels (4.0, 4.2, 6.0, and 6.2). The maximum supported picture resolution ranges from 720×576 to 1920×1080, and the maximum bit rate ranges from 10 Mbit/s to 30 Mbit/s.
Table 1 Comparison of technologies used by AVS and MPEG-2, MPEG-4 AVC/H.264 and estimated performance differences

The last column estimates the performance difference between AVS video and AVC/H.264 as a signal-to-noise ratio in dB; the percentage in brackets is the corresponding bit rate difference.

| Video coding standard | MPEG-2 Video | MPEG-4 AVC/H.264 video | AVS Video | Estimated difference, AVS vs. AVC/H.264 |
| --- | --- | --- | --- | --- |
| Intra-frame prediction | DC coefficient differential prediction only, in the frequency domain | Based on 4×4 blocks, 9 luminance prediction modes, 4 chrominance prediction modes | Based on 8×8 blocks, 5 luminance prediction modes, 4 chrominance prediction modes | Basically equivalent |
| Multi-reference frame prediction | Only 1 frame | Up to 16 frames | Up to 2 frames | With both using two frames as the baseline, adding more reference frames gives no obvious improvement |
| Variable block size motion compensation | 16×16; 16×8 (field coding) | 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4 | 16×16, 16×8, 8×16, 8×8 | Reduced by about 0.1 dB (2-4%) |
| B-frame macroblock direct coding mode | None | Independent spatial or temporal prediction mode. If the block in the backward reference frame used to derive the motion vector is intra-coded, its motion vector is treated as 0 and is still used for prediction. | Temporal and spatial prediction are combined. If the block in the backward reference frame used to derive the motion vector is intra-coded, the motion vectors of spatially adjacent blocks are used for prediction. | Improved by 0.2-0.3 dB (5%) |
| B-frame macroblock bidirectional prediction mode | Both the forward and backward motion vectors are coded | Both the forward and backward motion vectors are coded | Called symmetric prediction mode: only the forward motion vector is coded, and the backward motion vector is derived from it | Basically equivalent |
| ¼ pixel motion compensation | Bilinear interpolation at half-pixel positions only | 6-tap filtering at ½ pixel positions, linear interpolation at ¼ pixel positions | 4-tap filtering at ½ pixel positions, 4-tap filtering and linear interpolation at ¼ pixel positions | Basically equivalent |
| Transform and quantization | 8×8 floating-point DCT, quantization by division | 4×4 integer transform; normalization is needed at both the encoder and the decoder; quantization is combined with transform normalization and implemented with multiplication and shift | 8×8 integer transform; transform normalization is performed only at the encoder; quantization is combined with transform normalization and implemented with multiplication and shift | Improved by about 0.1 dB (2%) |
| Entropy coding | Single VLC table, poor adaptability | CAVLC: strongly dependent on surrounding blocks, more complex to implement. CABAC: computationally more complex | Context-adaptive 2D-VLC, switching among multiple code tables while coding block coefficients | Reduced by about 0.5 dB (10-15%) |
| Loop filter | None | Based on 4×4 block edges; many filter strength classes, complex computation | Based on 8×8 block edges; simple filter strength classification, fewer pixels filtered, low computational complexity | Not estimated |
| Error-resilient coding | Simple slice partitioning | Data partitioning, complex macroblock and slice organization mechanisms such as FMO/ASO, forced intra block refresh, constrained intra prediction, etc. | A simple slice mechanism is sufficient for the error concealment and recovery requirements of broadcast applications | Not estimated |
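The symmetric prediction mode listed in Table 1 codes only the forward motion vector and derives the backward one. A minimal sketch of one plausible derivation is given below, assuming the backward vector is the forward vector mirrored and scaled by the ratio of temporal distances. The function names, the quarter-pixel units and the rounding rule are assumptions for illustration; the exact fixed-point formula of the standard is not given in this article and is not reproduced here.

```python
def _scale_round(value, num, den):
    """Return value * num / den, rounded to nearest with halves away from zero."""
    prod = value * num
    magnitude = (abs(prod) * 2 + den) // (2 * den)
    return magnitude if prod >= 0 else -magnitude


def derive_backward_mv(mv_fw, dist_fw, dist_bw):
    """Derive a backward motion vector from the coded forward one.

    mv_fw   -- (x, y) forward motion vector, assumed to be in quarter-pixel units
    dist_fw -- temporal distance from the B-frame to its forward reference
    dist_bw -- temporal distance from the B-frame to its backward reference
    """
    if dist_fw <= 0 or dist_bw <= 0:
        raise ValueError("temporal distances must be positive")
    return (-_scale_round(mv_fw[0], dist_bw, dist_fw),
            -_scale_round(mv_fw[1], dist_bw, dist_fw))


if __name__ == "__main__":
    # First B-frame of an IBBP group: one frame after the forward reference,
    # two frames before the backward reference.
    print(derive_backward_mv((8, -4), dist_fw=1, dist_bw=2))
```

Because the decoder can repeat this derivation, only one motion vector per block has to be transmitted and only the forward vector has to be searched at the encoder.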

DCT: Discrete Cosine Transform
VLC: Variable Length Coding
CAVLC: Context-based Adaptive Variable Length Coding
CABAC: Context-based Adaptive Binary Arithmetic Coding
FMO: Flexible Macroblock Ordering
ASO: Arbitrary Slice Ordering

II. Subjective evaluation and objective testing
There are two ways to evaluate compression quality, subjective and objective, each with its own advantages and disadvantages. In subjective evaluation, trained evaluators compare the decoded audio-visual material with the original, usually scoring it in a dedicated viewing and listening environment according to fixed rules. Objective evaluation uses a specific algorithm to measure the loss introduced by compression, for example the signal-to-noise ratio (SNR), expressed on a logarithmic scale. The two sometimes disagree considerably, so judging an algorithm requires balancing both. When a standard is evaluated, objective methods are usually used during development, but the result must finally be confirmed by subjective evaluation.

1. Subjective test of MPEG-4 AVC video standard
From October to December 2002, MPEG organized a dedicated group to test AVC (ISO/IEC 14496-10 | ITU-T Rec. H.264) against MPEG-4 Visual (ISO/IEC 14496-2) and MPEG-2 Video (ISO/IEC 13818-2). The tests were conducted at FUB/ISCTI (Italy), NIST (USA) and TUM (Germany). The results showed that AVC significantly improves coding performance. The test conditions (video sequences and bit rates) used for standard definition (SD) and high definition (HD) are as follows:
Table 2 AVC standard definition test conditions

The subjective evaluation test of image quality is based on ITU-R BT.500-11 "Methodology for the subjective assessment of the quality of television pictures" (as can be seen below, the two subjective tests of AVS also adopted this test standard). The test results are shown in the following table:
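Table 4 Comparison between AVC and an optimized MPEG-2 encoder (MPEG-2 HiQ) in standard definition

SD Main (AVC Main vs. MPEG-2 HiQ)

| AVC bit rate | Football | Mobile | Husky | Tempete |
| --- | --- | --- | --- | --- |
| 4 Mbps | 1.5x | T | 1.5x | T |
| 3 Mbps | 1.3x | 2x | 1x / 1.3x | T |
| 2.25 Mbps | > 1.3x | 2.7x | 1.3x | T |
| 1.5 Mbps | > 1.5x | 4x | > 1.5x | T, 2x |

6 Mbps: T for two of the four sequences; this copy of the table does not preserve which two.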

(Note: T in the table means transparent, and the difference between the compressed and original images is not noticeable. Nx means that the bit rate of the compared object must be N times that of AVC to achieve the same quality, the same below)
It can be seen from the table that N is greater than or equal to 1.5 for 8 of the 12 comparable items, greater than or equal to 2 for 3, and greater than or equal to 4 for 1.
Table 5 Comparison between AVC and MPEG-2 reference software (MPEG-2 TM5) in standard definition

As can be seen from the table, compared with MPEG-2 reference software, 9 out of 12 comparable items have N greater than or equal to 1.8, and 2 have N greater than or equal to 4.
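Table 6 Comparison of AVC and an optimized MPEG-2 encoder (MPEG-2 HiQ) in high definition

HD Main (AVC Main compared to MPEG-2 HiQ)

| AVC bit rate | Crew, 720 (60p) | Harbour, 720 (60p) | Stockholm Pan, 1080 (30i) | New Mobile & Calendar, 1080 (30i) | River Bed, 1080 (25p) | Vintage Car, 1080 (25p) |
| --- | --- | --- | --- | --- | --- | --- |
| 10 Mbps | 2x | T | 1x | T, 2x | > 1x | T, 2x |
| 6 Mbps | 1.7x | T, 3.3x | Not part of the test | Not part of the test | > 1.7x | 1.7x |

20 Mbps: T for five of the six sequences; this copy of the table does not preserve which five.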

As can be seen from the table, compared with the optimized MPEG-2 HD encoder, 7 out of 9 comparable items have N greater than or equal to 1.7, 3 are greater than or equal to 2, and one is equal to 3.3.
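Table 7 Comparison of AVC and MPEG-2 reference software (MPEG-2 TM5) in high definition

HD Main (AVC Main compared to MPEG-2 TM5)

| AVC bit rate | Crew, 720 (60p) | Harbour, 720 (60p) | Stockholm Pan, 1080 (30i) | New Mobile & Calendar, 1080 (30i) | River Bed, 1080 (25p) | Vintage Car, 1080 (25p) |
| --- | --- | --- | --- | --- | --- | --- |
| 10 Mbps | 2x | T | 2x | T, 2x | > 1x | T, 2x |
| 6 Mbps | 1.7x | T, 1.7x | Not part of the test | Not part of the test | > 1.7x | 1.7x |

20 Mbps: T for five of the six sequences; this copy of the table does not preserve which five.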

As can be seen from the table, compared with MPEG-2 reference software, 8 out of 9 comparable items have N greater than or equal to 1.7, and 4 out of 9 comparable items have N greater than or equal to 2.
Overall, in 66 of the 85 comparisons MPEG-2 needs at least 1.5 times the AVC bit rate to reach the same quality as AVC, and in 51 of the 85 comparisons it needs at least 2 times the AVC bit rate. In other words, in 60% of cases the coding efficiency of AVC is at least twice that of MPEG-2.
Because AVC can reach twice the coding efficiency of MPEG-2, Chinese testing institutions usually set the AVS video bit rate to half of the typical MPEG-2 bit rate or lower when testing AVS. The question being tested is whether AVS video quality meets broadcast requirements when AVS is assumed to be at least twice as efficient as MPEG-2.

2. AVS Subjective Test - Digital TV User-End Product Test Laboratory of National Radio and Television Product Quality Supervision and Inspection Center
From November 15 to December 26, 2004, commissioned by the Digital Audio and Video Codec Technical Standard Working Group, the Digital TV User-End Product Test Laboratory of the National Radio and Television Product Quality Supervision and Inspection Center organized a subjective image quality test of the AVS video encoding/decoding scheme submitted by the working group. The performance of the AVS video compression scheme was evaluated through subjective tests of the AVS software encoder/decoder provided by the client. The subjective image quality test was based on ITU-R BT.500-11 "Methodology for the subjective assessment of the quality of television pictures" and ITU-R BT.710-2 "Subjective assessment methods for image quality in high-definition television", and used the double-stimulus continuous quality-scale (DSCQS) method to evaluate how much the image quality after AVS encoding and decoding differs from the original uncompressed image quality.
The purpose of this subjective evaluation of video image quality is to evaluate the overall performance of the AVS video encoding/decoding scheme by comparing the image quality processed by the AVS encoding/decoding system with the image quality of the original material. The selection of evaluation materials should be extensive and appropriately strict. The selected test sequence should be able to reflect the characteristics of image brightness reproduction, color reproduction, static spatial resolution, dynamic spatial resolution, motion reproduction, apparent depth effect, telepresence, flicker performance and reproduction of familiar tones. It is hoped that the selected programs can fully and accurately reflect the performance of the video encoding/decoding scheme being evaluated. The test sequence includes 8 high-definition programs and 8 standard-definition programs, each of which is 10s to 20s long.
For the high-definition test sequences, the average score difference between the evaluated object and the original material is between 1.6 and 6.0, and the overall average score difference is 3.6, indicating that the evaluators consider the image quality of the evaluated object to differ very little from that of the original material.
The statistical results show that when the image format of the AVS video encoding/decoding scheme is 1920×1080P/25Hz and the compression bit rate is 6 Mbps, the image quality of the evaluated object differs very little from that of the original material, and the difference is not easy to detect.
For the 8 standard-definition test sequences, the average score difference between the evaluated object and the original material is between 1.1 and 10.5, and the overall average score difference is 6.4. Except for sequence 2, the standard deviations of the other 7 test sequences are between 6.4 and 8.6, indicating that the spread of the evaluators' scores for these sequences is small.
The test results show that when the image format of the AVS video encoding/decoding scheme is 720×576I/50Hz and the compression bit rate is 2.5 Mbps, the difference in image quality between the decoded material and the original material can be detected, but the difference is small.
This test shows that when the AVS video bit rate is less than half (standard definition) or one-third (high definition) of the typical MPEG-2 bit rate, the quality loss is very small and meets broadcast requirements.
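The per-sequence figures quoted above (average score difference and standard deviation) can be computed from raw evaluator scores roughly as in the sketch below. The scores used here are hypothetical illustration data, not values from the test report, and the use of the sample standard deviation is an assumption, since the article does not say which estimator the laboratory used.

```python
import statistics


def dscqs_summary(reference_scores, processed_scores):
    """Return (mean difference, sample standard deviation) for one test sequence.

    Each list holds one score per evaluator on the same continuous quality scale;
    the difference is reference minus processed, so larger means more visible loss.
    """
    if len(reference_scores) != len(processed_scores):
        raise ValueError("each evaluator must score both versions")
    if len(reference_scores) < 2:
        raise ValueError("at least two evaluators are needed for a standard deviation")
    diffs = [r - p for r, p in zip(reference_scores, processed_scores)]
    return statistics.mean(diffs), statistics.stdev(diffs)


if __name__ == "__main__":
    # Hypothetical scores from three evaluators for a single sequence.
    mean_diff, spread = dscqs_summary([80, 70, 90], [75, 68, 84])
    print(f"mean difference = {mean_diff:.2f}, standard deviation = {spread:.2f}")
```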
3. AVS Subjective Test - Radio and Television Planning Institute of the State Administration of Radio, Film and Television
From April to September 2005, the Radio and Television Planning Institute of the State Administration of Radio, Film and Television was commissioned by the Institute of Computing Technology of the Chinese Academy of Sciences, the host organization of the AVS Working Group, to subjectively evaluate standard-definition and high-definition video encoded and decoded by the AVS reference software, assess the degree of damage to the source image quality, and produce the "Subjective Evaluation of AVS Video Compression Quality" test report (attached).
The test was based on the radio and television industry standard GY/T 134-1998 "Subjective Evaluation Method for Digital Television Image Quality", ITU-R BT.500-11, and ITU-R BT.1210-3 "Test materials to be used in subjective assessment". The standard-definition test used 6 international standard image sequences, and the high-definition test used 6 national standard image sequences.
The test results are summarized as follows:

Table 8 AVS subjective test results

| Video type | AVS test bit rate (Mbps) | Test result |
| --- | --- | --- |
| Standard definition (625/50i) | 3 | Excellent |
| Standard definition (625/50i) | 1.5 | Good |
| High definition (1125/50i) | 10 | Excellent |
| High definition (1125/50i) | 6 | Good to excellent |

Broadcasting with the MPEG-2 standard generally uses a bit rate of 20 Mbps for high-definition television and 5 to 6 Mbps for standard-definition television. The test results therefore show that at half of the current MPEG-2 bit rate, the AVS coding quality is excellent in both standard definition and high definition, and at less than one-third of it the quality is still good to excellent. In other words, at a coding efficiency 2 to 3 times that of MPEG-2, AVS video quality fully meets the "good" level required for large-scale deployment. Compared with the MPEG organization's test report on MPEG-4 AVC/H.264, AVS is at the same technical level in coding efficiency.
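The "half" and "less than one-third" statements follow directly from the bit rates in Table 8 and the typical MPEG-2 broadcast bit rates quoted above. The short sketch below just makes that arithmetic explicit; the dictionary and function names are hypothetical.

```python
# Typical MPEG-2 broadcast bit rates quoted in the text, as (low, high) in Mbps.
MPEG2_TYPICAL_MBPS = {"SD": (5.0, 6.0), "HD": (20.0, 20.0)}

# AVS bit rates tested in Table 8, in Mbps.
AVS_TESTED_MBPS = {"SD": (3.0, 1.5), "HD": (10.0, 6.0)}


def bitrate_ratios(kind):
    """Map each tested AVS bit rate to its (min, max) ratio against MPEG-2."""
    low, high = MPEG2_TYPICAL_MBPS[kind]
    return {avs: (avs / high, avs / low) for avs in AVS_TESTED_MBPS[kind]}


if __name__ == "__main__":
    for kind in ("SD", "HD"):
        for avs, (lo, hi) in bitrate_ratios(kind).items():
            print(f"{kind} {avs} Mbps: {lo:.2f} to {hi:.2f} of the MPEG-2 bit rate")
```

The higher tested rate in each format is about half of the MPEG-2 rate (0.5 to 0.6), and the lower tested rate is 0.25 to 0.3, which is below one-third.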
4. Objective test of AVS and MPEG standards
The common method for objectively evaluating video coding standards is the peak signal-to-noise ratio PSNR. Tables 9 and 10 respectively give the objective encoding performance of AVS and MPEG-2 standards and AVS and MPEG-4 AVC/H.264 standard main profile. The results are the gain of peak signal-to-noise ratio PSNR under the same bit rate conditions. It can be seen that the coding efficiency of AVS is 2.56dB higher than that of MPEG-2 standard on average, and is slightly lower than that of H.264 standard, with an average loss of 0.11dB.
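For reference, PSNR for 8-bit video is conventionally computed from the mean squared error between the original and decoded samples. The sketch below shows that standard definition on flat lists of sample values; the function name and the list-based interface are assumptions for illustration, and the article's reported figures were of course produced with the respective reference software, not with this code.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    original = [10, 20, 30, 40]
    decoded = [11, 19, 31, 39]  # every sample is off by one, so the MSE is 1
    print(f"{psnr(original, decoded):.2f} dB")
```

A PSNR gain at the same bit rate, as reported in the tables, is simply the difference between two such values measured for the two codecs.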

The following are experimental results comparing AVS and H.264 on another set of video sequences. The AVS video encoder used in the experiment is RM 5.0a, and the H.264 encoder is JM 6.1e. The test sequences include 720p and 1080i sequences. The encoding parameters are shown in Table 11. Table 12 shows the PSNR gain of the AVS video standard relative to H.264. Figures 3 to 6 show the PSNR curves.

Table 11 AVS and H.264 encoding parameters

| Parameter | JM 6.1e | RM 5.0a |
| --- | --- | --- |
| Entropy coding | CABAC | 2D-VLC |
| Rate-distortion optimization | Used | Used |
| Reference images | 2 frames | 2 frames |
| B-frames | 2 frames (IBBP) | 2 frames (IBBP) |
| Interlaced encoding | Macroblock-level frame/field adaptation | Picture-level frame/field adaptation |
| Motion compensation block size | 16×16 to 4×4 | 16×16 to 8×8 |
| Loop filter | Used | Used |

Figure 3 Experimental results of City sequence

Figure 4 Harbour sequence experimental results

Figure 5 Spincalendar sequence experimental results

Figure 6 Flamingo sequence experimental results

From the above data, it can be seen that in terms of progressive encoding, the performance of the AVS video standard is basically the same as that of H.264; in terms of interlaced encoding, since the AVS video standard currently only supports image-level frame/field adaptive encoding, there is an average performance gap of 0.5dB.

III. Complexity comparison
Table 13 briefly compares the computational complexity of AVS and H.264. It is roughly estimated that the decoding complexity of AVS is equivalent to 30% of H.264, and the encoding complexity of AVS is equivalent to 70% of H.264.
Table 13 Comparison of computational complexity of AVS and H.264

| Technology module | AVS Video | MPEG-4 AVC/H.264 video | Complexity analysis (AVS relative to H.264) |
| --- | --- | --- | --- |
| Intra-frame prediction | Based on 8×8 blocks, 5 luminance prediction modes, 4 chrominance prediction modes | Based on 4×4 blocks, 9 luminance prediction modes, 4 chrominance prediction modes | Reduced by about 50% |
| Multi-reference frame prediction | Up to 2 frames | Up to 16 frames, complex buffer management mechanism | Storage savings of more than 50% |
| Variable block size motion compensation | 16×16, 16×8, 8×16, 8×8 block motion search | 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4 block motion search | Saves 30-40% |
| B-frame macroblock symmetric mode | Only the forward motion vector is searched | Bidirectional search | Reduced by up to 50% |
| ¼ pixel motion compensation | 4-tap filtering at ½ pixel positions; 4-tap filtering and linear interpolation at ¼ pixel positions | 6-tap filtering at ½ pixel positions; linear interpolation at ¼ pixel positions | Memory access reduced by 1/3 |
| Transform and quantization | Normalization is moved from the decoder to the encoder, reducing decoding complexity | Normalization is required at both the encoder and the decoder | Decoder complexity is lower than H.264 |
| Entropy coding | Context-adaptive 2D-VLC; Exp-Golomb codes reduce computational and storage complexity | CAVLC: strongly dependent on surrounding blocks, more complex to implement. CABAC: hardware implementation is particularly complex | Reduced by more than 30% compared to CABAC |
| Loop filter | Based on 8×8 block edges; simple filter strength classification, fewer pixels filtered | Based on 4×4 block edges; many filter strength classes and many edges to filter | Reduced by 50% |
| Interlaced coding | PAFF: picture-level frame/field adaptation | MBAFF: macroblock-level frame/field adaptation | Reduced by 30% |
| Error-resilient coding | A simple slice mechanism is sufficient for the error concealment and recovery requirements of broadcast applications | Data partitioning, complex macroblock and slice organization mechanisms such as FMO/ASO, forced intra block refresh, constrained intra prediction, etc.; particularly complex to implement | Much lower than H.264 |
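The ¼ pixel motion compensation row of Table 13 contrasts a 4-tap and a 6-tap half-pixel filter. The sketch below illustrates the difference in one dimension. The 6-tap coefficients (1, -5, 20, 20, -5, 1)/32 are the H.264 half-sample luma filter; the 4-tap coefficients (-1, 5, 5, -1)/8 are the values commonly cited for AVS and should be treated as an assumption here, since this article does not list them. Border clamping and the function name are likewise illustrative choices.

```python
H264_HALF_TAPS = (1, -5, 20, 20, -5, 1)  # coefficients sum to 32
AVS_HALF_TAPS = (-1, 5, 5, -1)           # assumed; coefficients sum to 8


def half_pel(samples, pos, taps):
    """Interpolate the half-sample position between samples[pos] and samples[pos + 1]."""
    n = len(taps)
    start = pos - (n // 2 - 1)  # leftmost integer sample covered by the filter
    acc = 0
    for k, tap in enumerate(taps):
        idx = min(max(start + k, 0), len(samples) - 1)  # clamp at the borders
        acc += tap * samples[idx]
    total = sum(taps)              # power of two for both filters
    shift = total.bit_length() - 1
    value = (acc + total // 2) >> shift  # divide by the tap sum with rounding
    return min(max(value, 0), 255)       # clip to the 8-bit sample range


if __name__ == "__main__":
    ramp = [0, 10, 20, 30, 40, 50]
    print(half_pel(ramp, 2, H264_HALF_TAPS), half_pel(ramp, 2, AVS_HALF_TAPS))
```

On smooth content the two filters give the same result, while the 4-tap filter reads four integer samples per output instead of six, which is in line with the one-third reduction in memory access listed in the table.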

IV. Summary
The AVS video standard (GB/T 20090.2) is a standard based on China's independent innovation technology and international open technology. It is mainly aimed at high-definition and high-quality digital TV broadcasting, network TV, digital storage media and other related applications. It has the following characteristics: (1) High performance, the coding efficiency is more than twice that of MPEG-2, and is at the same level as the coding efficiency of H.264; (2) Low complexity, the algorithm complexity is significantly lower than that of H.264, and the software and hardware implementation costs are lower than H.264; (3) China controls the main intellectual property rights, and the patent licensing model is simple and low-cost. Based on this, we believe that the AVS standard is an important standard that can support the development of the country's digital audio and video industry.

References
[1] Information Technology Advanced Audio and Video Coding Part 2: Video. AVS N1165, 2005
[2] Huang Tiejun, Gao Wen. Background of AVS Standard Formulation and Intellectual Property Status. Television Technology. 2005, No. 7. P4-7
[3] Liang Fan, Siwei Ma, Feng Wu. Overview of AVS Video Standard. Proc. 2004 IEEE Intl. Conf. Multimedia & Expo., 2004: 423-426
[4] Liang Fan. Technical Characteristics of AVS Video Standard. Television Technology. 2005, No. 7
[5] ISO/IEC JTC1/SC29/WG11 (MPEG). N6231 Report of The Formal Verification Tests on AVC (ISO/IEC 14496-10 | ITU-T Rec. H.264). December 2003, Waikoloa
[6] National Radio and Television Product Quality Supervision and Inspection Center. AVS video encoding/decoding solution image quality subjective evaluation test report. December 2004
[7] State Administration of Radio, Film and Television, Radio and Television Planning Institute. AVS video compression quality subjective evaluation (AVS reference software version 5.2) test report. September 2005
