Analysis and implementation of TV video hardware decoding based on BT.656

Abstract: Following the ITU-R BT.656 television video coding principle, the valid data of each video frame is extracted by hardware decoding and passed to the player for display. The valid data is also filtered at a fixed ratio, without distorting the picture, so that the video is scaled down.

Decoding in hardware in this way both reduces the video size proportionally and lowers CPU usage. It suits TV functions in digital products with small display windows, such as mobile phones, and has broad market prospects.

As analog signals are increasingly digitized, digitized analog TV has come into wide use in digital multimedia terminals. Monitoring equipment and emerging mobile TV terminals in particular place requirements on video aspect ratio, clarity, and CPU usage, which motivates research into encoding, transmitting, decoding, and playing video with the lowest possible resource usage. Decoding the analog video signal is both an important part of video applications and the basis for later digital signal processing. The International Telecommunication Union has issued ITU-R BT.601 as the standard for converting analog video signals into digital signals; ITU-R BT.656 is a data transmission interface and can be regarded as a transmission method for BT.601. The key to video decoding technology is therefore to restore the original image from the BT.656-encoded signal as faithfully as possible while keeping CPU usage as low as possible.

1 System Analysis

Video decoding is the process of discarding non-video data from the data stream of an external signal source, extracting a complete frame of valid data, and passing it to the player for playback. It is an important part of video applications and the basis for data processing on the host computer. The system uses hardware decoding: the encoded signal from the external source passes through a series of processing steps that restore the original video image. It consists of an external signal source (such as a DVD player), a data decoding module, a data cache module, a data transmission module, a display module (player), and a control module. The system block diagram is shown in Figure 1:

Figure 1 TV system block diagram

A digital video source is produced by sampling, quantizing, and encoding an analog video signal. Different vertical resolutions come with different horizontal resolutions. After the source sends BT.656-encoded data, the data decoding module screens it and places the valid data in the data cache FIFO. The DMA transmission module then transfers the valid data, in a defined format, to a buffer in the host computer's memory. Once a full frame has arrived, the host computer interleaves the odd-field and even-field lines of that frame and writes the result to the player's input FIFO. Finally, the player restores the frame to the original image and shows it on the display.
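
The interleaving step performed by the host can be sketched as follows. This is a minimal illustration, not the article's implementation: the function name `weave_fields` is hypothetical, and which field supplies the top line of the frame is an assumption that depends on the television system in use.

```python
def weave_fields(first_field, second_field):
    """Interleave two fields line by line into one frame.

    Lines of first_field land on rows 0, 2, 4, ... and lines of
    second_field on rows 1, 3, 5, ... (assumed field order).
    """
    if len(first_field) != len(second_field):
        raise ValueError("both fields must have the same number of lines")
    frame = []
    for first_line, second_line in zip(first_field, second_field):
        frame.append(first_line)
        frame.append(second_line)
    return frame


# Tiny example: each "line" is just a label here.
print(weave_fields(["A0", "A1", "A2"], ["B0", "B1", "B2"]))
# ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
```

In a real system each line would be a buffer of pixel data rather than a label, but the row ordering is the same.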

2 BT.656 encoding principle

There are currently three analog television systems in use worldwide: PAL, NTSC, and SECAM. They are not directly compatible with one another.

For this reason, the International Telecommunication Union Radiocommunication Sector adopted the ITU-R BT.601 recommendation for component digital systems. ITU-R BT.601 is the "Studio Digital Television Coding Parameters" standard, while ITU-R BT.656 is the digital interface standard for signals at the 4:2:2 level of ITU-R BT.601 (Part A). It defines the digital transmission interface between digital video devices (including chips), using a 27 MHz parallel interface or a 243 Mbit/s serial interface. It covers the signal format, the bit-parallel interface characteristics, and the bit-serial interface characteristics common to the 525-line and 625-line systems. The interface provides a unidirectional connection between a single signal source and a single destination. A frame contains 525 or 625 lines, and each line is encoded as 8-bit (or 10-bit) words carrying three components: the video signal, the time base (timing reference) signal, and the ancillary signal. The data is delimited by the time base signals SAV and EAV, which mark the start and end of the active video in a line respectively. Each consists of 4 bytes, hexadecimal FF 00 00 XY: FF 00 00 is the flag shared by SAV and EAV, and XY carries the time base information. The encoding format is shown in Table 1:

Table 1 Time base signal encoding format

Bit 7, the most significant bit of XY, is fixed at 1. F identifies the field: F = 0 for the first field (called the even field here) and F = 1 for the second field (called the odd field here). V = 0 indicates that the line carries valid video data, and V = 1 that it does not. H = 0 indicates SAV and H = 1 indicates EAV. P3 to P0 are protection bits computed from F, V, and H: P3 = V XOR H; P2 = F XOR H; P1 = F XOR V; P0 = F XOR V XOR H. When V = 1 the line carries ancillary data or, if there is no ancillary data, blanking values, typically hexadecimal 80 and 10 alternating.

The player can display the image correctly only after the odd and even fields of a 625-line (PAL) or 525-line (NTSC) frame have been decoded and interleaved. By tracking the SAV and EAV time base signals, the decoder filters out the non-valid data, extracts the valid data of the even and odd fields, and places it in the host computer's buffer. The host then only has to interleave the received valid video data and write it to the FIFO at the player's front end to achieve playback.
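
The XY byte can be checked in a few lines of code. The sketch below is illustrative only: `parse_xy` is a hypothetical name, and it assumes the usual bit layout of the XY byte (F in bit 6, V in bit 5, H in bit 4, P3 to P0 in bits 3 to 0), with the protection bits computed from the XOR formulas given above.

```python
def parse_xy(xy):
    """Split the XY byte of a SAV/EAV code into its fields.

    Returns (f, v, h, ok). ok is True when bit 7 is 1 and the
    protection bits P3..P0 match the values computed from F, V and H.
    """
    f = (xy >> 6) & 1
    v = (xy >> 5) & 1
    h = (xy >> 4) & 1
    expected = (
        (1 << 7) | (f << 6) | (v << 5) | (h << 4)
        | ((v ^ h) << 3)        # P3 = V XOR H
        | ((f ^ h) << 2)        # P2 = F XOR H
        | ((f ^ v) << 1)        # P1 = F XOR V
        | (f ^ v ^ h)           # P0 = F XOR V XOR H
    )
    return f, v, h, xy == expected


print(parse_xy(0x80))  # (0, 0, 0, True)   SAV, valid video, F = 0
print(parse_xy(0x9D))  # (0, 0, 1, True)   EAV, valid video, F = 0
print(parse_xy(0x9C))  # (0, 0, 1, False)  protection bits do not match
```

Because the protection bits are fully determined by F, V, and H, only eight XY values are valid; a decoder can reject any other value as a transmission error.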

3 Hardware decoding and proportional reduction design implementation

3.1 Hardware decoding design

Common television formats include PAL, NTSC, and SECAM. For digital processing the signal must first go through A/D conversion, and the converted output is usually a digital video signal in ITU-R BT.656 format. At this stage the stream still contains synchronization and blanking signals, so it cannot be processed directly.

To obtain valid video data that can be processed, the ITU-R BT.656 stream must be decoded and Y, Cb, and Cr separated accurately. After the valid video has been processed, Y, Cb, and Cr are recombined with the synchronization and blanking signals. Video decoding is therefore a critical step in video processing. Depending on how the valid data is extracted, it can be done in hardware or in software. Software decoding is tightly coupled to the computer; it is convenient and flexible, but the large volume of data drives CPU usage up. Hardware decoding saves most of that CPU load, but one frame of data takes more than 800 kbytes, so interleaving the odd and even fields in hardware before sending the frame to the host is impractical. The host computer therefore performs the interleaving. The frame formats of the 525-line and 625-line systems are shown in Figure 2:
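
One way to strip the synchronization and blanking data is to scan the stream for FF 00 00 XY codes and keep only the bytes that follow an SAV with V = 0. The sketch below is a software model of that idea, not the article's hardware design; the name `extract_active_lines` is hypothetical, and it assumes an 8-bit stream in which the values 00 and FF occur only inside SAV and EAV codes.

```python
def extract_active_lines(stream):
    """Return a list of (f, payload) for every valid video line.

    A line is kept when its SAV code (H = 0) carries V = 0. The payload
    is everything between that SAV and the next FF 00 00 flag (the EAV).
    """
    data = bytes(stream)
    flag = b"\xff\x00\x00"
    lines = []
    pos = data.find(flag)
    while pos != -1 and pos + 3 < len(data):
        xy = data[pos + 3]
        f, v, h = (xy >> 6) & 1, (xy >> 5) & 1, (xy >> 4) & 1
        nxt = data.find(flag, pos + 4)   # start of the next SAV or EAV
        if h == 0 and v == 0 and nxt != -1:
            lines.append((f, data[pos + 4:nxt]))
        pos = nxt
    return lines


# SAV (XY = 80), four payload bytes, EAV (XY = 9D)
demo = bytes.fromhex("ff000080 11223344 ff00009d")
for f, payload in extract_active_lines(demo):
    print(f, payload.hex())   # 0 11223344
```

A hardware decoder would do the same job with a small state machine on the byte stream instead of a search over a buffer, but the selection rule (H = 0 and V = 0) is the same.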

Figure 2 BT.656's 525/60 and 625/50 frame data formats.
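
The figure of more than 800 kbytes per frame can be checked with a short calculation. The sketch assumes the 625-line system's 720 × 576 valid picture and 8-bit 4:2:2 sampling, which averages 2 bytes per pixel; the constant names are illustrative.

```python
ACTIVE_PIXELS_PER_LINE = 720   # 625-line (PAL) valid picture width
ACTIVE_LINES_PER_FRAME = 576   # 625-line (PAL) valid picture height
BYTES_PER_PIXEL = 2            # assumption: 8-bit 4:2:2 (Y plus shared Cb/Cr)

frame_bytes = ACTIVE_PIXELS_PER_LINE * ACTIVE_LINES_PER_FRAME * BYTES_PER_PIXEL
print(frame_bytes)                  # 829440
print(frame_bytes / 1024)           # 810.0
print(frame_bytes <= 1024 * 1024)   # True
```

Under these assumptions one frame is 810 KiB, which is consistent with the "more than 800 kbytes" estimate and fits in a 1 MiB buffer.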

The first line of each frame can be identified from changes in the F and V bits of the XY byte in the SAV and EAV codes (in the 525-line system FV changes from 10 to 11; in the 625-line system FV changes from 11 to 01). This point is marked as the frame head. A line counter is also maintained and incremented each time an EAV is detected. When the count reaches the number of lines in the frame format, the hardware signals the end of the frame to the host computer and clears the counter to wait for the next frame. Data normally arrives continuously, frame after frame, and the same procedure repeats. The host computer can therefore obtain each frame of valid data directly from what the hardware delivers: it starts writing data to memory when it sees the frame head and, when it sees the frame end, interleaves the frame and passes it to the player. A frame needs a little over 800 kB, so 1 MB of host memory is enough to hold one frame.
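
The frame-head detection and line counting described above can be modeled in software as follows. This is a hedged sketch rather than the hardware logic itself: the class name `FrameTracker`, the method `on_eav`, and the event strings are all hypothetical, and the FV transitions are the ones quoted in the text.

```python
# (previous FV, current FV) pairs that mark the first line of a frame.
FRAME_START = {525: ((1, 0), (1, 1)), 625: ((1, 1), (0, 1))}


class FrameTracker:
    """Counts lines per frame from the F and V bits seen at each EAV."""

    def __init__(self, system=625):
        self.total_lines = system        # 525 or 625 lines per frame
        self.start = FRAME_START[system]
        self.prev_fv = None
        self.count = 0                   # lines counted in the current frame
        self.in_frame = False

    def on_eav(self, f, v):
        """Feed the F and V bits of one EAV; return 'head', 'end' or None."""
        event = None
        fv = (f, v)
        if not self.in_frame and (self.prev_fv, fv) == self.start:
            self.in_frame = True         # frame head found
            self.count = 0
            event = "head"
        if self.in_frame:
            self.count += 1
            if self.count == self.total_lines:
                event = "end"            # tell the host the frame is complete
                self.in_frame = False
                self.count = 0
        self.prev_fv = fv
        return event
```

The tracker ignores lines until it sees the first frame-head transition, so a decoder that starts in the middle of a frame simply waits for the next complete one.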

3.2 Scaled-down design

To scale the video down proportionally, the valid data must be screened strictly according to the encoding format. For example, the PAL standard (625 lines per frame) has 576 lines of valid data with 720 pixels each, so the image format is 720 × 576.

To produce a 640 × 480 image, 80 pixels per line and 96 lines per frame must be dropped: 40 pixels at the start and 40 at the end of each line, 24 lines at the start and end of the odd field, and 24 lines at the start and end of the even field, as shown in Figure 3.
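
The 720 × 576 to 640 × 480 case can be sketched per field. The function name `crop_field` and its parameters are hypothetical, and the sketch works on whole pixels; on a raw 4:2:2 byte stream, 40 pixels correspond to 80 bytes, and dropping an even number of pixels keeps the chroma samples aligned.

```python
def crop_field(field, drop_lines=24, drop_pixels=40):
    """Drop lines at the top and bottom of a field and pixels at both
    ends of each remaining line."""
    kept_lines = field[drop_lines:len(field) - drop_lines]
    return [line[drop_pixels:len(line) - drop_pixels] for line in kept_lines]


# One PAL field: 288 lines of 720 pixels (placeholder pixel values).
field = [[0] * 720 for _ in range(288)]
cropped = crop_field(field)
print(len(cropped[0]), len(cropped))   # 640 240
```

Two cropped fields of 240 lines each interleave into the 480-line frame, so the total of 96 dropped lines matches the figure in the text.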

If the target image is to be half the original size or less, every other line and every other pixel must be dropped so that the reduced image stays as close as possible to the original. In this way, images of any required size ratio can be generated.
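
Dropping every other line and every other pixel can be illustrated as follows. The function name `decimate` is hypothetical, and the sketch treats each pixel as a single element; a real 4:2:2 stream would also need the shared Cb/Cr samples to be handled, which is not shown here.

```python
def decimate(frame, step=2):
    """Keep every step-th line and every step-th pixel of each kept line."""
    return [line[::step] for line in frame[::step]]


# A 4-line, 8-pixel test frame whose pixels record their own position.
frame = [[(row, col) for col in range(8)] for row in range(4)]
small = decimate(frame)
print(len(small), len(small[0]))   # 2 4
print(small[1][:2])                # [(2, 0), (2, 2)]
```

On an interleaved frame, keeping every other line amounts to keeping the lines of a single field.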

Figure 3 Frame data structure before and after processing.

3.3 Experimental results

In tests using a VHS TO DVD player as the source, on a PC with 1 GB of memory and a P4.0 CPU, software decoding of the BT.656 TV stream used 33% to 40% of the CPU, whereas hardware decoding under the same conditions used only 3% to 9%.

The CPU usage in the two cases is shown in Figures 4 and 5.

Figure 4: Software decoding CPU usage.

Figure 5: CPU usage of hardware decoding.

In the experiment, the image reduced by 2:1 remains clear. It preserves the look of the original image as far as possible while the amount of sampled data is halved. Figures 6 and 7 show the original image and the image reduced by 2:1, respectively.

Figure 6 Original picture effect.

Figure 7 The effect of reducing the image by 2:1.

4 Conclusion

This paper presents a hardware decoding scheme for BT.656 TV video. The difficulty lies in accurately screening the video data inside the hardware, keeping the valid data the player needs and discarding the rest. The advantage of the scheme is that it greatly reduces the load on the video processor. In addition, the hardware can be configured to scale the video data down proportionally, which meets the needs of small-screen digital products such as mobile TVs.
