Key Points for Developing Mobile TV Based on T-DMB-EEWORLD

This article gives an overview of the key points in hardware and software development for embedded handheld devices such as mobile TV receivers: hardware design, audio and video synchronization, improving the H.264 decoding rate, and preventing DMA buffer overflow.

　　Hardware Design

　　Hardware Design Overview

　　The hardware configuration must be chosen with the whole system in mind; the processing power of the CPU, for example, directly determines the quality of the decoded picture. A high-end general-purpose processor or a dedicated media processor gives better results but raises the hardware cost, so a compromise has to be struck between display quality and cost. Few chips can currently receive the T-DMB and DVB-H standards, so the choice is limited to a handful of mainstream parts. The product discussed in this article uses an S3C2440A CPU (400MHz), 64MB of SDRAM, an Apollo FS1110 RF tuner, and a Kino2 EFS1026 baseband processor, which basically meets the hardware requirements of mobile TV.

　　The RF signal received by the antenna is fed to the Apollo FS1110 tuner in the RF front end, which converts it to an IF (Intermediate Frequency) signal. This chip is a mainstream product that can receive multiple standards. It is small (5.0mm×5.0mm×0.9mm), has low power consumption (80mW), and has three low-noise front-end amplifiers covering L-Band, Band Ⅱ, and Band Ⅲ. The FS1110 passes the IF signal to the Kino2 EFS1026, which performs channel decoding and outputs MPEG2-TS data. Kino2 is a highly optimized baseband processor that is also small (10mm×10mm×1.3mm) and low-power (100mW). It supports the various DMB bit rates, up to 1.8Mbps, and contains an on-chip RS (Reed-Solomon) decoder, which improves performance on mobile channels. Kino2 sends the TS stream to the CPU, which demultiplexes, decodes, and displays it. The hardware design block diagram is shown in Figure 1.

　　Figure 1 Hardware design block diagram

　　Description of each hardware functional module

　　Mobile TV terminals must eventually support multiple standards and multiple frequency bands, which is what the market demands. The three regions where mobile TV services are currently deployed do not use the same band: Beijing and Guangdong use VHF Band Ⅲ, while Shanghai uses L-Band. A terminal that is to receive mobile TV services in different regions of the country therefore has to support several bands. The T-DMB system discussed in this article operates in VHF Band Ⅲ and L-Band, so the Band Ⅲ and L-Band inputs of the FS1110 are used; Band Ⅱ is mainly used for FM broadcasting. All three high-frequency inputs of the FS1110 are available, and the FS1026 selects the band through the IIC interface. The internal registers of the FS1110 are also initialized through this interface.

　　The downstream FS1026 baseband processor receives the IF signal from the RF tuner and performs channel decoding. It can output the MPEG2-TS data in either parallel or serial format, and the serial output can be connected directly to the SPI interface of the CPU. The baseband module can also exchange control information with the CPU through its SCP (Serial Control Port) interface, which is fully compatible with IIC, or through a serial port (UART). Because some commercial DMB programs are scrambled (encrypted) by the service provider, a smart card module is provided to perform descrambling.

　　The CPU receives TS data through the SPI interface, decodes the audio and video, and displays the result. The incoming data is buffered by DMA and then read from the DMA buffer for demultiplexing. DMA is a high-speed transfer mechanism that moves data directly between a peripheral and memory without passing through the CPU. The whole transfer is managed by the DMA controller; apart from some processing at the start and end of a transfer, the CPU is free to do other work. The CPU and the input/output therefore run in parallel most of the time, which greatly improves the efficiency of the whole system.

　　On the WinCE platform, DMA is convenient to use and the driver is not difficult to develop: reading data is much like reading an ordinary file. The one difference is that the DMA buffer must be protected against overflow. Reading an ordinary file is fully under the program's control, whereas the data here is a real-time stream, so the buffer can overflow (data is read too slowly) or underflow (data is read too quickly). The receive buffer of an MPEG decoder faces the same problem, because the data rate arriving at the decoder front end varies with the coding type of each picture. MPEG handles this with rate feedback control, which keeps the data rate reaching the audio and video decoders close to constant. Controlling the DMA buffer is simpler: a dedicated thread reads the data, and the demultiplexing thread discards some frames or slows down decoding according to the amount of buffered data. Even so, frame loss is still common.
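　　The buffering policy described above can be pictured as a bounded FIFO with a high-water mark. The following is only a minimal sketch in Python, not WinCE driver code: the class name TsRingBuffer, the capacity, and the 75% threshold are illustrative assumptions, and the locking between the reader thread and the demultiplexing thread is omitted.

```python
from collections import deque

TS_PACKET_SIZE = 188  # bytes per MPEG-2 TS packet


class TsRingBuffer:
    """Bounded FIFO of TS packets standing in for the DMA buffer (hypothetical)."""

    def __init__(self, capacity=64, high_water=0.75):
        self.capacity = capacity
        self.high_mark = int(capacity * high_water)
        self.packets = deque()
        self.dropped = 0  # packets lost to overflow

    def push(self, packet):
        """Called by the reader thread for each packet taken from DMA."""
        if len(packet) != TS_PACKET_SIZE:
            raise ValueError("expected a 188-byte TS packet")
        if len(self.packets) >= self.capacity:
            # Overflow: the consumer is too slow, so the oldest packet is lost.
            self.packets.popleft()
            self.dropped += 1
        self.packets.append(packet)

    def pop(self):
        """Called by the demultiplexing thread; None signals underflow."""
        return self.packets.popleft() if self.packets else None

    def should_skip_frames(self):
        """True when the backlog has reached the high-water mark."""
        return len(self.packets) >= self.high_mark
```

　　In this sketch the demultiplexing thread would poll should_skip_frames() before decoding each frame and treat a None result from pop() as a cue to wait, which corresponds to slowing down decoding when the buffer runs low.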

　　Hardware Design Considerations

　　The main problems in designing the hardware circuit are high-frequency interference and electromagnetic compatibility. The usual remedy is a shielding cover: fitting one over the Apollo FS1110 reduces the radiated interference affecting the module. The Apollo FS1110 and Kino2 EFS1026 can also be built as an external module. High-frequency effects can be reduced further by careful schematic design, because the quality of the schematic directly affects how difficult layout and routing will be and how well the finished board performs. The digital, analog, and RF circuits should be kept separate in the schematic so that they can be placed in distinct partitions during layout and routing, which reduces interference between functional modules. Even so, because handheld devices are so small, a shielding cover is generally indispensable.

　　Software Design

　　Overview of T-DMB Standard

　　T-DMB uses the H.264 video compression standard. The audio uses MPEG-4 BSAC (Bit-Sliced Arithmetic Coding) or AAC+ (adopted by European T-DMB), codecs with lower patent fees, and the image format is CIF (Common Intermediate Format, 352×288). User data is added to these audio and video streams, which are then packetized by the MPEG-4 SL (Sync Layer), multiplexed into an MPEG-2 TS (Transport Stream), and passed to the modulator, which converts them into a signal suitable for transmission over the channel. Receivers for the various standards differ significantly in channel decoding, but their source decoding is very similar. The structure of the encoder at the transmitter of the T-DMB system is shown in Figure 2.

　　Figure 2 T-DMB transmitter coding block diagram

　　The MPEG-4 OD/BIFS generator produces the audio-visual objects, the spatiotemporal scene information, and the audio-visual object descriptors. The IOD generator produces the initial information for the audio-visual objects: the scene description and the object description. The section generator collects the SL packet (SLP) and IOD information and builds the PSI (Program Specific Information) that the receiver needs for demultiplexing. In a T-DMB data stream, the IOD_descriptor is obtained by parsing the descriptor field of the PMT; the scene and object descriptions are obtained from the IOD_descriptor, and the object description in turn yields information such as the ES_descriptor. The SL packetizer is responsible for synchronizing the audio-visual objects and auxiliary data. Each SL packet is wrapped in a PES packet, and the PES packets are then split into TS packets and sent to the modulator.
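　　As a rough illustration of the PMT lookup described above, the sketch below walks the program-level descriptor loop of a PMT section and returns the body of the IOD_descriptor. It assumes a complete section that starts at table_id, does not check the CRC, and leaves the descriptor body unparsed. The function names are hypothetical; the tag value 0x1D is the IOD_descriptor tag assigned in the MPEG-2 systems standard.

```python
IOD_DESCRIPTOR_TAG = 0x1D  # IOD_descriptor tag in ISO/IEC 13818-1


def program_descriptors(pmt_section):
    """Yield (tag, body) for each descriptor in the PMT program_info loop."""
    if pmt_section[0] != 0x02:
        raise ValueError("table_id 0x02 expected for a PMT")
    # program_info_length is the low 12 bits of bytes 10 and 11.
    info_length = ((pmt_section[10] & 0x0F) << 8) | pmt_section[11]
    pos, end = 12, 12 + info_length
    while pos + 2 <= end:
        tag, length = pmt_section[pos], pmt_section[pos + 1]
        yield tag, pmt_section[pos + 2 : pos + 2 + length]
        pos += 2 + length


def find_iod_descriptor(pmt_section):
    """Return the raw IOD_descriptor body, or None if the PMT has none."""
    for tag, body in program_descriptors(pmt_section):
        if tag == IOD_DESCRIPTOR_TAG:
            return body
    return None
```

　　The returned bytes would then be handed to an MPEG-4 object descriptor parser to reach the scene description and the ES_descriptor entries; that step is outside the scope of this sketch.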

　　Functional description of the software

　　The main task of the software is to demultiplex the TS stream and decode H.264 and AAC+. It is built on Microsoft's DirectShow technology, which reduces both the difficulty and the duration of development. DirectShow is a COM-based multimedia development kit for the Windows platform. It manages the whole stream-processing pipeline with the Filter Graph model. The functional modules in the pipeline are called Filters and fall into three categories: Source, Transform, and Rendering Filters. A Source Filter acquires and pre-processes data; a Transform Filter converts and passes on data, and here mainly performs decoding; a Rendering Filter handles display. Filters interact with the application through an event notification mechanism: when the state of a Filter changes, it issues an event, which is either handled by the Filter Graph Manager or forwarded to the application.

　　The software is divided into five major functional modules, as shown in Figure 3. The TS demultiplexer module is a Source Filter. It reads data from the DMA buffer and parses the PAT (Program Association Table) and PMT (Program Map Table) from the TS stream. Once it has the PIDs (Packet Identifiers) of the TS packets that carry the audio and video of the selected program, it reassembles the PES (Packetized Elementary Stream) packets and extracts the parameters needed for audio and video synchronization: PCR (Program Clock Reference), PTS (Presentation Time Stamp), and DTS (Decoding Time Stamp). Finally, it strips the PES headers and sends the ES (Elementary Stream) data to the downstream decoding filters.

　　The H.264 and AAC+ decoding modules are Transform Filters. They decode the audio and video data received from upstream, reorder the decoded PUs (Presentation Units) when bidirectional prediction is used, and send them to the downstream renderers. The video renderer and audio renderer modules are Rendering Filters and perform the final output. If the data format needs to be converted, a Transform Filter that performs the conversion can be inserted between a decoder and its renderer.
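　　The first steps of the TS demultiplexer can be sketched as follows. This is an illustrative Python sketch rather than the DirectShow filter itself: the function names are hypothetical, the PAT section is assumed to be complete and to start at table_id, and the CRC is not verified.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def parse_ts_header(packet):
    """Decode the fixed 4-byte header of one TS packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid TS packet")
    return {
        "payload_unit_start": bool(packet[1] & 0x40),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "adaptation_field_control": (packet[3] >> 4) & 0x03,
        "continuity_counter": packet[3] & 0x0F,
    }


def parse_pat(section):
    """Map program_number -> PMT PID from a PAT section."""
    if section[0] != 0x00:
        raise ValueError("table_id 0x00 expected for a PAT")
    section_length = ((section[1] & 0x0F) << 8) | section[2]
    end = 3 + section_length - 4  # stop before the 4-byte CRC_32
    programs = {}
    for i in range(8, end, 4):
        program_number = (section[i] << 8) | section[i + 1]
        pid = ((section[i + 2] & 0x1F) << 8) | section[i + 3]
        if program_number != 0:  # program 0 carries the network PID instead
            programs[program_number] = pid
    return programs
```

　　A demultiplexer built along these lines would first collect packets with PID 0 to obtain the PAT, then read the PMT on the PID that parse_pat returns, and only after that filter the audio and video PIDs for PES reassembly.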

　　Synchronization of audio and video

　　The key problem in the software design is audio and video synchronization, which is handled mainly in the TS demultiplexer. Synchronization requires three parameters: PCR, DTS, and PTS. PCR is obtained from the adaptation field of the TS packet, and PTS from the PES packet header. The payload of the PES packet is an SL packet, and DTS is obtained from the SL packet header. DTS is the decoding time and PTS is the display time. PCR is a sample of the encoder's system clock, whose base part counts in 90kHz units; it sets the initial value of the decoder's clock counter when the decoder switches programs. PTS and DTS are expressed on the same time base as PCR, so together they give the decoder a common clock reference for synchronizing audio and video accurately. When the decoder clock derived from PCR reaches the DTS of a unit, that unit can be decoded. Because video coding uses bidirectional prediction, a picture is not necessarily displayed as soon as it is decoded: it may stay in memory for a while as a reference for other pictures and be displayed only afterwards. Audio does not use bidirectional prediction, so its decoding order is its display order; MPEG therefore defines only a PTS for audio, and that PTS also serves as the audio DTS. That is:

　　DTS=PTS (1)

　　If PTS is not available, calculate as follows:

　　PTS=PTS_pre +Xms (2)

　　Here PTS_pre is the PTS of the previous AU (Access Unit), and X is the duration of one AAC+ frame in ms.
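　　To make the timestamp handling concrete, the sketch below decodes a PTS from the 5-byte field of a PES header, reads a PCR from the adaptation field of a TS packet, and applies equation (2) when an audio unit arrives without a PTS. The function names are hypothetical, and the frame duration X is left as a parameter because it depends on the audio configuration. PTS values are in 90kHz ticks, and the full PCR is returned in 27MHz ticks.

```python
PTS_MODULO = 1 << 33  # PTS, DTS, and the PCR base are 33-bit counters


def decode_pts(field):
    """Decode the 5-byte PTS field of a PES header into 90 kHz ticks."""
    return (
        (((field[0] >> 1) & 0x07) << 30)
        | (field[1] << 22)
        | ((field[2] >> 1) << 15)
        | (field[3] << 7)
        | (field[4] >> 1)
    )


def extract_pcr(packet):
    """Return the PCR of a TS packet in 27 MHz ticks, or None if it has none."""
    has_adaptation_field = bool(packet[3] & 0x20)
    # A PCR needs at least 7 adaptation bytes (flags + 6 PCR bytes) and PCR_flag.
    if not has_adaptation_field or packet[4] < 7 or not (packet[5] & 0x10):
        return None
    b = packet[6:12]
    base = (b[0] << 25) | (b[1] << 17) | (b[2] << 9) | (b[3] << 1) | (b[4] >> 7)
    extension = ((b[4] & 0x01) << 8) | b[5]
    return base * 300 + extension


def next_audio_pts(previous_pts, frame_ms):
    """Equation (2): extrapolate from the previous AU when no PTS is present."""
    return (previous_pts + round(frame_ms * 90)) % PTS_MODULO
```

　　The modulo in next_audio_pts reflects the 33-bit width of the counter, so an extrapolated value wraps around in the same way as a transmitted one.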

　　Video objects are generally coded as one of three types: I-VOP, B-VOP, and P-VOP. Assume that the VOPs (Video Object Planes) are captured and displayed in the following order:

　　1 2 3 4 5 6 7 8 9 10………

　　I B B P B B P B B P B B P B B I B B P........

　　Since bidirectional prediction is used when encoding video objects, the actual decoding order of the decoder is:

　　I P B B P B B P B B P B B I B B P B B ........
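　　The reordering can be reproduced with a few lines of code. In this sketch (the function name is hypothetical), every B frame is held back until the next I or P frame, which it depends on, has been emitted.

```python
def decode_order(display_types):
    """Return (display_index, type) pairs in decoding order; indices are 1-based."""
    ordered, pending_b = [], []
    for index, frame_type in enumerate(display_types, start=1):
        if frame_type == "B":
            # A B frame needs the following I or P frame as a reference.
            pending_b.append((index, frame_type))
        else:
            ordered.append((index, frame_type))
            ordered.extend(pending_b)
            pending_b.clear()
    ordered.extend(pending_b)  # trailing B frames with no later reference frame
    return ordered
```

　　Applied to the first sequence above, the function yields the second one: frame 4 (P) moves ahead of frames 2 and 3 (B), frame 7 ahead of frames 5 and 6, and so on.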

　　The display order is the original input order given in the first sequence above. Assuming that the PTS and DTS of the I frame are known, the values for the P frame (frame 4) are:

　　PTS_P4=PTS_I +33.67ms * 3 (3)

　　DTS_P4=DTS_I +33.67ms (4)

　　The first B frame (frame 2) is displayed one frame interval after the I frame and decoded two intervals after it, following the P frame:

　　PTS_B2=PTS_I +33.67ms (5)

　　DTS_B2=DTS_I +33.67ms * 2 (6)

　　The second B frame (frame 3) is handled in the same way, with both values one frame interval later. Here 33.67ms is the video frame interval.
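　　The frame-interval multipliers in formulas of this kind can be derived mechanically from the two orderings. The sketch below assumes that the decoder decodes exactly one picture per frame interval and that the display also advances by one picture per interval; the function names are hypothetical, and the offsets are measured from the PTS and DTS of the first frame.

```python
def timestamp_offsets(display_types):
    """Map 1-based display index -> (pts_offset, dts_offset) in frame intervals."""
    decode_sequence, pending_b = [], []
    for index, frame_type in enumerate(display_types, start=1):
        if frame_type == "B":
            pending_b.append(index)  # wait for the next I or P frame
        else:
            decode_sequence.append(index)
            decode_sequence.extend(pending_b)
            pending_b = []
    decode_sequence.extend(pending_b)
    # PTS follows the display position, DTS follows the decoding position.
    return {
        index: (index - 1, position)
        for position, index in enumerate(decode_sequence)
    }


def to_ticks(frame_intervals, frame_ms, clock_hz=90_000):
    """Convert a number of frame intervals to ticks of the 90 kHz clock."""
    return round(frame_intervals * frame_ms * clock_hz / 1000)
```

　　For the group I B B P, frame 4 comes out with offsets of three intervals for PTS and one interval for DTS, which matches equations (3) and (4).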

　　Software Development Considerations

　　The first consideration is H.264 decoding efficiency. The software decoder is based on the H.264 decoder from the open source FFmpeg project, which is efficient and easy to port. In FFmpeg, key operations such as the IDCT and motion compensation are already implemented in assembly for several platforms. When the H.264 decoder is ported to the ARM platform, the IDCT and motion compensation assembly code can be written by following the code for those other platforms, so the development effort is modest. For audio decoding, the FAAC and FAAD open source projects can serve as references.

　　Conclusion

　　This article has discussed the hardware and software design of an embedded handheld device that receives mobile TV signals complying with the T-DMB specification (receiving terminals for the other standards differ only slightly). The device lets users receive digital TV signals directly, without going through the mobile communication network, and so meets the demand for information anytime and anywhere. In actual development, the main hardware problem is electromagnetic compatibility, and the main software problems are audio and video synchronization and H.264 decoding efficiency. The difficulty of the software development is concentrated in MPEG-2 TS demultiplexing and the design of the DirectShow application framework.
