Analysis and Implementation of Embedded DVR Based on MPEG-4-EEWORLD

1. Current status of DVR development and advantages of MPEG-4 in embedded DVR

1.1 Development of DVR
DVR development has generally gone through two stages: PC-based DVR and embedded DVR. A traditional PC-based DVR runs a Windows operating system stored on a hard disk. This open system offers advantages such as a user-friendly graphical user interface (GUI), but it also suffers from the inherent instability of Windows and limited CPU support. An embedded DVR uses a real-time operating system (RTOS) that can be embedded in ROM or Flash memory. It covers a wide control area, can form very complex monitoring networks, and is stable and reliable, which makes up for the shortcomings of the PC-based DVR.

In an embedded DVR, video compression is the key technology. The mainstream compression technology used by DVRs is MPEG-1, whose core is the discrete cosine transform and bidirectional motion compensation. The main idea is to reduce the amount of data by removing temporal and spatial redundancy and correlation between images. MPEG-1 achieves good image clarity at transmission rates of 800 kbps to 2 Mbps. However, embedded DVRs that use MPEG-1 have several disadvantages: ① high hard disk consumption; ② a data volume too large for network transmission; ③ poor flexibility and adaptability, since the transmission rate cannot be adjusted to network conditions. MPEG-4 overcomes these shortcomings and is the development trend for modern embedded DVRs.
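To make the hard disk consumption concrete, the following minimal sketch converts a constant bit rate into the bytes needed to record one channel. The function name `recording_bytes` and the convention 1 kbps = 1000 bit/s are assumptions made for illustration; they do not come from the original design.

```c
#include <stdint.h>

/* Bytes needed to record one channel at a constant bit rate.
 * kbps:    stream bit rate in kilobits per second
 *          (1 kbps = 1000 bit/s is assumed here).
 * seconds: recording duration in seconds. */
uint64_t recording_bytes(uint32_t kbps, uint32_t seconds)
{
    /* kbps * 1000 / 8 gives bytes per second; widen first to avoid overflow. */
    return (uint64_t)kbps * 1000u / 8u * seconds;
}
```

Under these assumptions, one day (86400 s) of recording takes about 8.64 GB per channel at 800 kbps and about 21.6 GB at 2 Mbps.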

1.2 Advantages of MPEG-4 in embedded DVRs
MPEG-4 is far superior to MPEG-1 in its compression methods. MPEG-4 is based on scene description and a bandwidth-oriented design, which gives it great advantages in video surveillance recording in the following respects:
(1) Saving storage space.
(2) Higher video resolution. Although MPEG-4 is an audio/video solution aimed at low bandwidth, its compression method is also suitable for video at CIF or higher resolutions (768 × 576, 640 × 480). It thus breaks through the CIF (352 × 288) maximum resolution limit of MPEG-1 and obtains better video compression quality.
(3) Adjustable video frame rate.
(4) Better suited to network transmission than MPEG-1. The superior compression method of MPEG-4 also accounts for its excellent performance on low-bandwidth networks.

2. Application of MPEG-4 core ideas and key technologies in DVR

MPEG-4 represents the second generation of compression coding technology, which is model- and object-based. It exploits the visual characteristics of the human eye and, working from contour and texture, supports interactive functions based on visual content. It fits the trend of multimedia applications moving from simple playback toward content-based access, retrieval and manipulation, and is particularly well suited to embedded DVRs.

2.1 MPEG-4 core ideas - object-based coding
The AV object (Audio Visual Object) is an important concept introduced by MPEG-4. It is an entity in a scene that can be accessed and manipulated. Specifically, in an image, an object is a group of regions that represents a meaningful entity. Objects can be divided according to their texture, motion, shape, model and high-level semantics. MPEG-4 coding is based on AV objects. AV objects are the representation units of audio-visual content, and their basic unit is the primitive AV object, which can be a natural or synthetic sound or image. Object-based coding is flexible in both the temporal and spatial domains, and bits can be allocated according to the importance of each object so as to preserve the subjective quality of the image. MPEG-4 therefore offers efficient coding, efficient storage and transmission, and interoperability. AV object coding is the core idea of MPEG-4. In a DVR, this efficient compression saves a large amount of storage, and the stream can be flexibly scaled to the bandwidth and bit error rate on site so that the available bandwidth is fully used. The benefits of AV object-based coding become clearer in the analysis of the key MPEG-4 technologies below.

2.2 Key technologies of MPEG-4
(1) Video object extraction technology. MPEG-4 implements content-based coding by first dividing the video and image into static objects or moving objects, and then adopting corresponding coding methods for different objects to achieve efficient compression. The general steps of video object segmentation are: ① Simplify the original video/image data through low-pass filtering, median filtering, morphological filtering, etc. to facilitate segmentation; ② Extract features such as color, texture, motion, frame difference, displacement frame difference and semantics from the video/image data; ③ Determine the segmentation decision based on a certain uniformity index; ④ Perform relevant post-processing to filter out noise and accurately extract the boundary.
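Steps ② and ③ above can be sketched with simple frame differencing against a stored background, which suits a fixed surveillance camera. This is only one possible segmentation criterion, not necessarily the one used by the authors; the function name, the flat luma buffers and the threshold parameter are hypothetical, and the filtering and post-processing of steps ① and ④ are omitted.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Mark pixels whose absolute difference from the background exceeds
 * `threshold` as moving (1) and all others as static background (0).
 * All buffers hold `n` 8-bit luma samples.
 * Returns the number of pixels marked as moving. */
size_t segment_moving_pixels(const uint8_t *frame, const uint8_t *background,
                             uint8_t *mask, size_t n, int threshold)
{
    size_t moving = 0;
    for (size_t i = 0; i < n; ++i) {
        int diff = abs((int)frame[i] - (int)background[i]);
        mask[i] = (uint8_t)(diff > threshold);
        moving += mask[i];
    }
    return moving;
}
```

A real encoder would go on to group the marked pixels into connected regions before treating them as video objects.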

When MPEG-4 is used in an embedded DVR, video object extraction is much simpler than in other applications. The video in an embedded DVR rarely contains very complex content; it can basically be divided into one static object (the background) and several moving objects.

(2) VOP (Video Object Plane) encoding technology. A VOP is a sample of a video object (VO) at a given moment and is one of the core concepts of MPEG-4 video encoding. MPEG-4 applies different encoding strategies to different VOs: the foreground VO is compressed so as to retain as much detail and smoothness as possible, while the background VO is encoded at a high compression rate, or is not transmitted at all and is instead composed from other backgrounds at the decoder. This object-based video coding not only avoids the blocking artifacts caused by high-compression coding in first-generation video coding, but also lets users interact with the scene. It thereby improves the compression ratio, enables content-based interaction, and leaves broad room for the development of video coding. Moreover, MPEG-4 supports the encoding and decoding of images and video of arbitrary shape. For video objects of arbitrary shape, MPEG-4 uses the VLBV (Very Low Bit rate Video) core for encoding.

Embedded DVR applications share one prominent feature: the background object is stationary or rarely moves. The background object can therefore be encoded at a high compression rate, or appear in fewer frames, as the situation allows, which greatly reduces the pressure on storage and transmission. This is the idea behind Sprite encoding in MPEG-4. A Sprite, also called a mosaic or background panorama, is an image composed of all the parts of a video object that appear in a video sequence. With Sprite encoding, the background Sprite is first sent to the decoder, which generates the background image from it. After that, only a small number of parameters need to be transmitted, which reduces the amount of data over the whole transmission and achieves a high compression ratio. To reduce the delay in delivering Sprite images, the hierarchical transmission function of MPEG-4 can be used to transmit them in layers and blocks.
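The decoder side of this idea can be illustrated for the simplest case, a pure translation: once the sprite has been received, the background of each frame is a window cut out of it, so only the window position has to be sent per frame. The sketch below is an illustration under that assumption, not the MPEG-4 warping procedure; `sprite_window` and its parameters are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Copy a width x height window out of a larger background sprite.
 * (left, top) is the per-frame translation, assumed here to be the only
 * data transmitted after the sprite itself.
 * Returns 0 on success, -1 if the window falls outside the sprite. */
int sprite_window(const uint8_t *sprite, int sprite_w, int sprite_h,
                  int left, int top,
                  uint8_t *out, int width, int height)
{
    if (left < 0 || top < 0 || width <= 0 || height <= 0 ||
        left + width > sprite_w || top + height > sprite_h)
        return -1;

    /* Copy the window one row at a time. */
    for (int y = 0; y < height; ++y)
        memcpy(out + (size_t)y * width,
               sprite + (size_t)(top + y) * sprite_w + left,
               (size_t)width);
    return 0;
}
```

For a fixed camera the window position never changes, which matches the zero-motion case described for DVR use.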

(3) Scalable coding technology. Video scalability refers to the adjustability of the bit rate: the video data is compressed only once but can be decoded at different frame rates, spatial resolutions or video qualities as circumstances require, so as to support a variety of applications. Scalable coding is divided into spatial scalability and temporal scalability. MPEG-4 implements layered coding through the video object layer (VOL) data structure. In scalable coding, the video sequence is divided into two layers: the base layer and the enhancement layer. The base layer provides the basic information of the video sequence, and the enhancement layer adds higher resolution and detail. The embedded DVR adopts the object-based layered transmission concept of MPEG-4 and adjusts spatial resolution and frame rate using spatial and temporal scalability. On the one hand, this makes bit rate control easy to implement, so the stream can adapt well to changes in network bandwidth; on the other hand, it enables interaction between the user and the DVR, since the user can select the resolution and frame rate that give a satisfactory video.
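The bit rate control enabled by layering can be sketched as a decision about which layers to forward for the bandwidth currently available. The structure, the function name and the idea of fixed per-layer bit rates are assumptions made for illustration; the source does not describe its rate control in this detail.

```c
/* Hypothetical per-layer bit rates, in kbps. */
typedef struct {
    int base_kbps;         /* base layer: always required for decoding */
    int enhancement_kbps;  /* enhancement layer: optional refinement */
} LayerRates;

/* Returns 0 if even the base layer does not fit,
 * 1 if only the base layer fits,
 * 2 if base plus enhancement layer fit. */
int layers_to_send(LayerRates r, int available_kbps)
{
    if (available_kbps < r.base_kbps)
        return 0;
    if (available_kbps < r.base_kbps + r.enhancement_kbps)
        return 1;
    return 2;
}
```

Because the video is encoded only once, switching between these cases needs no re-encoding; the sender simply stops or resumes forwarding the enhancement layer.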

(4) Motion estimation and motion compensation technology. MPEG-4 uses three frame types, I-frames, P-frames and B-frames, which differ in their use of motion compensation. An I-frame uses intra-frame compression only: it exploits spatial correlation, uses no motion compensation, and therefore does not depend on other frames and serves as the reference frame for decoding. I-frames appear periodically in the image sequence. P-frames and B-frames use inter-frame coding and exploit both spatial and temporal correlation. P-frames use forward temporal prediction to improve compression efficiency and image quality, and B-frames use bidirectional temporal prediction to increase the compression ratio further, but B-frames cannot serve as reference frames for encoding other images and cannot be accessed randomly. Therefore, in an embedded DVR, when the user's available bandwidth is relatively low, only I-frames and as few P-frames as possible need be forwarded, which reduces the transmission bit rate while keeping acceptable video quality.
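The low-bandwidth forwarding rule can be sketched as a filter over the frame types of a sequence. This is a minimal illustration, assuming each P-frame references the previous I- or P-frame; the names `FrameType`, `select_frames` and the `max_p` budget are hypothetical.

```c
#include <stddef.h>

typedef enum { FRAME_I, FRAME_P, FRAME_B } FrameType;

/* Decide which frames to forward on a constrained link.
 * I-frames are always kept. After each I-frame, at most `max_p` of the
 * P-frames that follow are kept; later P-frames in the same group are
 * dropped because their reference frame would be missing at the decoder.
 * B-frames are never kept. keep[i] is set to 1 for forwarded frames.
 * Returns the number of frames kept. */
size_t select_frames(const FrameType *types, size_t n, int max_p,
                     unsigned char *keep)
{
    size_t kept = 0;
    int have_ref = 0;  /* an I-frame has been forwarded */
    int p_sent = 0;    /* P-frames forwarded since the last I-frame */

    for (size_t i = 0; i < n; ++i) {
        keep[i] = 0;
        if (types[i] == FRAME_I) {
            keep[i] = 1;
            have_ref = 1;
            p_sent = 0;
        } else if (types[i] == FRAME_P && have_ref && p_sent < max_p) {
            keep[i] = 1;
            ++p_sent;
        }
        kept += keep[i];
    }
    return kept;
}
```

Setting `max_p` to 0 forwards I-frames only, the lowest bit rate this scheme can reach.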

3. Implementation in embedded DVR based on MPEG-4

3.1 Hardware implementation
The hardware implementation of embedded DVR based on MPEG-4 is shown in Figure 1. The following scheme is adopted:
(1) The DSP is the TI TMS320C6416, with an operating frequency of 750 MHz and a computing speed of 4000 MIPS.
(2) The ARM board is an ARM embedded system expansion board (CY-ARM4510B).
(3) A PCI interface board (CY-PCI2.2).

Figure 1 Hardware block diagram

3.2 Software Implementation
The software is written in ANSI C and built with GNU Make 3.74 or later and the GCC compiler. The MPEG-4 codec is implemented within the framework provided by MPEG-4, simplified and optimized for the embedded DVR application. The code is optimized mainly in the following three respects:
(1) Reduce the amount of code as much as possible and make the data structures more efficient, which means removing structures and processing parts that are hardly used in DVR applications.
(2) Make execution control more reasonable, reduce unnecessary memory allocation and release, and minimize access to external memory.
(3) Improve ME/MC (fast motion estimation and motion compensation) to raise encoding efficiency.
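As an illustration of point (3), one common way to speed up block matching is to abandon a candidate as soon as its running sum of absolute differences (SAD) can no longer beat the best match found so far. The sketch below applies that early exit to a plain window search; it is not the authors' ME/MC code, and the block size, function names and frame layout (8-bit luma, row stride equal to the frame width) are assumptions.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK 8  /* block size used in this sketch (an assumption) */

/* SAD between two BLOCK x BLOCK blocks, abandoning the candidate as soon
 * as the running sum reaches `best_so_far`. */
static int sad_block(const uint8_t *cur, const uint8_t *ref, int stride,
                     int best_so_far)
{
    int sad = 0;
    for (int y = 0; y < BLOCK; ++y) {
        for (int x = 0; x < BLOCK; ++x)
            sad += abs((int)cur[y * stride + x] - (int)ref[y * stride + x]);
        if (sad >= best_so_far)
            return sad;  /* early exit: cannot beat the current best */
    }
    return sad;
}

/* Search a +/-range window around the block at (bx, by) in the current
 * frame. Returns the best SAD; the displacement is stored in *mvx, *mvy. */
int motion_search(const uint8_t *cur, const uint8_t *ref,
                  int width, int height, int bx, int by, int range,
                  int *mvx, int *mvy)
{
    int best = INT_MAX;
    *mvx = 0;
    *mvy = 0;
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            int rx = bx + dx, ry = by + dy;
            if (rx < 0 || ry < 0 || rx + BLOCK > width || ry + BLOCK > height)
                continue;  /* candidate would leave the reference frame */
            int sad = sad_block(cur + by * width + bx,
                                ref + ry * width + rx, width, best);
            if (sad < best) {
                best = sad;
                *mvx = dx;
                *mvy = dy;
            }
        }
    }
    return best;
}
```

The early exit keeps the result identical to an exhaustive search while skipping most of the arithmetic for poor candidates, which also reduces memory traffic in line with point (2).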

The program is divided into three parts: the common part of encoding and decoding, the encoding part and the decoding part. According to the optimization ideas mentioned above, we optimized the program. After optimization, the structure of VOP has been greatly changed. The basic syntax elements are retained and the Sprite is simplified. The definition is as follows:
struct vop
{
    /* VOP syntax elements */
    Int prediction_type;            /* VOP type */
    Int mod_time_base;              /* VOP absolute base time */
    Float time_inc;                 /* VOP time relative to mod_time_base */
    Int vop_coded;
    Int rounding_type;
    Int width;                      /* VOP width */
    Int height;                     /* VOP height */
    .
    /* Motion estimation elements */
    .
    /* Some sprite coding elements passed from VOL */
    Int sprite_hdim;
    Int sprite_vdim;
    Int sprite_left_edge;
    Int sprite_top_edge;
    Int warping_accuracy;           /* Warping accuracy (2, 4, 8, 16) */
    Int sprite_usage;               /* 0:not used; 1:static; */
    Int no_of_sprite_points;
    /* 0:fixed, 1:translation, 2:rotation, scaling, 3:affine, 4:perspective */
    TrajPoint *ref_point_coord;
    /* position of reference points in the sprite */
    TrajPoint *traj_point_coord;
    /* position of reference points after motion compensation */
    TrajPoint *difftraj_point_coord;
    /* (dui, dvj) trajectory coordinates, to be transmitted */
    Int brightness_change_in_sprite;
    Float brightness_change_factor;
    Int low_latency_sprite_enable;
    /* 0:basic sprite, 1:saved sprite */
    struct vop *rec_sprite;         /* current decoded sprite pointer */
    Sprite_motion *warp_param;      /* Global motion vector */
    .......
};
In the VOP structure, if sprite_usage is 1, only static sprites are processed in the image. sprite_hdim and sprite_vdim give the dimensions of the static sprite in pixels and vary with the number of macroblocks; for a DVR application with a stable environment they can be fixed values. When no_of_sprite_points is 0, there is no motion. ....... Similar optimizations are also made to the image, sprite_motion, vol, video_object, motion and other structures, and to the encoding and decoding functions, to suit the characteristics of DVR and improve encoding efficiency.

(2) The software supports temporal and spatial scalability. The setting parameter supports five types: 0 is temporal scalability type 0; 1 is spatial scalability; 2 is temporal-spatial scalability; 3 is temporal scalability type 1; 4 is temporal scalability type 2.
For spatial scalability:
Enhance layer PBBB . . .
Base layer IPPP . . .
For temporal scalability, the frame rate of the base layer becomes 5 fps and the frame rate of the enhancement layer is also 5 fps. Three types are supported:
Case 0: I-VOPs appear periodically in the base layer and P-VOPs appear in both the base layer and the enhancement layer, while B-VOPs do not appear.
Enhance layer PPPPP . . .
Base layer IPPPP . . .
Case 1: I-VOPs and P-VOPs are encoded and decoded in the base layer, and the enhancement layer contains only B-VOPs.
Enhance layer BBBBBB . . .
Base layer IPPP . . .
Case 2: The base layer contains I-VOPs, P-VOPs and B-VOPs, and the enhancement layer contains only B-VOPs.
Enhance layer PBBBBB . . .
Base layer IBPBPBP . . .
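The 5 fps plus 5 fps split described for temporal scalability can be pictured as alternating source frames between the two layers. The sketch below assumes a 10 fps source and a strict even/odd assignment, neither of which is stated in the original; the names are hypothetical.

```c
#include <stddef.h>

typedef enum { LAYER_BASE = 0, LAYER_ENHANCE = 1 } Layer;

/* Even-numbered source frames go to the base layer,
 * odd-numbered frames to the enhancement layer. */
Layer layer_for_frame(size_t frame_index)
{
    return (frame_index % 2 == 0) ? LAYER_BASE : LAYER_ENHANCE;
}

/* Frame rate seen by a decoder that receives the base layer only,
 * or the base layer plus the enhancement layer. */
int decoded_fps(int base_fps, int enhance_fps, int use_enhancement)
{
    return base_fps + (use_enhancement ? enhance_fps : 0);
}
```

With both layers at 5 fps, a decoder on a slow link plays 5 fps from the base layer alone, while one that also receives the enhancement layer plays 10 fps.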

4. Conclusion

MPEG-4 content-based compression is an advanced stage of information processing that is closer to the way people themselves process information. This paper analyzes the advantages of applying MPEG-4 encoding to embedded DVRs and discusses an implementation of such a DVR. Practice shows that this application is an effective optimization of DVRs and can improve DVR performance in many respects. MPEG-4 encoding will definitely be the development trend of the next generation of DVRs.

Keywords: MPEG-4
