Network camera technical structure, coding standards and mainstream solutions

Publisher: 沈阳阿荣 | Last updated: 2016-03-03 | Source: CPS

    Network cameras are a new generation of products that combine traditional cameras with network video technology. In addition to all the image capture functions of a conventional camera, a network camera has a built-in digital compression controller and a web-based operating system, so video data can be compressed, encrypted, and delivered to end users over a local area network, the Internet, or a wireless network.

    The application of network cameras has brought a qualitative leap in image monitoring technology. First, the network's structured cabling replaces traditional analog video wiring, carrying audio, video, data and power over a single run, and the cameras are plug-and-play, making systems easy to deploy and expand. Second, cross-regional remote monitoring becomes possible; over the Internet, image monitoring has no distance limit, and the picture remains clear, stable and reliable. Third, image storage and retrieval are safe and convenient, supporting off-site storage, multi-machine backup and fast non-linear search. The figure below shows a typical application topology of network cameras in video surveillance.

Figure: Typical network camera application topology

    Technical structure of network camera

    As the topology shows, network cameras connect directly to a TCP/IP digital network, so the main function of the system is to transmit video and audio over the Internet or an internal LAN. Internally, a network camera is generally composed of an image sensor, a video encoder, a network server, and external alarm and control interfaces.

    Compared with traditional analog cameras, the core technology of a network camera is the video encoder. Below we analyze each part of the network camera in turn, focusing on the video encoder.

Image Sensor

    Image sensor: A traditional analog camera captures video with an image sensor and outputs an analog video signal directly over a video cable. There are two main types of image sensor: CCD and CMOS. CCD still has an edge over CMOS in image quality, but CMOS has improved rapidly in recent years and offers better cost-effectiveness, so its share of the surveillance market has grown year by year and it is now the mainstream sensor in surveillance projects. CCD manufacturing, by contrast, is dominated by Japanese companies such as Sony and Sharp, which hold more than 90% of the global CCD market.

    Video encoder: Its function is to compress the sensor's video signal into a standard digital format. Some designs capture the BT.656 digital signal output by the sensor directly; others sample the analog signal from the sensor front end and convert it with a video ADC. There are many video coding standards; the main ones used in network cameras are M-JPEG, MPEG-4, H.264 and H.265.
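
    To make the BT.656 capture path more concrete, the sketch below scans a raw BT.656 byte stream for its SAV/EAV timing reference codes (the FF 00 00 XY preamble) and decodes the field, vertical-blanking and SAV/EAV flags from the status word. It is only a minimal illustration of the interface framing, not a capture driver; the sample bytes are synthetic.

```python
def find_timing_codes(data: bytes):
    """Scan a raw BT.656 byte stream for SAV/EAV timing reference codes.

    Each code is a 4-byte preamble: 0xFF 0x00 0x00 XY, where XY packs the
    F (field), V (vertical blanking) and H (0 = SAV, 1 = EAV) flags.
    """
    codes = []
    i = 0
    while i <= len(data) - 4:
        if data[i] == 0xFF and data[i + 1] == 0x00 and data[i + 2] == 0x00:
            xy = data[i + 3]
            codes.append({
                "offset": i,
                "field": (xy >> 6) & 1,        # F: field 1 or field 2
                "v_blank": (xy >> 5) & 1,      # V: inside vertical blanking
                "is_eav": bool((xy >> 4) & 1), # H: 0 = SAV, 1 = EAV
            })
            i += 4
        else:
            i += 1
    return codes

# Synthetic sample: one EAV code, a few active bytes, then an SAV code.
sample = bytes([0xFF, 0x00, 0x00, 0x9D]) + bytes(8) + bytes([0xFF, 0x00, 0x00, 0x80])
for code in find_timing_codes(sample):
    print(code)
```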

    Network server: Its function is to send the compressed video out over TCP/IP. Current products support the mainstream network protocols, such as PPPoE, DNS, UDP and TCP.
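
    On the client side, most of today's network cameras expose their compressed stream over RTSP, so a minimal sketch of pulling and decoding the stream looks like the following (the URL, credentials and path are placeholders; the real values come from the camera's documentation):

```python
import cv2

# Placeholder URL/credentials: real cameras publish their RTSP path in the manual.
RTSP_URL = "rtsp://admin:password@192.168.1.64:554/stream1"

cap = cv2.VideoCapture(RTSP_URL)     # opens the TCP/IP connection and starts decoding
if not cap.isOpened():
    raise RuntimeError("Could not connect to the camera stream")

while True:
    ok, frame = cap.read()           # one decoded BGR frame from the network stream
    if not ok:
        break                        # stream dropped or ended
    cv2.imshow("Network camera", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```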

    The external alarm and control interfaces are auxiliary functions of a network camera, usually implemented through a serial port (RS-232 or RS-485) or general-purpose I/O pins.
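
    As an illustration of the serial control path, the sketch below uses pyserial to send a pan/tilt command over RS-485. It assumes the camera or attached pan/tilt head speaks Pelco-D, a common but by no means universal protocol in surveillance equipment; the port name and device address are placeholders.

```python
import serial  # pyserial

def pelco_d_frame(address: int, cmd1: int, cmd2: int, data1: int, data2: int) -> bytes:
    """Build a 7-byte Pelco-D frame: sync, address, cmd1, cmd2, data1, data2, checksum."""
    checksum = (address + cmd1 + cmd2 + data1 + data2) % 256
    return bytes([0xFF, address, cmd1, cmd2, data1, data2, checksum])

# Placeholder serial port; RS-485 adapters usually appear as a COM/tty device.
port = serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=1)

# Pan right at medium speed on camera address 1 (cmd2 = 0x02, data1 = pan speed).
port.write(pelco_d_frame(0x01, 0x00, 0x02, 0x20, 0x00))

# Stop command: all command and data bytes zero.
port.write(pelco_d_frame(0x01, 0x00, 0x00, 0x00, 0x00))
port.close()
```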

    Image coding standards

    Currently, the main image compression standards used in network cameras are M-JPEG, MPEG-4, H.263, H.264 and H.265. Below is a brief introduction to each.

    M-JPEG

    M-JPEG (Motion JPEG) treats a moving video sequence as a series of independent still images. Each frame is compressed completely on its own, so any frame can be stored or accessed at random during editing, allowing frame-by-frame editing. However, M-JPEG removes only the spatial redundancy within a frame and does nothing about the temporal redundancy between frames, so its compression efficiency is low.
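
    The intra-frame-only nature of M-JPEG is easy to see in code: each captured frame is JPEG-compressed on its own, with no reference to its neighbours, which is exactly why any frame can be cut or retrieved independently and why the bit rate stays high. A minimal sketch with OpenCV (the capture source index is an assumption):

```python
import cv2

cap = cv2.VideoCapture(0)            # any local camera; index 0 is an assumption
jpeg_frames = []

for _ in range(100):                 # grab and compress 100 frames independently
    ok, frame = cap.read()
    if not ok:
        break
    # Each frame is encoded entirely on its own -- spatial redundancy only,
    # no inter-frame (temporal) prediction, exactly like M-JPEG.
    ok, jpeg = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 80])
    if ok:
        jpeg_frames.append(jpeg.tobytes())

cap.release()
print(f"Stored {len(jpeg_frames)} independently decodable JPEG frames")
```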

    MPEG-4

    The MPEG standards are a series of audio, video and multimedia compression standards formulated by ISO's Moving Picture Experts Group. MPEG-4 focuses on audio and video transmission over low bandwidth: on a bandwidth of 164 kHz it can deliver an average of 5 to 7 frames per second. Network products using MPEG-4 compression can therefore run over lower-bandwidth networks such as PSTN, ISDN and ADSL, greatly reducing network costs. In these products MPEG-4 supports resolutions up to 720×576, close to DVD picture quality, and its compression scheme maintains good clarity on moving objects. These advantages have made MPEG-4 an important direction for network product manufacturers.

    H.263

    H.263 is a video coding recommendation proposed by ITU-T, originally for use in H.324 terminals. After continuous improvement and several revisions, H.263 has matured and has now largely replaced H.261. It is increasingly popular because it can carry good-quality video over low bandwidth.

    H.263 is a hybrid coder based on motion-compensated DPCM: it performs motion compensation based on a motion search, then applies a DCT transform and zigzag-scan coding to produce the output bitstream. Building on the H.261 recommendation, H.263 refines the motion-vector search to half-pixel accuracy and adds four optional advanced modes: unrestricted motion vectors, syntax-based arithmetic coding, advanced prediction, and PB-frame coding, thereby further reducing the bit rate and improving coding quality.
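
    The DCT-plus-zigzag stage can be sketched in a few lines: an 8×8 block is transformed, coarsely quantized, and read out in zigzag order so that the mostly-zero high-frequency coefficients cluster at the end of the scan, where run-length coding is cheap. This is only an illustrative sketch of the technique; the uniform quantization step below is a placeholder, not the normative H.263 quantizer.

```python
import numpy as np
from scipy.fft import dctn

# Zigzag read-out order for an 8x8 coefficient block (low to high frequency).
ZIGZAG = sorted(((r, c) for r in range(8) for c in range(8)),
                key=lambda rc: (rc[0] + rc[1],
                                rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def transform_block(block: np.ndarray, qstep: int = 16) -> list[int]:
    """2-D DCT an 8x8 block, quantize it, and return coefficients in zigzag order."""
    coeffs = dctn(block.astype(float), norm="ortho")   # 8x8 DCT-II
    quantized = np.round(coeffs / qstep).astype(int)   # crude uniform quantizer
    return [quantized[r, c] for r, c in ZIGZAG]

# A smooth test block: most of the energy lands in the first few zigzag positions.
block = np.tile(np.arange(8) * 4, (8, 1)) + 100
print(transform_block(block)[:16])
```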

    H.264

    H.264 is a new digital video coding standard developed by the Joint Video Team (JVT) formed by ITU-T's VCEG (Video Coding Experts Group) and ISO/IEC's MPEG (Moving Picture Experts Group). It is published both as ITU-T Recommendation H.264 and as Part 10 of ISO/IEC MPEG-4 (AVC).

    At the same reconstructed image quality, H.264 saves about 50% of the bit rate compared with H.263, and improves on current MPEG-4 based implementations by about 33%.

    H.265

    H.265 is the video coding standard developed after H.264 by ITU-T VCEG together with ISO/IEC MPEG. It builds on H.264, retaining some of its tools and improving others, with the aim of optimizing the trade-off between bit rate, coding quality, latency and algorithmic complexity. Specific goals include higher compression efficiency, better robustness and error resilience, lower real-time latency, shorter channel acquisition and random-access delay, and reduced complexity. Thanks to algorithmic optimization, H.264 can transmit standard-definition digital video at less than 1 Mbps, while H.265 can deliver ordinary high-definition 720p (1280×720) audio and video at a transmission rate of 1-2 Mbps.
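
    These bit-rate figures translate directly into bandwidth and storage budgets. As a rough illustration, assuming constant bit rate and ignoring audio and container overhead, the sketch below computes how much storage one camera consumes per day at the rates quoted above:

```python
def gigabytes_per_day(bitrate_mbps: float) -> float:
    """Storage for 24 h of constant-bit-rate video, ignoring audio/container overhead."""
    seconds_per_day = 24 * 3600
    bits = bitrate_mbps * 1_000_000 * seconds_per_day
    return bits / 8 / 1_000_000_000   # bits -> bytes -> GB (decimal)

for label, rate_mbps in [("H.264 SD stream",          1.0),
                         ("H.265 720p stream (low)",  1.0),
                         ("H.265 720p stream (high)", 2.0)]:
    print(f"{label:>26}: {gigabytes_per_day(rate_mbps):.1f} GB/day")
```

    At 1 Mbps this works out to roughly 10.8 GB per camera per day, and about 21.6 GB at 2 Mbps, which is why the codec's compression efficiency matters so much for surveillance storage planning.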

    Mainstream core solutions

    As introduced above, the main difference between a traditional analog camera and a network camera is that the network camera converts the analog video signal into a digital signal in a standard format and transmits it over TCP/IP, while also providing auxiliary external alarm and control interfaces. Current core network camera solutions use a single chip to handle both video compression and the network server function. The main solutions are DSP-based and ASIC-based: on the DSP side the main vendors are TI, ADI, TriMedia and Ambarella, while on the ASIC side the solutions from Yingjia and HiSilicon have recently been fairly successful. These core solutions are analyzed below.

    TI (Texas Instruments)

    TI (Texas Instruments), headquartered in Dallas, Texas, USA, is a world-renowned semiconductor company focused on analog circuits and digital signal processing. It took the name Texas Instruments in 1951, when it entered the semiconductor market, and it still holds an important share in many market segments. Its representative DaVinci DM3x (ARM9-based) video processor solutions are widely used in the security industry; as video surveillance moved from the analog stage to digital compression, TI gradually took a leading position in security video compression. Its representative IPC (network camera) solutions are the DM355, DM365 and DM368, plus the newer DM369 and DM388.

    HiSilicon

    HiSilicon Semiconductor Co., Ltd. was established in October 2004; its predecessor was the Huawei Integrated Circuit Design Center, founded in 1991. HiSilicon is headquartered in Shenzhen, with design branches in Beijing, Shanghai, Silicon Valley (USA) and Sweden. Its products cover chips and solutions for wireless networks, fixed networks, digital media and other fields, and have been successfully deployed in more than 100 countries and regions. In digital media it has launched chips and solutions for network surveillance, videophones, DVB and IPTV, and its DVR chips sold very well from 2009 to 2012. Its representative IPC solutions are the Hi3516, Hi3516C, Hi3517, Hi3518A, Hi3518C and Hi3518E.

    Ambarella

    Ambarella was founded in 2004 and is headquartered in Santa Clara, California; its Chinese name is 安霸. Ambarella is a technology leader in the high-definition video industry, mainly providing low-power, high-definition video compression and image processing solutions. Its technology is also widely used in television broadcasting, where TV programs around the world are compressed and transmitted by Ambarella chips, and it holds nearly 90% of the market for H.264 high-definition professional broadcast encoding equipment. It was the first in the industry to launch a highly integrated SoC based on the newer H.265 video compression standard, integrating the key system functions and providing cost-effective high-definition solutions. Its representative IPC solutions are the A2, A5, A5S, A7 and A9.

    Analog Devices Inc. (ADI)

    ADI also holds a certain share of the DSP chip market and has launched DSPs with distinctive features. Its Blackfin series combines low power consumption with strong computing performance. The BF531 offers very good value, with a volume price of only about US$5, making it suitable for low-end network camera solutions; however, it can only handle MPEG-4 encoding at CIF resolution, which is insufficient where higher image clarity is required. ADI also offers the BF561, a dual-core DSP that can handle D1 resolution.

    Yingjia (Taiwan)

    Yingjia has recently done well in the video surveillance industry, and some manufacturers have begun to use its products in their network cameras. Yingjia's solution is an ASIC; the compression algorithm is standard MPEG-4, and a single chip can handle D1 resolution. Its main ASIC is the MPG440; it previously launched the MPG420 and MPG430, but the MPG440 is the more mature part and has been accepted by a number of customers. Yingjia also provides a relatively complete network camera reference solution and strong application-software support.

    NXP (NXP Semiconductors)

    NXP's chips are industrial-grade, a major advantage in harsh environments. They support ROI (region-of-interest) encoding and very low bit rates (down to 512 kbps), making it practical to watch smooth, high-quality video on a mobile phone. They offer High Profile H.264 encoding with a high compression ratio and high video quality, and the CPU is powerful enough to run intelligent analysis algorithms on the front end, greatly relieving back-end processing pressure. Strong 3D noise reduction eliminates flickering noise, giving the image a calm, painting-like quality, and advanced AWB (auto white balance) control provides good color reproduction. NXP chip solutions are therefore very competitive in the market.
