Key technologies and application scenarios of low-latency video based on 5G network

Publisher: Mingyue1314 · Last updated: 2023-08-07 · Source: elecfans

This talk is divided into three parts. The first introduces the key technologies involved in low-latency video: low-latency encoding and decoding, video transmission, a low-latency video-processing framework, and video capture and display. The second focuses on what low-latency video needs in a 5G environment to cope with weak networks, including weak-network state detection and congestion control. The last part draws on actual test results to present application examples in scenarios such as remote port-crane control and remote driving.


Advances in network technology have put the question of low latency on the agenda. In the past, network latency was high and chip processing took time, so overall latency was far from ideal. With improvements in chip processing power and in the network itself, low-latency video has become usable in certain specialized scenarios. So today I will first lay out the problems of "low latency", then introduce how to solve them and the related technologies, and finally cover the application scenarios of low-latency video.


-01-

Problems with low-latency video


Latency is a key technical indicator in any video system. In ordinary face-to-face real-time communication, a latency of 200 ms is generally satisfactory, because human reaction speed is limited. When people interact with machines, or machines interact with each other, the latency requirements become much stricter, since machines react far faster than people. For example, for a remotely controlled medium-speed engineering vehicle, 100 ms of latency usually meets the control requirements; a typical case is the robotic arm on a remote-controlled vehicle. Games also have demanding latency requirements, preferably below 100 ms and ideally around 50-60 ms. Although a game is controlled by a person, the scenes being controlled, such as driving and shooting, are intense, and a slightly excessive delay can lose the game, so latency is critical to the user experience. Remote driving typically happens at around 60 km/h; at roughly 60 ms of latency, the control error is about 1 meter. As another example, in a "smart road" scenario where a drone monitors a highway, the operator must be told whether there is an accident or obstacle ahead while controlling it, so the latency needs to be around 30 ms to meet the requirements.
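The remote-driving numbers above follow from simple arithmetic: the position error is just speed multiplied by latency. A minimal sketch, using the figures from the text (60 km/h, 60 ms) as the example inputs:

```python
# Rough position-error estimate for remote control: how far the vehicle
# travels during the end-to-end video latency.

def position_error_m(speed_kmh: float, latency_ms: float) -> float:
    """Distance in meters covered during `latency_ms` at `speed_kmh`."""
    speed_mps = speed_kmh * 1000.0 / 3600.0   # km/h -> m/s
    return speed_mps * latency_ms / 1000.0    # ms -> s

if __name__ == "__main__":
    # Remote driving at 60 km/h with 60 ms latency -> about 1 m of error.
    print(round(position_error_m(60, 60), 2))
```

The same function shows why the 30 ms "smart road" target is stricter: halving the latency halves the positional uncertainty at a given speed.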


So how do we define latency? It is the delay from when the video is generated or captured to when it is displayed (in a cloud-gaming scenario, from when the video is rendered): the video is encoded and sent, transmitted over the network, received, decoded, and displayed. This whole process is the end-to-end latency. Segments in the middle can also be measured separately, for example from sender to receiver, excluding the head and tail stages. Many links in the chain contribute delay, so each link's delay must be measured and its budget considered in order to define the complete latency. The source of the content also matters: if the content is generated in the cloud, that generation delay must be counted as well. The technical indicator therefore has to cover every stage. That is the definition.


A simple list lets us estimate how much delay each link contributes and which links are delay-prone. The first link is capture. At 30 frames per second, frames arrive at intervals of about 33 ms; this fixed capture interval is itself a delay, the sampling delay. Next comes media pre-processing, mainly the delay introduced by the ISP chip for operations such as noise reduction and distortion correction. The computation and data movement take time, depending on processing power, and can likewise be estimated.

Then come the send buffer, network transmission, and receive buffer, which take time because network bandwidth is limited. If bandwidth were unlimited, transmission time could be ignored; the problems it causes will be covered in detail later. This part is the main source of delay.

After that comes post-processing, which, like pre-processing, belongs to media processing. Its delay is relatively fixed and does not fluctuate much. The network portion in the middle, by contrast, fluctuates heavily with network conditions and environment, and also depends on whether there is packet loss. Finally, after decoding, the frame enters the display buffer and waits to be shown, which adds a display delay. Based on these estimates and current system capabilities, and excluding the sampling delay, the end-to-end latency comes to roughly 20 ms to several hundred ms.
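The per-link budget described above can be tallied in a few lines. The stage names follow the text; the (min, max) millisecond figures below are illustrative assumptions, not measured values, chosen only so the total lands in the "tens to several hundred ms" range the text quotes:

```python
# Back-of-the-envelope end-to-end latency budget for a low-latency video
# pipeline. All (min, max) millisecond figures are illustrative guesses.

BUDGET_MS = {
    "ISP pre-processing":   (2, 10),
    "encode":               (5, 20),
    "buffers + network":    (5, 200),   # dominant and highly variable
    "decode":               (3, 15),
    "post-processing":      (2, 10),
    "display wait (vsync)": (0, 16),    # up to one refresh at 60 Hz
}

def total_budget(budget: dict) -> tuple:
    """Sum the per-stage (min, max) delays into an end-to-end range."""
    lo = sum(v[0] for v in budget.values())
    hi = sum(v[1] for v in budget.values())
    return lo, hi

if __name__ == "__main__":
    lo, hi = total_budget(BUDGET_MS)
    print(f"end-to-end (excl. sampling): {lo}-{hi} ms")
```

The sampling delay (up to ~33 ms at 30 fps) is deliberately excluded, matching the text's convention; adding it would shift both bounds up by one frame interval.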


Next, let us analyze the sources of delay one by one. The first is computational overhead. Processing involves complex numerical computation, such as white balance, denoising, encoding, and decoding. These are all compute-intensive and take anywhere from a few milliseconds to tens of milliseconds. This part is the computational overhead.

The second source is transmission delay. The network itself has latency: light propagation takes time, router processing takes time, and whatever transport method is used also takes time. Video data volumes are large; big video frames range from hundreds of kilobits to several megabits, and they must be transmitted over limited bandwidth. Assuming a link of around 50 or 100 Mbps, a large frame takes several milliseconds just to transmit. The waveform of the transmitted video data below shows that the bitstream is unstable: frame sizes differ greatly, so the time each frame needs to transmit is uncertain and its arrival time cannot be predicted. Before compression the video data runs at roughly 1 Gbps, and data of that scale also takes time to process and display on the terminal device. Transmission delay is therefore an aspect that must be considered.

There is also packet-loss resistance. When network quality is poor, coping with jitter or packet loss has a price, and to protect the user experience that price is usually paid in latency, for example through retransmission. This portion of the delay is variable and changes with network conditions.

Finally, there is delay between scheduled tasks. To string the stages above together, the different modules and tasks must be arranged like an assembly line. If the pipeline is arranged badly, "waiting" occurs: the data has not arrived yet, so a stage cannot process and must wait for it. This waiting is also a delay.
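The core of the transmission-delay argument is serialization delay: the time to push one frame's bits onto a link of a given rate. A minimal sketch of that calculation, with frame sizes and link rates taken as assumptions in the spirit of the text's examples:

```python
# Serialization delay: time to transmit one video frame over a link of
# a given bandwidth. Frame sizes and link rates below are assumptions.

def serialization_delay_ms(frame_bits: float, link_bps: float) -> float:
    """Milliseconds needed to put `frame_bits` onto a `link_bps` link."""
    return frame_bits / link_bps * 1000.0

if __name__ == "__main__":
    # A 500 kbit frame on a 100 Mbps link: 5 ms of pure transmission time.
    print(serialization_delay_ms(500e3, 100e6))
    # A 2 Mbit frame (e.g. a large I-frame) on the same link: 20 ms.
    print(serialization_delay_ms(2e6, 100e6))
```

Note that this is only the serialization component; propagation, queuing, and retransmission delays add on top of it.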


Now consider the relationship among latency, bandwidth, and computing power. They form a triangle of trade-offs. If you limit or save computing power, latency may rise; if you spend more computing power, the latency or bandwidth required falls. If bandwidth is ample, latency definitely drops, because less time is spent on the transmission path, and high bandwidth also makes packet loss easier to resist, so less latency needs to be spent fighting a weak network. Latency can ultimately be expressed as a function of bandwidth, computing power, and video quality: higher video quality demands more latency and more computing power. High computing power means high cost, and so does high bandwidth, so a balance must be found: under a given quality and latency requirement, what cost constraint is reasonable? It is not always right to pick the strongest chip or the largest bandwidth; the goal is balance and optimization among them. Of course, video quality and latency are also in tension: a higher quality requirement means a higher bitrate.


Now look at the video bitrate graph. Different lines represent different conditions: some are games, some are speeches, some are chats. The curves vary with conditions and the bitrate is not very regular; there is a long-term average, but it fluctuates over short intervals. This is because video complexity varies, and content complexity is tied to bitrate. The more complex the content, or when the camera shakes violently or the shooting distance changes, the bitrate naturally rises beyond what a fixed budget (say 4 Mbps) can hold down, because the encoder must keep quality within an acceptable range; when the amount of information is large, the bitrate inevitably rises. If the bitrate were forced down instead, the encoder's output quality would be very poor, full of unclear, blurred images. Content complexity also includes what is inside each frame: very rich intra-frame content likewise drives up the output bitrate. The video bitstream therefore fluctuates violently.
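The consequence of this fluctuation for latency can be made concrete: since serialization time scales with frame size, a fluctuating bitstream means fluctuating per-frame delays even when capture runs at a steady 33 ms cadence. A small sketch with invented frame sizes (large I-frames interleaved with smaller P-frames) on an assumed 20 Mbps link:

```python
# Why an unstable bitstream yields unstable arrival times: per-frame
# transmission delay scales with frame size. Frame sizes and link rate
# below are invented for illustration.

LINK_BPS = 20e6  # assumed 20 Mbps link

# A large I-frame, some small P-frames, then another I-frame spike.
frames_bits = [800_000, 120_000, 100_000, 150_000, 900_000, 110_000]

delays_ms = [bits / LINK_BPS * 1000 for bits in frames_bits]

if __name__ == "__main__":
    # I-frames need ~40-45 ms on the wire, P-frames only ~5-8 ms, so
    # arrival times jitter even though frames are captured every 33 ms.
    print([round(d, 1) for d in delays_ms])
```

This is why rate control and congestion control (covered in the weak-network part of this talk) matter so much for low-latency video: smoothing the bitstream smooths the arrival times.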
