Implementing 12G-SDI video capture, H.265 compression, and SGMII 10G Ethernet streaming on the Mir Electronics MPSoC
1. Introduction
With the growth of online video platforms, users' demand for 4K high-definition content keeps rising. Yet many users find that even with a paid platform membership, 4K content does not look as good as expected, and is sometimes blurry or prone to stuttering. Many factors contribute to this, including video encoding, network bandwidth, and video transmission.
A recent video by the channel "Film and TV Hurricane", "The clarity is not as good as it was 4 years ago! Is the blurry video just your illusion?", discussed how video platforms reduce bitrates and switch encoding formats to compress video, degrading the content.
One of the main reasons for the loss of clarity in 4K videos is that platforms compress video streams to save bandwidth, sometimes resulting in lower bitrates that fail to realize the full potential of 4K resolution.
Against this background, efficiently compressing and transmitting 4K video has become a key technical challenge. This article explores how to use Mir Electronics' ZU4EV MPSoC platform to capture a true 4K@60 UHD-SDI video source, perform efficient H.265 encoding and decoding with the VCU, and stream the result over SGMII 10 Gigabit Ethernet to ensure smooth transmission of high-quality 4K video.
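To see why hardware encoding is unavoidable here, it helps to run the numbers. The sketch below uses illustrative assumptions (3840x2160 @ 60 fps, 4:2:2 10-bit sampling as supported by the VCU, and an assumed 25 Mbit/s H.265 target bitrate) to compare the raw and compressed data rates:

```python
# Back-of-the-envelope numbers motivating hardware H.265 encoding.
# Assumptions (illustrative): 3840x2160 @ 60 fps, 4:2:2 10-bit sampling.
width, height, fps = 3840, 2160, 60
bits_per_pixel = 20                 # 4:2:2 chroma subsampling, 10 bits/sample
raw_gbps = width * height * fps * bits_per_pixel / 1e9
print(f"Uncompressed: {raw_gbps:.2f} Gbit/s")   # ~9.95 Gbit/s -> 12G-SDI territory

target_mbps = 25                    # an assumed H.265 target bitrate for 4K@60
ratio = raw_gbps * 1000 / target_mbps
print(f"Compression ratio needed: about {ratio:.0f}:1")
```

The uncompressed stream is close to 10 Gbit/s, which is exactly why the 12G-SDI link and a roughly 400:1 real-time compressor are both required.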
2. Causes of video quality degradation and optimization methods
1) Bandwidth bottleneck: As the number of users grows, server and network bandwidth often cannot keep up with the demands of 4K video streaming.
2) Inadequate compression algorithms: Traditional video compression performs poorly on high-resolution content and easily produces blurry images.
3) Optimization of video streaming
In the streaming process, network bandwidth and video compression efficiency directly determine the clarity and smoothness of video playback. In order to ensure efficient transmission of 4K video on 10 Gigabit Ethernet, this design adopts the following optimization measures:
- Reasonable bitrate control: While preserving clarity, tune the H.265 target bitrate so it is neither so low that quality suffers nor so high that bandwidth is wasted. CBR or VBR mode lets the encoder adjust the bitrate dynamically to network conditions.
- Low-latency mode: The VCU supports a low-latency encoding mode that keeps compression and transmission delay as small as possible, improving the viewing experience.
- Transport protocol selection: Choose the protocol to match the application scenario. UDP suits scenarios with strict real-time requirements, while TCP is recommended where data reliability matters more.
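For the UDP path, the encoded H.265 byte stream has to be split into datagrams that fit the Ethernet MTU. The sketch below is a minimal illustration, not the article's implementation: the host address and port are hypothetical, and a real deployment would typically use RTP (RFC 7798) on top of UDP to get sequence numbers and timestamps.

```python
import socket

MTU_PAYLOAD = 1400  # keep each UDP datagram safely below a typical 1500-byte MTU

def packetize(stream: bytes, size: int = MTU_PAYLOAD):
    """Split an encoded H.265 byte stream into UDP-sized chunks."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]

def send_udp(stream: bytes, host: str = "192.168.1.100", port: int = 5004):
    # Hypothetical receiver address; real systems usually add RTP framing.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for chunk in packetize(stream):
        sock.sendto(chunk, (host, port))
    sock.close()

# Example: a 3000-byte access unit splits into 1400 + 1400 + 200 bytes.
chunks = packetize(b"\x00" * 3000)
print(len(chunks), len(chunks[-1]))
```

TCP would instead be a single `socket.SOCK_STREAM` connection with no manual packetization, trading latency for delivery guarantees.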
3. Advantages of MPSoC and VCU architecture in 4K UHD audio and video broadcasting
1. High performance with low power consumption: The Zynq UltraScale+ MPSoC uses a 16nm FinFET process and integrates multi-core processors with programmable logic, improving performance while reducing power consumption. This matters in audio and video broadcasting, where high-definition transmission must not come at the cost of high energy use.
2. Real-time compression and decompression: The integrated VCU supports the H.264/AVC and H.265/HEVC standards and can compress and decompress video up to 4K UHD resolution in real time, reducing storage and bandwidth requirements while maintaining video quality.
3. Multi-stream processing: The VCU can process up to eight video streams simultaneously, which is valuable for 4K UHD broadcast applications that carry several sources at once and makes the MPSoC a good fit for multimedia centers and video servers.
4. Flexibility and scalability: The programmable logic (PL) offers any-to-any high-speed video/audio interfacing and allows customized image and video processing functions to be added to the multimedia pipeline. This programmability lets the system adapt to evolving broadcast requirements.
5. Dedicated hardware acceleration: The MPSoC provides dedicated processing engines such as the Arm Cortex-A53-based APU and the Mali graphics processing unit, which accelerate graphics and video tasks and raise overall system performance.
6. Broad format support: The VCU supports up to 4:2:2 10-bit UHD-4K video, suitable for professional and high-end consumer production and post-production, so the MPSoC can serve a wide range of audio and video broadcast scenarios.
7. Integrated multimedia framework support: The MPSoC works with the common GStreamer multimedia framework for developing hardware-accelerated applications, simplifying development of complex audio/video processing tasks.
8. Optimized power management: The Zynq UltraScale+ MPSoC places the processing engines, hardware codecs, and other blocks in separate power domains with independent power rails, enabling system-level power optimization and further reducing consumption.
9. High-speed interconnect peripherals: The MPSoC provides high-speed interconnect peripherals, such as an integrated DisplayPort interface module supporting rates up to 6 Gb/s, which helps process real-time audio/video streams from the PS or PL and reduces system BOM cost.
10. Support for next-generation terrestrial digital TV: As the ultra-high-definition TV era arrives, the MPSoC and VCU architecture can support next-generation terrestrial broadcast standards such as DVB-T2, ATSC 3.0, and DTMB-A, which enable higher video quality and new broadcast application modes.
In summary, the MPSoC and VCU architecture combines high performance, low power consumption, real-time compression and decompression, multi-stream processing, flexibility, hardware acceleration, broad format support, multimedia framework integration, optimized power management, and high-speed interconnect peripherals, making it an ideal platform for 4K UHD audio and video broadcasting.
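The GStreamer integration mentioned above is usually where these pieces meet in software. The following sketch only builds a pipeline description string; the element names follow Xilinx's OpenMAX plugins (`omxh265enc`) and V4L2 capture, but exact element names and properties vary between PetaLinux/VCU releases, and the device path, address, and bitrate are assumptions:

```python
# Sketch of a VCU encode-and-stream GStreamer pipeline description.
# Assumptions: Xilinx OpenMAX plugin names (omxh265enc), a V4L2 capture
# node at /dev/video0, and a hypothetical receiver at 192.168.1.100:5004.
def vcu_stream_pipeline(device="/dev/video0", host="192.168.1.100",
                        port=5004, bitrate_kbps=25000):
    return " ! ".join([
        f"v4l2src device={device}",     # capture from the SDI input pipeline
        "video/x-raw,format=NV12,width=3840,height=2160,framerate=60/1",
        f"omxh265enc target-bitrate={bitrate_kbps} control-rate=constant",
        "h265parse",
        "rtph265pay",                   # RTP packetization for UDP transport
        f"udpsink host={host} port={port}",
    ])

print(vcu_stream_pipeline())
```

On the target, a string like this would be passed to `gst-launch-1.0` or parsed with `gst_parse_launch()`; here it simply illustrates how the capture, encode, and network stages line up.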
4. System Architecture Overview
In this design, we use the Zynq UltraScale+ MPSoC platform (specifically the MYIR XCZU4EV), implement H.265 compression of SDI video in the FPGA, and push the stream to 10 Gigabit Ethernet through the SGMII interface. The system architecture comprises the following parts:
1. Video input : The input source can be an SDI camera, an SDI signal generator, or an HDMI signal connected from a computer via an HDMI to SDI device. The video signal is equalized by TI's LMH1219 chip, and the single-ended signal is converted into a differential signal before being input into the FPGA.
2. SDI video decoding : The UHD-SDI GT IP core in the FPGA is used to deserialize the SDI video and convert the video signal into AXI4-Stream format for subsequent processing. The SDI video is decoded into RGB format through the SMPTE UHD-SDI RX SUBSYSTEM IP core.
3. Video frame buffering and processing : The decoded video signal is stored in the DDR4 on the PS side, which is implemented by the Video Frame Buffer Write IP core provided by Xilinx. At this stage, the video frame can be processed by color conversion, scaling, etc.
4. H.265 video compression: The Zynq UltraScale+ VCU IP core performs H.265 encoding of the buffered video frames. Since the VCU consumes YUV420-format video, the RGB frames are color-converted before encoding; encoding resolutions up to 4K@60fps are supported.
5. SGMII 10 Gigabit Ethernet transmission : The video stream compressed by H.265 is pushed to the 10 Gigabit Ethernet through the SGMII interface. Through the PetaLinux system, the compressed stream is transmitted to the PC or server using the TCP/UDP protocol. Users can play the received H.265 stream in real time through software such as VLC player.
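The frame-buffering stage in step 3 has a concrete memory cost that is worth sizing. The numbers below are illustrative assumptions (NV12/YUV 4:2:0 frames as consumed by the VCU, triple buffering), not measurements from the actual design:

```python
# Rough DDR4 footprint of the video frame buffer stage.
# Assumptions: 3840x2160 NV12 (YUV 4:2:0, 12 bits/pixel), triple buffering.
width, height = 3840, 2160
bytes_per_frame = width * height * 3 // 2   # NV12: 1 byte luma + 0.5 byte chroma
n_buffers = 3                               # assumed write/process/read rotation
total_mib = n_buffers * bytes_per_frame / 2**20
print(f"{bytes_per_frame} bytes per frame, "
      f"{total_mib:.1f} MiB for {n_buffers} buffers")
```

At roughly 12 MiB per 4K frame, even triple buffering consumes only a few tens of MiB, so the PS-side DDR4 easily absorbs it alongside the VCU's own working memory.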
5. Main process of engineering design
1. SDI input: The LMH1219 performs signal equalization, and the SDI signal is converted to AXI4-Stream format. A 4K@60fps source can be fed to the FPGA over 12G UHD-SDI from an HDMI-to-SDI converter box, or an SDI industrial camera can be used instead.
2. Video decoding: The UHD-SDI GT IP core deserializes the video, and the SMPTE UHD-SDI RX SUBSYSTEM IP core decodes it into RGB signals.
3. Video buffering: The Video Frame Buffer Write IP core writes the video to DDR4. A custom ISP stage (image scaling, stitching, etc.) can be inserted here.
4. Video compression: The Zynq UltraScale+ VCU IP core performs H.265 compression of the video.
5. Network transmission: The compressed H.265 stream is pushed to the PC over UDP through the SGMII 10 Gigabit Ethernet interface and played with the VLC player.
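For step 5, VLC cannot guess the codec and port of a bare RTP/UDP stream, so a common approach is to hand it a small SDP session description. This sketch assumes the sender uses RTP packetization for H.265 (RFC 7798); the address, port, and dynamic payload type 96 are placeholder assumptions:

```python
# Minimal SDP file so VLC can play an RTP/H.265 stream (hedged sketch).
# Assumptions: receiver at 192.168.1.100:5004, dynamic RTP payload type 96.
def make_sdp(host="192.168.1.100", port=5004):
    return "\n".join([
        "v=0",
        f"o=- 0 0 IN IP4 {host}",
        "s=VCU 4K stream",
        f"c=IN IP4 {host}",
        "t=0 0",
        f"m=video {port} RTP/AVP 96",       # dynamic payload type (assumed)
        "a=rtpmap:96 H265/90000",           # HEVC over RTP, 90 kHz clock
    ])

print(make_sdp())
```

Saving this output as `stream.sdp` and opening it in VLC tells the player which port to listen on and how to depacketize the H.265 payload.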
6. Conclusion
As video content continues to evolve towards 4K, the SGMII 10G Ethernet video compression streaming solution based on the VCU implemented on the Zynq UltraScale+ MPSoC platform can not only efficiently compress and transmit 4K video, but also ensure low latency and high-quality image output. This solution is suitable for application scenarios that require high-resolution video, such as video surveillance, medical imaging, and industrial automation.
For users who want a better viewing experience on online video platforms, video platforms and service providers need to optimize video encoding, network transmission, etc. to meet users' demand for 4K video quality.
7. Interactive session
When streaming to the PC over SGMII 10 Gigabit Ethernet, the PC's CPU cannot sustain such high-speed throughput on its own, so network offloading is needed. Mir Electronics' MYC-J7A100T dual-core core board can receive the SGMII 10 Gigabit Ethernet data through SFP, and the PC reads the video source over PCIe, offloading the 10 Gigabit network packet processing. In follow-up articles in this series, we will share SFP acquisition and PCIe XDMA interrupt-driven reading based on the Mir MYC-J7A100T.