Low-latency interactive live streams, short videos you can swipe through endlessly, 1080p movies and TV series... Now that ultra-high-definition video is easy to come by, our tolerance for low-resolution or stuttering video keeps shrinking. According to the "2022 China Internet Audiovisual Development Research Report", as of December 2021 China had 975 million online video users (including short-video users), an increase of 47.94 million from December 2020, accounting for 94.5% of all internet users. Behind this growth lie compounding storage, bandwidth, and computing costs. If you want the best possible audio-visual experience when watching an ultra-high-definition movie, the video requires 16 times the computing power, 12 times the storage, and 10 times the bandwidth. What if 100 people watch it at the same time? What we urgently need is a real-time media processing platform that is low-cost, offers a high compression rate, and provides some enhancement capability, together with the trump card behind it: the codec processing solution.
1
You can have your cake and eat it too
According to data analysis by Agora, users stay in a channel 10.3% longer with HD quality than with SD quality. HD picture quality makes viewers more willing to stay on the platform and improves user stickiness. But HD video is easier said than done, and the cost pressure behind it should not be underestimated. To cope with the continuous growth of video traffic, video standards organizations have kept iterating on video coding technology. Starting with MPEG-2, the compression rate of video coding standards has improved by approximately 50% every 10 years. Take H.266, which was launched in 2021, as an example: its compression rate is 50% higher than that of H.265, but its encoding compute cost is 15 times higher. Faced with a new generation of encoding that costs more than 10 times as much, traditional CPU capability can no longer keep up, and with Moore's Law slowing, a sudden leap in CPU performance is unlikely. Since the CPU alone can't do it, how about bringing in GPUs and AI?
According to the public financial reports of the companies concerned, video transcoding and bandwidth costs already account for 10% of a company's annual revenue.
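To make the H.266 trade-off quoted above concrete, here is a rough back-of-the-envelope sketch. It assumes that a "50% higher compression rate" means half the bitrate at equal quality, and it uses a hypothetical 8 Mbps H.264 stream as the baseline; only the 50% and 15x figures come from the text.

```python
# Back-of-the-envelope sketch. Everything except the 50% and 15x figures
# quoted in the article is an illustrative assumption.

BASELINE_BITRATE_MBPS = 8.0        # hypothetical H.264 stream, not from the article
BITRATE_RATIO_PER_GENERATION = 0.5  # assumes "50% better compression" = half the bitrate
H266_ENCODE_COST_VS_H265 = 15.0     # encoding compute multiplier quoted in the text


def bitrate_after(generations: int, baseline: float = BASELINE_BITRATE_MBPS) -> float:
    """Bitrate at equal quality after a number of codec generations."""
    return baseline * BITRATE_RATIO_PER_GENERATION ** generations


h264 = bitrate_after(0)
h265 = bitrate_after(1)
h266 = bitrate_after(2)

# Bandwidth saved by moving from H.265 to H.266 at the same quality.
bandwidth_saving = 1 - h266 / h265

print(f"H.264: {h264} Mbps, H.265: {h265} Mbps, H.266: {h266} Mbps")
print(
    f"H.266 saves {bandwidth_saving:.0%} of H.265 bandwidth "
    f"for {H266_ENCODE_COST_VS_H265:.0f}x the encoding compute"
)
```

Under these assumptions, each new generation halves the bandwidth bill while the encoding bill grows far faster, which is why the compute side becomes the bottleneck.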
AI is indeed a good helper. In the complete flowchart of video transcoding and streaming, the steps inside the red frame, namely content review, understanding, editing, and transcoding, can all be taken over by AI. But while AI improves the picture quality of video encoding and decoding, the computing power it requires should not be underestimated. The high cost of GPUs is daunting: enterprises dare not stockpile GPU cards in bulk, and GPU transcoding cannot achieve as high a compression rate as CPU transcoding. Neither a CPU-only nor a GPU-only architecture can fully meet these demands, and neither is an obvious winner over the other. So is there a way to combine the strengths of both without increasing costs? There is.
There are many hardware platforms for video encoding, including CPUs, GPUs, dedicated chips, and even FPGAs. But for video transcoding (especially transcoding hot data with high access volume), the CPU is still the first choice, because it has two irreplaceable advantages: 1. high flexibility; 2. high reusability. So, if AI is embedded in transcoding, can the entire transcoding solution run on the CPU? With the 4th Gen Intel Xeon Scalable processors released earlier this year, Intel introduced a major innovation: several built-in hardware accelerators that boost performance in different scenarios. Among them, AMX (Advanced Matrix Extensions) provides AI acceleration that fills the gap in CPU-based encoding and enables full-link intelligent encoding.
On the fourth-generation Intel Xeon, every physical core has a built-in AMX acceleration unit. So, who says you can't have your cake and eat it too?
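As a practical aside, one way to check whether a Linux machine exposes AMX is to look for the `amx_tile`, `amx_bf16`, and `amx_int8` feature flags in `/proc/cpuinfo`. The sketch below is a minimal illustration; the helper names are hypothetical, and it assumes a Linux kernel recent enough to report these flags.

```python
from pathlib import Path

# CPU feature flags that Linux reports for AMX-capable processors.
AMX_FLAGS = ("amx_tile", "amx_bf16", "amx_int8")


def amx_flags_in(cpuinfo_text: str) -> dict:
    """Return which AMX flags appear in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Lines look like "flags\t\t: fpu vme ... amx_tile".
            _, _, value = line.partition(":")
            flags.update(value.split())
    return {name: name in flags for name in AMX_FLAGS}


def detect_amx(path: str = "/proc/cpuinfo") -> dict:
    """Check the local machine; reports all False if the file is missing."""
    cpuinfo = Path(path)
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    return amx_flags_in(text)


print(detect_amx())
```

On a platform without `/proc/cpuinfo`, or on a CPU without AMX, every entry is simply reported as `False`.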
2
How did Tencent Cloud manage to remain number one for four consecutive years?
As the saying goes, practice makes perfect, and Tencent Cloud's practice is a good example. As 4K/8K video gradually enters ordinary households, consumers' viewing habits are shifting toward HD and UHD. For a leading HD video service provider such as Tencent Cloud, the choice of technology becomes very important.
In choosing its technology, it was precisely the CPU's irreplaceable advantages that led Tencent Cloud to abandon dedicated hardware solutions and switch to a pure-CPU encoder. So, how does the fourth-generation Xeon help Tencent Cloud with 4K/8K ultra-high-definition decoding?
Let's talk about cost reduction, super-resolution, computing power, and upgrades. As mentioned earlier, the CPU's high flexibility makes upgrades almost free. Through algorithm design, a pure-CPU encoder can achieve a higher compression rate than a hardware solution, and a software solution is easier to upgrade. For example, suppose a hardware chip supports 8K H.265 encoding. To support H.266 encoding later, the hardware must be redesigned, whereas the software only needs a code update, so the system can keep iterating to support the latest capabilities. A pure-CPU solution also uses general-purpose computing power: when no 8K transcoding is running, those resources can easily be released for other CPU workloads. And when 4K/8K encoding is running, full-link intelligent encoding lets developers focus on algorithm innovation without worrying about deployment details; it works out of the box.
Merging processes to reduce operation and maintenance costs: Because super-resolution is so compute-intensive, it has needed help from the GPU. This causes problems of its own: moving the heavy AI workload to the GPU completely separates pre-processing from encoding. It is like decoding in one room, sending the result to another room for pre-processing, and then bringing it back for encoding. This not only lengthens the pipeline but also places a heavy burden on operation and maintenance, and the repeated movement of data adds latency. CPU-based full-link intelligent encoding brings this step onto the CPU, reducing operation and maintenance costs.
Thanks to the flexibility of software, Tencent Cloud's 8K real-time transcoding system supports all mainstream video codec standards. Tencent Cloud was far ahead in MSU O264 and V265 in 2021, and in MSU H.264, H.265, and AV1 in 2022 and 2023.
Fine-grained control: AMX, INC, and precision
The high BF16 and INT8 computing power is certainly a great help in migrating AI from the GPU to the CPU, but how is accuracy ensured? Intel Neural Compressor (INC) has a built-in correction algorithm aimed specifically at accuracy. As a developer, you only need to provide three things: the model, the dataset, and the accuracy requirement.
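The "model, dataset, accuracy requirement" workflow can be illustrated with a toy accuracy-driven tuning loop. This is not the INC API; it is a minimal sketch of the general idea, with a hypothetical linear classifier standing in for the model and hypothetical helper names (`quantize_symmetric`, `accuracy`, `tune`). It tries the lowest precision first and accepts the first candidate whose accuracy drop stays within the stated tolerance.

```python
def quantize_symmetric(weights, bits):
    """Symmetric per-tensor quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1
    # Fall back to a scale of 1.0 if all weights are zero.
    scale = max(abs(w) for w in weights) / qmax or 1.0
    return [round(w / scale) * scale for w in weights]


def accuracy(weights, dataset):
    """Fraction of samples a linear classifier (score > 0) labels correctly."""
    correct = 0
    for features, label in dataset:
        score = sum(w * x for w, x in zip(weights, features))
        correct += int((score > 0) == label)
    return correct / len(dataset)


def tune(weights, dataset, max_accuracy_drop, candidate_bits=(4, 8, 16)):
    """Return (bits, weights) for the first precision within the tolerance."""
    baseline = accuracy(weights, dataset)
    for bits in candidate_bits:
        quantized = quantize_symmetric(weights, bits)
        if baseline - accuracy(quantized, dataset) <= max_accuracy_drop:
            return bits, quantized
    # No candidate met the requirement: keep the original precision.
    return None, list(weights)


# The three inputs: a (toy) model, a dataset, and an accuracy requirement.
WEIGHTS = [0.9, -0.05, 0.4]
DATASET = [
    ([1.0, 0.0, 0.0], True),
    ([-1.0, 0.0, 0.5], False),
    ([0.0, 1.0, 0.0], False),
    ([0.0, 1.0, 0.2], True),
    ([-0.2, 0.0, 1.0], True),
]
MAX_ACCURACY_DROP = 0.01

chosen_bits, tuned_weights = tune(WEIGHTS, DATASET, MAX_ACCURACY_DROP)
print("chosen precision (bits):", chosen_bits)
print("accuracy after tuning:", accuracy(tuned_weights, DATASET))
```

The design point is that the developer states the acceptable accuracy loss up front, and the tool searches for the cheapest precision that satisfies it, rather than the developer hand-picking a precision per layer.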
In addition, during pre-processing, intelligent encoding on the fourth-generation Xeon binds work to CPU cores to control the overall transcoding flow at a fine granularity. For example, decoding, watermarking, resolution conversion, encoding, and other operations are assigned to specified CPU cores so that interdependent operations run on the same CPU.
AI inference capability is greatly improved: video pre-processing such as picture quality enhancement needs strong computing power. In a real-world case from Intel and Tencent Cloud covering two scenarios, video enhancement and object detection, AI inference performance optimized with fourth-generation Xeon AMX improved by 1.86 and 1.95 times respectively compared with the previous-generation platform.
At the same time, the loss of precision is kept within an acceptable range, which enables users to implement full-link intelligent encoding on the CPU and greatly reduces deployment costs and operation and maintenance costs.
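The core-binding idea mentioned above, keeping interdependent stages such as decoding, watermarking, scaling, and encoding on the same cores, can be sketched with the operating system's CPU affinity interface. This is a minimal illustration, not Tencent Cloud's or Intel's implementation: the stream names, the grouping, and both helper functions are hypothetical, and `os.sched_setaffinity` is only available on Linux.

```python
import os

# Hypothetical grouping: the stages of one stream depend on each other,
# so they should share the same set of cores.
PIPELINE_GROUPS = {
    "stream_a": ["decode", "watermark", "scale", "encode"],
    "stream_b": ["decode", "watermark", "scale", "encode"],
}


def plan_core_sets(groups, available_cores, cores_per_group):
    """Give each group a disjoint core set; all of its stages share that set."""
    cores = sorted(available_cores)
    plan = {}
    for index, name in enumerate(groups):
        chunk = cores[index * cores_per_group:(index + 1) * cores_per_group]
        if len(chunk) < cores_per_group:
            raise ValueError(f"not enough cores for group {name!r}")
        plan[name] = set(chunk)
    return plan


def pin_current_process(core_set):
    """Pin the calling process to core_set; returns False where unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, core_set)  # pid 0 means the calling process
    return True


plan = plan_core_sets(PIPELINE_GROUPS, range(8), cores_per_group=2)
for name, core_set in plan.items():
    print(name, "->", sorted(core_set))
```

In a real deployment, each worker process that runs one stream's stages would call `pin_current_process(plan[name])` at startup, so data produced by one stage stays close to the cores that consume it.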
3
"Core" inspires intelligent change, and builds together
The human eye always craves the clearest, most realistic images and video, and the pursuit of clarity is endless. However fast artificial intelligence drives technological progress, digitalization and cloud computing remain essential for enterprises coping with continuous change. At the 2023 Tencent Global Digital Ecosystem Conference on September 7, Intel, as a deep partner, will host a special sub-forum with the theme '"Core" Inspires Intelligent Change, and Builds Together' (time: 14:30; location: 1F CC105C). At the Intel sub-forum, you can learn about the many new results of Intel and Tencent's broad and deep cooperation over the past 20 years in artificial intelligence, big data, scientific computing, audio and video, and other areas, as well as the latest progress in building a new generation of intelligent IT infrastructure that is energy-efficient, highly reliable, and easily scalable, to promote the deep integration of the digital economy and the real economy.
At the same time, Intel will share its latest product and technology roadmap, including Intel's large AI model solutions supported by advanced hardware and optimized software, such as the 4th Gen Intel Xeon Scalable processors and Habana Gaudi2, as well as Intel's cloud-edge integrated intelligent network solutions. Intel will also set up a dedicated exhibition area at the conference, showing a total of 15 advanced solutions across four areas: cloud and AI product solutions, cloud-to-end solutions, conference room solutions, and edge product solutions. Standing at a new milestone in the digitalization of industry, how do you view the boundless possibilities that artificial intelligence, cloud computing, and big data bring to the future?