The GPU showdown in a new era: domestic manufacturers accelerate their entry

Last updated: 2020-11-30

After NVIDIA acquired 3dfx in 2000 and AMD acquired ATI in 2006, the desktop GPU market was largely settled.

NVIDIA is the undisputed giant of the GPU market, while AMD's GPU business struggles to make headway. Intel, drawing on the pull of its CPUs, dominates the integrated graphics market, but in discrete graphics both the early Intel 740 and the later Larrabee ended in failure.

But after Alex Krizhevsky trained the deep convolutional neural network AlexNet on NVIDIA GPUs and used it to significantly improve performance in image classification and recognition, a new era of artificial intelligence officially began. From that point on, the GPU market also entered a new stage, and NVIDIA became the well-deserved winner of this era.

[Figure: NVIDIA's stock price trend from 2012 to the present]

Nvidia's two powerful weapons in the AI era

Looking back at the history of graphics processors, the Whirlwind computer built at MIT in 1951 may have been the world's first 3D graphics system, but it is not the basis of modern GPUs. The prototype of today's GPU is reported to trace back to the so-called video shifters and video address generators of the mid-1970s.

After a period of development in large systems and small workstations, graphics processors flourished with 3D gaming on PCs in the mid-to-late 1990s. Many companies rushed into the field during this period, and NVIDIA was one of them. According to NVIDIA's official website, when the company was founded in 1993 there were more than 20 graphics chip companies in the world, and by 1997 that number had soared to 70. By 2006, however, NVIDIA was the only independent one still in operation, making it the final winner. The competitors swept aside along the way included ATI, S3 Graphics and 3dfx.

Like other players, NVIDIA focused solely on the graphics card market when it was first established, and its first two products, NV1 and NV2, met with a mediocre market response. NVIDIA was not discouraged, however, and put a great deal of effort into developing NV3, which launched in 1997. Billed as the world's first 128-bit 3D processor, NV3 shipped more than one million units within four months of its launch. Because NV3 supported OpenGL well, NVIDIA gradually defeated 3dfx, which held an 85% market share at the time, and became the dominant force in the graphics card market.

It is worth mentioning that NVIDIA says it invented the GPU in 1999. The term GPU, short for Graphics Processing Unit, was first coined by NVIDIA, and by its account the GeForce 256 launched that year was the world's first GPU.

Had NVIDIA kept its focus solely on the graphics market, at best it would have become the next 3dfx. But Jensen Huang (Huang Renxun) had a bigger ambition: to push the GPU into general-purpose computing, which is the GPGPU that everyone is now familiar with.

According to a previous report by Semiconductor Industry Observer: "Around 2000, the academic community became interested in using GPUs for general-purpose computing (GPGPU). At that time, CPUs, which were designed mainly to execute general algorithms, were the main force in scientific computing. However, to perform well on general algorithms, CPUs devoted a large share of chip area to on-chip memory and to control logic such as branch prediction, leaving relatively few units for actual computation. In contrast, the control logic in the GPU architecture is relatively simple, and most of the chip area is used for rendering, polygon processing and other calculations. The academic community found that calculations common in scientific computing, such as matrix operations, can be easily mapped to the GPU's processing units, so very high computing performance can be achieved."

The report further pointed out that the main bottleneck of GPGPU at the time was that it was hard to use. Because GPUs were developed for graphics applications, their programming models did not readily support general high-performance computing. This demanded a great deal of manual debugging and coding, which raised the barrier to entry and meant that few people could use them proficiently.
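
To illustrate why matrix calculations map so readily onto many simple processing units, as the report describes, here is a minimal sketch in plain Python. It is not GPU code, and the function names `cell_kernel` and `matmul_by_cells` are hypothetical names chosen for this illustration. The point is only that each output element of a matrix product depends on one row and one column of the inputs and on nothing else, so every element could in principle be handed to a separate processing unit.

```python
from itertools import product


def cell_kernel(a, b, i, j):
    """Compute one output element C[i][j].

    The result depends only on row i of `a` and column j of `b`,
    so it never needs the value of any other output element.
    """
    return sum(a[i][k] * b[k][j] for k in range(len(b)))


def matmul_by_cells(a, b):
    """Multiply two matrices (lists of lists) one output cell at a time."""
    rows, cols = len(a), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    # Each (i, j) pair is an independent work item. This loop runs them
    # one after another; a GPU would instead spread them across its
    # many processing units (an assumption of this sketch, not shown here).
    for i, j in product(range(rows), range(cols)):
        c[i][j] = cell_kernel(a, b, i, j)
    return c


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(matmul_by_cells(a, b))  # [[19, 22], [43, 50]]
```

Because the work items are independent, the order in which the cells are computed does not affect the result, which is the property that lets hardware with simple control logic and many arithmetic units process them side by side.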

To make GPUs general-purpose in both software and hardware, NVIDIA launched the Tesla architecture in 2006. Rather than rendering with vector computing units, it split each vector computing unit into multiple scalar computing units. As a result, GPUs based on this architecture are not only powerful at rendering but also well suited to general computing.

That same year, NVIDIA launched CUDA, which it described as a revolutionary architecture for general-purpose GPU computing, one that would enable scientists and researchers to harness the parallel processing power of GPUs to solve their most complex computing challenges.

Thanks to the groundwork it laid in these two areas, hardware architecture and software, NVIDIA has thrived in the AI era.

According to industry experts, in the current cloud AI chip market, apart from Google with its own TPU, most manufacturers use NVIDIA GPUs for model training, which keeps NVIDIA's share of the cloud AI chip market high. This has also pushed NVIDIA's results to new highs over the past few years. According to forecast data from CCID Consulting, the domestic cloud AI chip market alone will grow by a cumulative 152% between 2019 and 2021. McKinsey also predicts that the training market will grow rapidly over the next few years, and that NVIDIA GPUs will still dominate it over the next ten years.

Seeing this demand and these forecasts, ASIC products such as the Graphcore IPU and Google TPU have emerged abroad with plans to challenge NVIDIA in the training market, while Intel and AMD hope to compete with NVIDIA in GPUs.

AMD and Intel are ready to move

In fact, AMD had comparable plans around the time NVIDIA entered the GPGPU market. But unlike NVIDIA, which has invested heavily in promoting the CUDA development environment over the years, AMD put all its eggs in the OpenCL basket. As a result, even though AMD released the ROCm platform in 2017 to provide deep learning support, this could not change the fact that its GPUs have achieved almost nothing in the AI era.

But AMD has not given up. To compete with NVIDIA, it launched its new CDNA architecture in March this year. According to reports, CDNA is AMD's compute-focused GPU architecture for data centers and similar uses. AMD's goal for CDNA is simple and direct: to build a large, powerful family of GPUs optimized for general computing and data center use.

According to reports, a large part of the new architecture's performance improvement will show up in machine learning, which means faster execution of smaller data types (such as INT4, INT8 and FP16), and AMD explicitly mentioned tensor operations when introducing the architecture. In addition, the new architecture can scale performance flexibly through the Infinity Fabric interconnect bus, supports enhanced enterprise-grade RAS features, security and virtualization technology, and offers higher energy efficiency, thereby reducing enterprise TCO.

Based on this architecture, AMD released the new-generation Instinct MI100 compute card in the middle of this month. According to the published figures, the card delivers up to 11.5 TFLOPS of peak FP64 throughput, making it the first GPU to break 10 TFLOPS in FP64. Compared with the previous-generation MI50, the new accelerator's performance has tripled. It also offers peak throughput of 23.1 TFLOPS for FP32 workloads. On these figures, AMD's new accelerator beats NVIDIA's A100 GPU in both categories.

The Instinct MI100 also supports AMD's new Matrix Core technology, which speeds up single-precision and mixed-precision matrix operations in formats such as FP32, FP16, bfloat16, INT8 and INT4, and can raise FP32 matrix performance to 46.1 TFLOPS.
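
The peak figures quoted for the MI100 are related by simple ratios, which a short calculation makes explicit. The sketch below uses only the three numbers given in this article; the dictionary labels and the helper name `ratio` are choices made for this illustration, not AMD terminology.

```python
# Peak MI100 throughput figures as quoted in this article, in TFLOPS.
mi100_peak_tflops = {
    "FP64": 11.5,
    "FP32": 23.1,
    "FP32 (Matrix Core)": 46.1,
}


def ratio(numerator, denominator):
    """Return the ratio of two quoted peak figures, rounded to 2 decimals."""
    return round(mi100_peak_tflops[numerator] / mi100_peak_tflops[denominator], 2)


if __name__ == "__main__":
    # FP32 peak relative to FP64 peak.
    print(ratio("FP32", "FP64"))
    # Matrix Core FP32 peak relative to ordinary FP32 peak.
    print(ratio("FP32 (Matrix Core)", "FP32"))
```

Both ratios come out at roughly two: the quoted FP32 peak is about twice the FP64 peak, and the Matrix Core figure is about twice the ordinary FP32 peak.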

To compete more effectively with NVIDIA, AMD also said that its open source ROCm 4.0 developer software now features an open source compiler and unified support for OpenMP 5.0, HIP, PyTorch and TensorFlow.

Besides AMD, Intel has also stepped up investment in its GPUs in recent years, hoping to claim a share of the AI market.

According to Intel, the company's Xe architecture GPUs will cover the entire range from integrated graphics to high-performance computing. The discrete GPU code-named Ponte Vecchio is a design aimed at HPC modeling and simulation and at AI training. Ponte Vecchio will be manufactured on Intel's 7nm process and will be Intel's first Xe-based GPU optimized for HPC and AI workloads. So far, however, this new product from Intel has yet to appear.

In addition, to make better use of its chips, including CPUs, GPUs, FPGAs and ASICs, in AI and other application markets, and to make programming easier for developers, Intel has launched the ambitious oneAPI. In the eyes of developers, it is a good plan but also a very challenging undertaking.

Chinese manufacturers are accelerating their entry

As the importance of the GPU becomes ever more prominent, more and more domestic manufacturers are investing in this market. In addition to Jingjiawei, Zhaoxin and Hangjin, which have been in the market for a long time, there are also some new entrants. Among them, BiRen, Muxi, Hexaflake and CoreTone are the best known.

First, let's look at BiRen Technology. According to its official website, the company was founded in 2019. Its team consists of core professionals and R&D personnel from the chip and cloud computing fields at home and abroad, with deep technical expertise and distinctive industry insight in GPUs, DSAs (dedicated accelerators) and computer architecture.

In terms of products, BiRen Technology is committed to developing original general-purpose computing systems, building efficient software and hardware platforms, and providing integrated solutions for intelligent computing. In terms of its development path, BiRen Technology will first focus on cloud-based general intelligent computing, aiming to gradually surpass existing solutions in fields such as AI training and inference, graphics rendering and high-performance general computing, and to achieve a breakthrough in domestic high-end general intelligent computing chips.

Muxi was founded by former AMD executives. According to reports, Muxi Integrated Circuit was established in September 2020. Its core team comes from a world-class GPU chip company, with an average of more than 15 years of experience in high-performance GPU chip design and extensive experience in 5nm tapeout and 7nm volume production. The company is committed to developing safe and reliable high-performance GPU chips with independent intellectual property rights, serving fields that demand high computing power, such as data centers, cloud gaming and artificial intelligence, and filling the gap in independent and controllable domestic high-performance GPU chips.

Hexaflake was founded in 2019. It is a high-tech startup dedicated to developing high-performance AI processor chips and full-stack software and hardware system solutions, and it describes itself as a leading general-purpose AI processor company able to compete with international giants in this field. Its main founders and core team bring together senior experts from China and the United States. Their expertise covers parallel computing and AI processor architecture, as well as the development of GPUs and other very-large-scale SoC chips and processor system software. They have long worked in the core R&D departments of leading international companies and have successfully developed a variety of chips and system products. The company was established to create a new generation of general-purpose AI processor chips together with their software and hardware ecosystem.

CoreTone Semiconductor was founded in 2018. In a media interview, the company said that its GPUs target three application areas: the eight major industries of the party and government (aviation, tanks, radar, etc.), the military, and cloud gaming. In addition, Xindong, which has licensed Imagination IP, Zhaoxin, which has inherited relevant GPU patents, and Loongson, which has long been making domestic CPUs, are also players in the GPU market.

Given the current state of domestic GPUs and the trade situation between China and the United States, the GPU manufacturers mentioned above include not only players eyeing the AI market but also entrepreneurs hoping to make breakthroughs in the graphics GPU market.

However, as industry experts told me, whether in the graphics market or the general computing market, what matters more for GPUs is the software and developer ecosystem. Only when that is done well can a GPU become commercially viable. When will a domestic manufacturer truly break through? That remains to be seen.

*Disclaimer: This article is the author's original work, and its content reflects the author's personal opinion. Semiconductor Industry Observer reprints it only to convey a different point of view, which does not mean that Semiconductor Industry Observer agrees with or supports that view. If you have any objections, please contact Semiconductor Industry Observer.

Today is the 2509th issue of content shared by "Semiconductor Industry Observer" for you, welcome to follow.
