High-performance AI chips: how is "high performance" guaranteed?
In formal academic terms, artificial intelligence is a science and technology that studies and develops theories, methods, technologies, and application systems for simulating and extending human intelligence; it is also a complex discipline built on computer science, biology, psychology, neuroscience, mathematics, and philosophy. In layman's terms, it means giving computers the ability to be as intelligent as humans.
In the past two years, artificial intelligence has become the hottest topic. If you can't talk about AI when you go out, you must be behind the times. This year, fresh graduates entering the AI industry are commanding astonishing annual salaries of 500,000 yuan, and even so, companies are still worried about not being able to recruit suitable people...
How about it? Envious... jealous... huh?
The editor also wanted a high salary, so he did some careful research (well, actually, he asked Baidu and Google) and found that the journey of artificial intelligence has not been smooth sailing. If you really want that high salary, you have to level up properly!
1950s: Pioneers had already proposed the concept of artificial intelligence.
1956: The world's first artificial intelligence laboratory, the MIT AI Lab, was created.
In the following decade, artificial intelligence was perhaps even more popular than it is today; scholars believed that within twenty years, machines would be able to accomplish everything that humans can do!
As it turned out, artificial intelligence did not replace human intelligence. Computing performance at the time was insufficient and data was seriously lacking, so machines could not achieve high intelligence.
1990s: IBM's computer system "Deep Blue" defeated world chess champion Garry Kasparov, another milestone in the development of artificial intelligence.
2016: Google's AlphaGo defeated the Korean Go player Lee Sedol, once again setting off a craze for artificial intelligence.
After years of development, the chips underlying artificial intelligence have gradually grown into a large family: GPUs, FPGAs, ASICs (especially after Google introduced the TPU, countless xPUs emerged), brain-inspired chips, and so on. Built on this hardware and on the massive data brought by the Internet and mobile Internet, artificial intelligence has revived once again.
At this stage, chip research, development, and manufacturing are in full swing, a steady stream of companies is investing in the AI wave, and new applications keep emerging. Everyone has truly seen the possibility of AI being put into practice.
To stand at the forefront of the trend, and to stay there, if you really want an enviable high salary, you also need to be down-to-earth and face the hardships and challenges along the way.
Tap the blackboard, here come the key points!
Today we have invited expert Li Kai to give a systematic explanation of high-performance AI chips.
Part.1
Features of high-performance AI chips
Cloud computing has become the infrastructure of the modern Internet era. The big data and artificial intelligence (AI) applications it has spawned have become hot spots for capital and have risen to the level of national strategy.
AI chips are the core technology for putting AI into practice:
◎ In terms of application scenarios, AI chips fall mainly into two categories: cloud chips and device (edge) chips;
◎ In terms of function, they fall mainly into two categories: training and inference.
Among them, cloud training chips must digest massive amounts of data to train deep neural network models and face the most stringent performance requirements. Because of those requirements and the difficulty of implementation, cloud training chips currently fall mainly into two camps: the GPU camp represented by NVIDIA and the ASIC camp represented by Google. ASIC development is on a gradual upswing, and it is also the field most likely to see breakthroughs in chip technology.
The figure below shows an example of a heterogeneous server for AI computing, built with IBM's POWER-series CPUs and NVIDIA GPUs. As the figure shows, for cloud training AI chips, besides architectural changes to meet the requirements of AI algorithms, the throughput requirements on the memory bus and peripheral interfaces are very high, because massive amounts of data must be accessed and exchanged at any time.
* Image source: IBM
Part.2
Testing requirements for high-performance AI chips
For high-performance AI chips, typical testing requirements are mainly divided into the following categories:
High-performance interconnect testing
Used for high-speed, low-latency interconnection between multiple AI chips, so that computing power can be scaled up in clusters; the main technologies are NVLink, CCIX, and 100G/400G Ethernet. Meanwhile, dedicated AI chips such as GPUs and ASICs generally cannot form an AI computing platform on their own; in many cases they also need a CPU for task scheduling. The interconnect with the CPU is mainly PCIe 3.0/4.0, with PCIe 5.0 to come in the future. The table below shows the development trend of mainstream heterogeneous computing buses.
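As a rough illustration of why the bus generation matters for these interconnect tests, here is a minimal sketch of the theoretical per-direction bandwidth of a PCIe x16 link. The transfer rates and 128b/130b encoding are from the published PCIe specifications; real-world throughput is lower due to protocol overhead.

```python
# Theoretical per-direction bandwidth of a PCIe link.
PCIE_GENS = {
    # generation: (transfer rate in GT/s per lane, encoding efficiency)
    "PCIe 3.0": (8.0, 128 / 130),   # 128b/130b encoding
    "PCIe 4.0": (16.0, 128 / 130),
    "PCIe 5.0": (32.0, 128 / 130),
}

def link_bandwidth_gbs(gen: str, lanes: int = 16) -> float:
    """Theoretical usable bandwidth in GB/s, one direction."""
    rate_gt, eff = PCIE_GENS[gen]
    return rate_gt * eff * lanes / 8  # 8 bits per byte

for gen in PCIE_GENS:
    print(f"{gen} x16: {link_bandwidth_gbs(gen):.1f} GB/s")
```

Each generation doubles the per-lane transfer rate, which is why a move to PCIe 5.0 roughly quadruples the CPU-to-accelerator bandwidth available today over PCIe 3.0.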
High performance storage test
Used for high-speed access to massive amounts of data. Currently DDR4/GDDR5 is the mainstream memory; in the future, DDR5, GDDR6, HBM, and other memory types will be used. The figure below shows the development of high-performance memory technology.
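To see why these newer memory types matter, peak bandwidth can be estimated as data rate times bus width. The configurations below are common, illustrative examples (not figures from any particular AI chip):

```python
# Peak memory bandwidth = data rate (MT/s) x bus width (bits) / 8 bits-per-byte.
def peak_bandwidth_gbs(data_rate_mts: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s."""
    return data_rate_mts * bus_width_bits / 8 / 1000

examples = {
    "DDR4-3200, 64-bit channel":   (3200, 64),    # typical server DIMM channel
    "GDDR5 8 Gbps, 256-bit bus":   (8000, 256),   # typical graphics card
    "HBM2 2 Gbps, 1024-bit stack": (2000, 1024),  # a single HBM2 stack
}

for name, (rate, width) in examples.items():
    print(f"{name}: {peak_bandwidth_gbs(rate, width):.1f} GB/s")
```

The arithmetic shows the two routes to higher bandwidth: GDDR pushes the per-pin data rate, while HBM widens the bus dramatically at a modest per-pin rate, which is also what makes HBM attractive at lower power.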
Large-scale semiconductor advanced process production test
The development of AI, especially machine learning, requires powerful computing systems with capabilities exceeding tens of billions of operations per second (FLOPS). High performance comes with high power consumption: typical high-performance AI chips or GPUs today draw about 300 W, so liquid cooling solutions and more advanced process technologies will gradually be adopted. These systems are currently built on CMOS chips, and the CMOS process keeps improving system performance mainly through the shrinking of feature sizes.
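The power point can be made concrete with a back-of-the-envelope efficiency figure. The 300 W draw comes from the text above; the 15 TFLOPS peak throughput is purely an assumed, illustrative number, not a measured value for any specific chip:

```python
# Energy efficiency of an accelerator, expressed as GFLOPS per watt.
def gflops_per_watt(peak_tflops: float, power_w: float) -> float:
    """Convert peak TFLOPS and power draw into GFLOPS/W."""
    return peak_tflops * 1000 / power_w  # 1 TFLOPS = 1000 GFLOPS

# Assumed 15 TFLOPS peak at the 300 W draw cited in the text.
print(gflops_per_watt(15.0, 300.0))  # 50.0 GFLOPS/W
```

Process shrinks improve this ratio directly: delivering the same throughput at lower switching energy, or more throughput in the same 300 W envelope.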
For high-speed SerDes, the industry has already demonstrated single-lane 100 Gbps chips in 28 nm. To integrate such lanes into future AI chips, however, 16 nm or even 7 nm processes may be used to reduce power consumption. As of 2019, 10 nm chips were in volume production, 7 nm had entered mass production, and the technical definition of the 5 nm node had been completed. The figure below shows the development of high-speed SerDes and semiconductor process technology.
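For context on the single-lane 100 Gbps figure: such lanes are typically built on PAM4 signaling, which carries 2 bits per symbol instead of NRZ's 1. A quick sanity check of the arithmetic, using the 53.125 GBd symbol rate found in common 100G-per-lane electrical interfaces (the raw rate slightly exceeds 100 Gbps to leave room for FEC overhead):

```python
# Lane bit rate = symbol (baud) rate x bits per symbol.
def lane_rate_gbps(baud_gbd: float, bits_per_symbol: int) -> float:
    """Raw lane bit rate in Gbps."""
    return baud_gbd * bits_per_symbol

nrz_lane  = lane_rate_gbps(53.125, 1)  # NRZ: 1 bit per symbol -> ~53 Gbps
pam4_lane = lane_rate_gbps(53.125, 2)  # PAM4: 2 bits per symbol -> ~106 Gbps
print(nrz_lane, pam4_lane)
```

PAM4 doubles the bit rate without doubling the channel's analog bandwidth, at the cost of a much tighter signal-to-noise budget, which is precisely what drives the signal-quality and tolerance testing described in the next section.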
In response to the above test requirements, Keysight provides several major test platforms for high-performance AI chips, mainly including:
▼
High-speed interconnect channel analysis platform
High-speed signal quality verification platform
High-speed interface tolerance test platform
High-speed storage bus performance analysis platform
High-speed interconnect simulation design platform
Massively parallel semiconductor parameter test platform
Next, we will introduce each platform separately. Don't miss it!
We are committed to helping enterprise, service-provider, and government customers accelerate innovation and create a secure, connected world. Tracing its roots to the founding of HP in 1939, Keysight Technologies began operating independently as a new electronic test and measurement company on November 1, 2014. We continue to uphold the same entrepreneurial spirit and passion as we embark on this new journey, inspiring global innovators and helping them achieve goals beyond imagination. Our solutions help customers innovate in 5G, automotive, IoT, network security, and other fields.