NVIDIA surprised many people, including long-time NVIDIA watchers, when it announced it would spend $6.9 billion to acquire data center networking company Mellanox. This is by far NVIDIA's largest acquisition ever; its previous purchases were much smaller, and the targets were often acquired at fire-sale prices. The closest deal in significance was its 2001 purchase of the assets of competitor 3dfx, when NVIDIA was a much smaller company.
As I explained in a previous article, the purchase of the 3dfx assets (and the hiring of 100 of its employees) was an easier move to understand, because those assets could be put to work immediately in NVIDIA's core business: PC graphics processors. Mellanox, by contrast, has long been in a completely different business, data center networking. Its products complement NVIDIA's, with no overlap.
With this acquisition, NVIDIA is saying that it is no longer just a GPU company. With its accelerator business growing rapidly and now extending into the network, NVIDIA is a data center company.
Mellanox CEO Eyal Waldman and NVIDIA CEO Jensen Huang on stage at GTC 2019
There are many interesting aspects to the Mellanox acquisition: NVIDIA's deeper entry into Israel's tech industry; Mellanox's other compute-related assets (EZchip and Tilera); how Jensen Huang's management style will play out in Israel; and Mellanox's support for the CCIX accelerator interconnect protocol versus NVIDIA's own NVLink. Later posts will explore each of these topics in depth. But for now, let's explore this new NVIDIA.
How NVIDIA Became a Data Center Company
It all started around 2006 with a discovery at Stanford University. Researchers there were using graphics processing units (GPUs) for computationally intensive workloads and found that GPUs delivered a significant improvement in performance per watt over traditional processors (CPUs).
It turned out that the many small computing elements used to process pixels (texture processing) could be pressed into service for crude scientific computing. The field was initially called GPU compute. At the same time, graphics itself was becoming more complex, and full-featured math processing capabilities were added to GPUs. Some people at NVIDIA, including Professor Bill Dally and the late John Nickolls, saw an opportunity to expand the use of GPUs and play a major role in the high-performance computing (HPC) market. As a result, NVIDIA built on its Quadro product line for graphics computing, added more HPC-oriented features to its GPUs, and launched the Tesla product line specifically for numerical computing.
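The appeal of those many small compute elements is data parallelism: the same scalar operation applied independently to millions of elements at once. A minimal sketch of that pattern (illustrative only, not NVIDIA's API; the function name is my own) is the classic SAXPY operation:

```python
import numpy as np

def saxpy(a, x, y):
    # Each output element depends only on the matching input elements,
    # so every element could be computed by a separate GPU thread in parallel.
    return a * x + y

x = np.arange(4, dtype=np.float32)  # [0, 1, 2, 3]
y = np.ones(4, dtype=np.float32)
print(saxpy(2.0, x, y))             # [1. 3. 5. 7.]
```

On a GPU, each lane of this computation maps onto one of the thousands of small processing elements; on a CPU, the same loop runs over a handful of wide cores.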
The company also developed the CUDA programming framework for its GPUs, and has never supported it on any other vendor's GPUs. AMD, the main competing GPU supplier, chose to wait for OpenCL to mature, but that software developed much more slowly. On this foundation, NVIDIA became very successful in HPC and rose to the top of the TOP500 list of supercomputers; its GPUs power the two fastest supercomputers in the world.
NVIDIA CEO Jensen Huang shows off the company's growth in supercomputing
Because of NVIDIA's work on GPU computing for HPC, some researchers in the AI field decided to use GPUs to accelerate new machine learning algorithms called deep convolutional neural networks (DCNNs). The combination of the new DCNNs and GPUs made training and inference of AI neural networks faster and more accurate than before. This drove a Cambrian explosion of AI research and applications, with NVIDIA leading the trend. The company quickly adapted its GPUs to these new workloads, adding new math functions and even specialized processing elements called Tensor Cores. NVIDIA also developed a software library called cuDNN, optimized for CUDA and deep neural networks.
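The core operation that cuDNN and Tensor Cores accelerate is the convolution at the heart of a DCNN. A hedged sketch of that operation (a direct, unoptimized version in plain Python; real libraries use highly tuned GPU kernels, and deep-learning frameworks technically compute cross-correlation):

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image and take a weighted sum at each
    # position -- the basic building block of a convolutional layer.
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=image.dtype)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

img = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]], dtype=np.float32)
edge = np.array([[1, -1],
                 [1, -1]], dtype=np.float32)  # crude vertical-edge filter
print(conv2d(img, edge))
```

Each output position is independent of the others, which is why this workload, like pixel shading before it, maps so naturally onto a GPU's many compute elements.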
With the explosion of AI research, each major cloud vendor has also developed its own framework: Google has TensorFlow, Facebook has PyTorch/Caffe2, and so on. Even with this fragmentation of AI frameworks, the field is still growing rapidly. Because researchers keep inventing new algorithms, a flexible approach has long-term cost-of-ownership benefits. This is where flexible accelerators such as GPUs (or FPGAs) excel, as they adapt easily to new algorithms. In his GTC 2019 keynote, Jensen called this architecture "PRADA": programmable acceleration of multiple domains from one architecture. This architectural compatibility lets customers build on an installed base of software and systems and reduces infrastructure cost.
Jensen Huang explains his acronym PRADA
From Chips to Systems
In his keynote, Huang proposed that data science is the fourth pillar of the scientific method. NVIDIA recognizes that data scientists and AI researchers are in short supply, so the productivity of these people is very important. To sustain this momentum, it is important to bring resources to a wider range of developers. The company has therefore designed a series of DGX workstations and servers, fully loaded with CUDA-X tools and libraries for ML research, and is expanding its reach to data scientists with new data science platforms from multiple system OEMs, including Dell, HP, and Lenovo.
Even with new systems and tools, the industry still faces the challenge of sifting new and existing data for business and scientific insights; data science must grapple with simply having too much data. As we enter the era of self-driving cars, vehicles will generate billions of bytes of information that need to be processed. That's why NVIDIA believes more and more data centers will need to build in AI processing to sort through all this data.
Supercomputers and HPC
In HPC, NVIDIA focuses on delivering maximum computing performance to solve very large problems (scale up). Hyperscale data centers, by contrast, typically run many computing tasks simultaneously (scale out). The needs of data science fall somewhere in between: large datasets and many users, with both scale-up and scale-out characteristics.
To meet these different needs, NVIDIA has worked with Mellanox on many server projects to provide rack-level networking. Mellanox's success made it an acquisition target for various chip and cloud companies, reportedly including Intel and Microsoft. Rather than going to one of those suitors, however, Mellanox sought a friendlier partner in NVIDIA, and Jensen Huang seized the fleeting opportunity to become Mellanox's white knight.
With workloads for data analytics frameworks like Hadoop, Spark, and RAPIDS increasingly containerized and run at hyperscale, data centers are seeing exponential growth in rack-to-rack traffic, often referred to as east-west communication. That makes low-latency networking critical to creating the compute fabric.
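Why latency, and not just bandwidth, dominates east-west traffic can be seen with a back-of-envelope model (my own illustrative numbers, not vendor data): the time to move a message between racks is roughly the fabric latency plus size divided by bandwidth, and for the small, frequent transfers of distributed analytics the latency term swamps the transfer term.

```python
def transfer_time_us(size_bytes, latency_us, bandwidth_gbps):
    # Simple model: total time = fixed latency + serialization time.
    # Gbit/s * 1e3 converts to bits per microsecond.
    return latency_us + size_bytes * 8 / (bandwidth_gbps * 1e3)

small = 4 * 1024  # a 4 KiB shard, typical of chatty east-west traffic
print(transfer_time_us(small, latency_us=1.0, bandwidth_gbps=100))   # low-latency fabric
print(transfer_time_us(small, latency_us=50.0, bandwidth_gbps=100))  # high-latency fabric
```

At 100 Gbit/s the 4 KiB payload itself takes only ~0.33 µs on the wire, so cutting fabric latency from 50 µs to 1 µs speeds up the transfer by roughly 38x even though bandwidth is unchanged.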
Mellanox's networking technology can make data centers flexible enough to adapt to these changing workloads. A key Mellanox development is offloading networking tasks from CPUs to accelerators, and in the future the company plans to add AI to its switch products to move data more efficiently.
For scale-up applications like HPC, the goal is to make multiple GPUs act like one giant GPU. That's where NVIDIA's NVLink comes in, tying multiple GPUs together. For the broader infrastructure, there are the Tesla T4 cards: 70 W, half-height PCIe cards that fit into a 2U rack chassis, so they can be added to existing data centers in large numbers. The T4 is NVIDIA's most flexible data center offering; it can be used for inference, training (at different speeds than the V100), data science, video transcoding, and VDI (virtual desktop) applications.
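The "one giant GPU" idea boils down to sharding a large problem across devices, reducing locally, and combining partial results over the interconnect. A minimal sketch of that pattern (the function name and device simulation are my own; real NVLink systems do this in hardware and driver software):

```python
import numpy as np

def parallel_sum(data, n_devices):
    # Split the array into one shard per "device"...
    shards = np.array_split(data, n_devices)
    # ...reduce each shard locally (this is the part each GPU does)...
    partials = [shard.sum() for shard in shards]
    # ...then combine the partial results across the interconnect.
    return sum(partials)

data = np.arange(1_000_000, dtype=np.float64)
print(parallel_sum(data, n_devices=8))  # same result as data.sum()
```

The faster and lower-latency the link between devices, the cheaper that final combining step is, which is exactly what NVLink provides between GPUs and what Mellanox provides between servers and racks.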
Going forward, NVIDIA will pay more attention to inference in cloud and edge applications, the area where it competes most fiercely with Intel.
While there are many contenders for the AI accelerator throne, NVIDIA remains king of the hill with the largest installed base. With the Mellanox acquisition, it has expanded its reach in the data center.