In the past, the name Arm brought to mind mobile phones and embedded systems. That changed in 2018, when Arm announced Neoverse and entered the high-performance computing market. Four years on, Arm-based infrastructure has become a clear trend. As Arm CEO Rene Haas said: "All major public cloud service providers in the world are now using Arm architecture."
Arm Neoverse milestones in 2022
Chris Bergey, senior vice president and general manager of Arm's Infrastructure Business Unit, reviewed the important events of Arm Neoverse in 2022, including:
Arm is now used in major public clouds around the world, including those of AWS, Microsoft, Google, Alibaba, Oracle and other technology giants. AWS deserves a special mention. A month ago, Amazon Vice President James Hamilton described how the company started its journey to custom chips. In 2013, Hamilton made a two-point argument to Jeff Bezos. First, given the number of chips shipped using the Arm architecture, he was sure that an excellent Arm-based server CPU would eventually be designed. Second, he had noticed that more and more functions were gradually migrating from the motherboard into the SoC. Mobile phones had already shown this pattern, and he believed servers would naturally follow. AWS had been building custom servers for many years and had created more value for customers through customization. But if all server innovation moved into chips and AWS did not build chips, its ability to innovate would be limited. The conclusion was that AWS needed to start building CPUs. This prompted the acquisition of Annapurna Labs, which went on to create the AWS Graviton series of CPUs based on Arm Neoverse.
In 5G RAN, Neoverse is everywhere. At Mobile World Congress (MWC), Dell announced that it would use Marvell's OCTEON Fusion platform to develop O-RAN accelerator cards. Qualcomm has also partnered with Rakuten and HPE on products based on the Arm Neoverse platform.
In the HPC market, NVIDIA released the Grace superchip for AI and high-performance computing (HPC), which is based on the latest Armv9 architecture. A single socket has 144 CPU cores, offers the highest single-thread core performance, and supports Arm's new generation of vector extensions. It can achieve twice the memory bandwidth and energy efficiency of today's leading server chips.
Neoverse is also gaining recognition at the software and system level. For example, VMware is using DPUs in Project Monterey, Red Hat OpenShift supports the Arm architecture, SAP HANA is migrating its cloud infrastructure to AWS Graviton, and HPE's ProLiant Gen11 platform is equipped with the Ampere Altra processor based on Arm Neoverse.
Neoverse-based processors have also reached a series of firsts, including:
The first CPU to exceed 1 terabyte per second of total memory bandwidth
The first CPU with more than 100 cores on a single die, with the number of cores reaching 128
The first CPU to bring DDR5 and PCIe Gen5.0 to market
The first CPU to break the 500 integer score in the SPEC CPU 2017 benchmark
Arm releases latest Neoverse roadmap
"Arm architecture is the cornerstone of the future of global computing," said Bergey. "Today's infrastructure is customized, from SSDs and HDDs to DPUs and video accelerators. The server CPU is the last standard product, and it will not continue to develop as a general-purpose product. At the same time, computing workloads are growing rapidly and becoming more complex, and ML and AI are taking over from traditional workloads. Another problem is power consumption. Currently, the electricity expenditure of large Internet companies accounts for 30-40% of the total cost of ownership (TCO), which is only slightly lower than that of telecommunications network operators."
For this reason, Arm announced the latest Neoverse roadmap to meet infrastructure upgrade requirements.
Neoverse is divided into V, N and E series cores, each targeting a different performance profile. The V series pursues maximum performance, the N series targets scale-out performance with a focus on efficiency, and the E series focuses on efficient throughput.
As shown in the figure, Arm has announced a detailed roadmap for each of the V, N and E series.
Dermot O'Driscoll, vice president of product solutions at Arm's Infrastructure Business Unit, said that single-chip performance and single-thread performance are the two key indicators for cloud decision makers. Single-thread performance determines whether they can migrate their most demanding "scale-up" workloads to Arm. Single-chip performance is the key to maximizing return on investment when a large number of "scale-out" workloads run on the platform. "AWS Graviton3 using Arm Neoverse V1 cores can provide the highest single-thread performance, and even the upcoming competing CPUs cannot shake its leading position. We expect Graviton3 to provide excellent price-performance and performance per watt, while Ampere Altra Max and Alibaba's Yitian 710 can provide the best single-chip throughput among all CPUs."
Beyond hardware, O'Driscoll said that Arm has been working to implement and optimize full-stack solutions, from architecture and IP to libraries, runtimes and compilers, to achieve optimal performance across the full range of infrastructure software.
Test results also show that Arm has matched or even surpassed traditional architectures in infrastructure workloads. Taking the mainstream data store MongoDB as an example, O'Driscoll compared AWS instances based on Graviton2 and Intel Xeon and found that MongoDB performance on Graviton2 was 117% better than on the x86 instance.
O'Driscoll also said that as machine learning becomes more popular, Neoverse V1 includes a set of features specifically designed to improve the performance of ML applications. Arm's work here included:
Adding Bfloat16 (BF16) support to the architecture
Adjusting the microarchitecture of V1, N2 and subsequent designs to improve BF16 execution with BERT
Adding BF16 support to the Arm Compute Library (ACL)
Integrating ACL into the oneDNN library
Using oneDNN with TensorFlow to run BERT
Arm ran BERT on AWS EC2 C7g instances based on V1 cores and compared the results with C6i instances using the latest Xeon cores. The BF16-optimized stack on the Arm architecture performed 80% better than the Intel instance. The addition of BF16 and Int8 MatMul in V1 also means that ML models can be stored more compactly in memory and therefore need less memory bandwidth, making Graviton3's ML performance 3 times that of Graviton2.
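To make the memory argument concrete, here is a minimal Python sketch of converting float32 values to BF16 by keeping the upper 16 bits with round-to-nearest-even. It is an illustration under stated assumptions, not Arm's, ACL's or oneDNN's implementation: it assumes NumPy is available, the function names float32_to_bf16_bits and bf16_bits_to_float32 are hypothetical, and NaN and infinity are not handled.

```python
import numpy as np

def float32_to_bf16_bits(x):
    """Round float32 values to BF16 and return the raw 16-bit patterns.

    Illustrative only: NaN and infinity are not special-cased.
    """
    bits = np.ascontiguousarray(x, dtype=np.float32).view(np.uint32)
    # Round to nearest, ties to even, before dropping the low 16 mantissa bits.
    rounding = ((bits >> 16) & 1) + np.uint32(0x7FFF)
    return ((bits + rounding) >> 16).astype(np.uint16)

def bf16_bits_to_float32(b):
    """Expand BF16 bit patterns back to float32 by zero-filling the low bits."""
    return (b.astype(np.uint32) << 16).view(np.float32)

# Hypothetical "weights": 1000 float32 values between 0.1 and 10.
weights = np.linspace(0.1, 10.0, 1000, dtype=np.float32)
packed = float32_to_bf16_bits(weights)
restored = bf16_bits_to_float32(packed)

# Worst-case relative error introduced by the narrower mantissa.
max_rel_err = float(np.max(np.abs(restored - weights) / weights))
print(weights.nbytes, packed.nbytes, max_rel_err)
```

Each value occupies 2 bytes rather than 4, so the same weights need half the storage and half the memory traffic. The cost is precision: BF16 keeps the float32 exponent range but only 8 significant bits, roughly two to three decimal digits.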
Turning to the Neoverse V2 platform, O'Driscoll said that it can simultaneously meet three customer needs: "want to improve the performance of cloud workloads", "continue to advance single-threaded performance while balancing power consumption and area", and "ship as soon as possible to help quickly open up the market."
On performance, Neoverse V2 will provide market-leading integer performance. Arm currently bases its estimates on SPEC Integer Rate, and has been using a range of cloud infrastructure workloads in its models to tune the microarchitecture. The results across the whole range make O'Driscoll "very excited". Vector performance remains important for workloads such as HPC that are rapidly migrating to the cloud. With Neoverse V2, Arm has completed the transition from SVE to SVE2, which serves more non-HPC workloads such as ML and adds more cryptographic instructions. In addition, the vector engine has been reorganized into four 128-bit pipelines, and the microarchitecture has been tuned to increase effective throughput.
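As a rough illustration of what a four-way, 128-bit vector engine means for different data types, the following Python sketch counts how many elements fit across the vector units at once. It is back-of-the-envelope arithmetic under stated assumptions, not a performance model: the function name lanes_per_cycle is hypothetical, it assumes every unit can accept one full-width operation at the same time, and it ignores instruction mix, latency and memory limits.

```python
def lanes_per_cycle(pipelines, vector_bits, element_bits):
    """Upper bound on elements processed at once across all vector units."""
    if vector_bits % element_bits != 0:
        raise ValueError("element width must divide the vector width")
    return pipelines * (vector_bits // element_bits)

# Figures taken from the article: four vector units, each 128 bits wide.
PIPELINES = 4
VECTOR_BITS = 128

for name, width in (("FP32", 32), ("BF16", 16), ("INT8", 8)):
    print(name, lanes_per_cycle(PIPELINES, VECTOR_BITS, width))
# prints:
# FP32 16
# BF16 32
# INT8 64
```

The pattern shows why narrower types such as BF16 and Int8 matter for ML: halving the element width doubles the number of elements each vector operation can touch.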
Neoverse V2 also brings a series of improvements at the system, I/O and security layers, which can be seen in the performance of NVIDIA's Grace superchip.
O'Driscoll did not reveal more about the progress of the N and E series, saying only that the N series product line will be updated next year. In terms of market adoption, nearly 20 customers are currently designing products based on the N2 platform.
O'Driscoll said: "The infrastructure market is being redefined, centered on Arm's high-performance, scalable and efficient computing, and enhanced by dedicated processing from our partners. Building on the principles of the Arm Neoverse platform roadmap, we will lay a new starting point for global computing infrastructure." His remarks serve as both a summary of the four years since Arm Neoverse was launched and an outlook on what comes next.