A supercomputing cluster with 131,072 GPUs
Source: compiled from Tom's Hardware.
Oracle on Wednesday unveiled new clusters for AI training through Oracle Cloud Infrastructure (OCI). The most powerful cluster will be based on Nvidia's upcoming Blackwell GPUs and will deliver up to 2.4 zettaFLOPS of peak AI performance, making it more powerful than the AI cluster recently announced by Elon Musk.
Oracle's new superclusters can be configured with Nvidia's Hopper or Blackwell GPUs for AI and HPC workloads. Networking options include ultra-low-latency RoCEv2 with ConnectX-7 NICs and ConnectX-8 SuperNICs or Nvidia's Quantum-2 InfiniBand-based networking, and customers can choose HPC storage to match their performance needs. The configurations are:
An OCI supercluster with H100 GPUs can support up to 16,384 GPUs, delivering 65 FP8/INT8 exaFLOPS of peak performance and 13 Pb/s (13 petabits per second) of combined network throughput.
The OCI supercluster powered by H200 GPUs will be available later this year and will scale to 65,536 GPUs, delivering up to 260 FP8/INT8 exaFLOPS and 52 Pb/s of network throughput.
Finally, the Blackwell B200 GPU-based OCI supercluster will scale to 131,072 GPUs and deliver up to 2.4 FP8/INT8 zettaFLOPS of peak performance.
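As a quick sanity check on the three configurations above, the quoted peak figures can be divided by the GPU counts to see what they imply per GPU. This is a minimal sketch: the dictionary layout and function names are illustrative, and the per-GPU values are derived by simple division rather than taken from Oracle or Nvidia specifications.

```python
# Peak figures as quoted in the article; per-GPU values are derived by simple
# division and are not published Oracle or Nvidia specifications.
CONFIGS = {
    "H100": {"gpus": 16_384, "peak_exaflops": 65, "network_pbps": 13},
    "H200": {"gpus": 65_536, "peak_exaflops": 260, "network_pbps": 52},
    # 2.4 zettaFLOPS = 2,400 exaFLOPS; no network figure is quoted for B200.
    "B200": {"gpus": 131_072, "peak_exaflops": 2_400, "network_pbps": None},
}


def per_gpu_petaflops(gpus: int, peak_exaflops: float) -> float:
    """Implied peak per GPU in petaFLOPS (1 exaFLOPS = 1,000 petaFLOPS)."""
    return peak_exaflops * 1_000 / gpus


def per_gpu_gbps(gpus: int, network_pbps: float) -> float:
    """Implied network share per GPU in Gb/s (1 Pb/s = 1,000,000 Gb/s)."""
    return network_pbps * 1_000_000 / gpus


for name, cfg in CONFIGS.items():
    line = f"{name}: {per_gpu_petaflops(cfg['gpus'], cfg['peak_exaflops']):.2f} PFLOPS per GPU"
    if cfg["network_pbps"] is not None:
        line += f", {per_gpu_gbps(cfg['gpus'], cfg['network_pbps']):.0f} Gb/s per GPU"
    print(line)
```

On these numbers, the H100 and H200 configurations both work out to roughly 3.97 petaFLOPS and about 793 Gb/s per GPU, while the B200 configuration implies about 18.3 petaFLOPS per GPU.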
OCI's upcoming supercluster far exceeds current leading systems in GPU count. According to Oracle, the top-of-the-line B200-based OCI supercluster will have more than three times as many GPUs as the Frontier supercomputer (which uses 37,888 AMD Instinct MI250X GPUs) and more than six times as many as other hyperscalers offer.
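The Frontier comparison can be checked directly from the two GPU counts given in the article. This is a small sketch and the constant names are illustrative.

```python
# GPU counts as quoted in the article.
OCI_B200_GPUS = 131_072
FRONTIER_GPUS = 37_888

# Ratio of the largest OCI supercluster to Frontier, by GPU count only.
ratio = OCI_B200_GPUS / FRONTIER_GPUS
print(f"OCI B200 supercluster has {ratio:.2f}x the GPUs of Frontier")
```

The ratio comes to about 3.46, consistent with "more than three times". It compares GPU counts only and says nothing about relative performance, since the two systems use different GPUs and are rated at different precisions.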
“We have one of the broadest AI infrastructure offerings and support customers running some of the most demanding AI workloads in the cloud,” said Mahesh Thiagarajan, executive vice president of Oracle Cloud Infrastructure. “With Oracle’s distributed cloud, customers have the flexibility to deploy cloud and AI services wherever they choose, while retaining the highest levels of data and AI sovereignty.”
Several companies are already benefiting from this advanced infrastructure. WideLabs and Zoom are leveraging OCI’s high-performance AI infrastructure to accelerate their AI development while maintaining sovereign control.
“As businesses, researchers and countries race to innovate with AI, access to powerful computing clusters and AI software is critical,” said Ian Buck, vice president of hyperscale and high-performance computing at Nvidia. “Nvidia’s full-stack AI computing platform on Oracle’s widely distributed cloud will deliver AI computing power at an unprecedented scale to advance the world’s AI efforts and help organizations around the world accelerate research, development and deployment.”
The upcoming OCI supercluster will use Nvidia's liquid-cooled GB200 NVL72 racks, in which 72 GPUs communicate with each other in a single NVLink domain with an aggregate bandwidth of 129.6 TB/s. Oracle says Nvidia's Blackwell GPUs will be available in the first half of 2025 (Blackwell supply is limited this year), but it is unclear when OCI will offer fully loaded Blackwell-powered clusters.
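The NVL72 figures lend themselves to a quick back-of-the-envelope calculation. The sketch below derives the NVLink bandwidth per GPU from the quoted aggregate and, under the assumption (not stated by Oracle) that a maximum-size cluster is built entirely from 72-GPU NVL72 racks, estimates how many racks 131,072 GPUs would take. The constant names are illustrative.

```python
# Figures quoted in the article.
NVL72_GPUS = 72                 # GPUs per NVLink domain
NVL72_AGGREGATE_TBPS = 129.6    # aggregate NVLink bandwidth per domain, TB/s
MAX_CLUSTER_GPUS = 131_072      # largest announced cluster size

# NVLink bandwidth per GPU implied by the aggregate figure.
per_gpu_tbps = NVL72_AGGREGATE_TBPS / NVL72_GPUS

# Assumption: the cluster is built only from full 72-GPU racks.
# Ceiling division gives the number of racks needed.
racks_needed = -(-MAX_CLUSTER_GPUS // NVL72_GPUS)

print(f"{per_gpu_tbps:.1f} TB/s of NVLink bandwidth per GPU")
print(f"{racks_needed} NVL72 racks for {MAX_CLUSTER_GPUS:,} GPUs")
```

This gives 1.8 TB/s per GPU and 1,821 racks. Because 131,072 is not a multiple of 72, the last rack in this hypothetical layout would be only partly used, so the real deployment may be organized differently.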
The first zettascale cloud computing cluster
Oracle today announced the availability of the first zettascale cloud computing clusters accelerated by the NVIDIA Blackwell platform. Oracle Cloud Infrastructure (OCI) is now accepting orders for the largest AI supercomputer in the cloud, which can be equipped with up to 131,072 NVIDIA Blackwell GPUs.
At its largest scale, the OCI Supercluster delivers an unprecedented 2.4 zettaFLOPS of peak performance, with more than three times as many GPUs as the Frontier supercomputer and more than six times as many as other hyperscalers offer. The OCI Supercluster includes OCI Compute Bare Metal, ultra-low-latency RoCEv2 with ConnectX-7 NICs and ConnectX-8 SuperNICs or NVIDIA Quantum-2 InfiniBand-based networking, and a choice of HPC storage.
OCI Superclusters can be ordered with OCI Compute powered by NVIDIA H100 or H200 Tensor Core GPUs or NVIDIA Blackwell GPUs. OCI Superclusters with H100 GPUs scale to 16,384 GPUs with up to 65 exaFLOPS of performance and 13 Pb/s of aggregate network throughput. OCI Superclusters with H200 GPUs will scale to 65,536 GPUs with up to 260 exaFLOPS of performance and 52 Pb/s of aggregate network throughput, and will be available later this year. OCI Superclusters with NVIDIA GB200 NVL72 liquid-cooled bare metal instances will use NVLink and NVLink Switch to let up to 72 Blackwell GPUs communicate with each other in a single NVLink domain with an aggregate bandwidth of 129.6 TB/s. NVIDIA Blackwell GPUs will be available in the first half of 2025 with fifth-generation NVLink, NVLink Switch, and cluster networking for seamless GPU-to-GPU communication in a single cluster.
Customers such as WideLabs and Zoom are leveraging OCI’s high-performance AI infrastructure with strong security and sovereign control.
Reference Links
https://www.tomshardware.com/tech-industry/artificial-intelligence/nvidia-and-oracle-team-up-for-zettascale-cluster-available-with-up-to-131072-blackwell-gpus