Ampere Arm CPU latest roadmap, chips with 512 cores and AI accelerators to be unveiled
Source: compiled by Semiconductor Industry Observer (ID: icbank) from The Next Platform.
With all the hyperscalers and major cloud builders designing their own CPUs and AI accelerators, there’s enormous pressure on the companies that sell compute engines to them. That includes Intel, AMD, and Nvidia, of course. But it also includes Arm server chip upstart Ampere Computing, which is competing with them in the CPU space and, soon, in AI processing.
For server CPUs, the cloud giants account for more than half of server revenues and more than half of shipments; for the datacenter GPU, the currently dominant AI accelerator, these companies may account for 65 percent, or even 70 to 75 percent, of revenues and shipments. (We have no hard data here, so the margin of error on that statement is wide.) As GenAI becomes more mainstream and GPUs become more plentiful (the two reinforce each other), datacenter GPUs and other kinds of AI accelerators should reach revenue and shipment shares similar to those of server CPUs. As we have discussed before, at some point half of global server revenue will come from AI-accelerated machines and the other half from general-purpose CPU machines.
Unless, that is, what is happening with AI on PCs also happens on datacenter servers. Server CPUs can gain more AI compute for local inference, close to the applications that need it, just as our PCs and phones are getting embedded neural processing now. We have believed, and still believe, that a large portion of AI inference will be done on server CPUs, but we did not foresee how huge the parameter counts and model weights of the large language models behind GenAI would become. They are growing faster than we expected, and we are not the only ones seeing it.
This means that the AI processing power integrated into server CPUs will have to grow faster than we expected, something Ampere Computing Chief Product Officer Jeff Wittich talked to us about in April when the company did a small reveal of its AmpereOne CPU roadmap.
Today, Wittich is making a bigger reveal, as Ampere Computing unveiled plans to bring an Arm server CPU with homegrown Arm cores, a homegrown mesh interconnect, and now a homegrown integrated AI accelerator, dubbed “Aurora,” to the field, which could arrive in late 2025 or early 2026.
Ampere Computing’s new 2024 roadmap is consistent with the enhanced roadmap we created in 2022 based on everything we knew at the time:
It’s worth noting that the calendar date shown above is the date Ampere Computing announced the chip, not the date it started shipping.
Since Ampere Computing's second-generation AmpereOne chip has no codename, we invented one in keeping with the X-Men theme the company has used for years and called it "Polaris," after Magneto's daughter, who has also gone by the name "Magnetrix."
Here's the updated roadmap, which doesn't show any more iterations than above, but names the fourth-generation AmpereOne chip "Aurora," after a member of the Canadian X-Men team Alpha Flight, and adds some details about the future Ampere Computing Arm CPU architecture:
This roadmap tells us the current state of past, present, and future Altra and AmpereOne processors. We added the codenames and names of the Arm Neoverse cores used in the Altra family, as well as our own names for the Ampere Computing homegrown cores used in the AmpereOne family. (The company doesn’t talk about its cores this way, and we’re trying to encourage that behavior by doing this.)
We've talked about the 192-core, eight-channel Polaris chip several times over the past year, and in April we talked about a variant of this chip with 12 DDR5 memory channels, as well as a kicker chip with 256 cores and 12 memory channels (which we called Magnetrix, in line with what we guessed was the company's MX name).
The big change with Aurora is that Ampere Computing has added its own AI engine—presumably a tensor core that’s more flexible than a flat matrix multiplication engine, but Wittich didn’t say. The Aurora chip will have as many as 512 cores and will likely have at least 16 memory channels, with 24 channels sounding like a better balance. (A mix of HBM4 and DDR6 memory is possible, but unlikely given the high cost of adding HBM memory to a CPU.) The chip will also feature a homegrown mesh interconnect to connect the CPU cores and AI cores together.
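To put those channel counts in perspective, here is a hedged back-of-the-envelope on memory bandwidth per core. The channel speed is our assumption (DDR5-6400, roughly 51.2 GB/s per channel), not a figure Ampere has confirmed, and the 512-core count is the maximum mentioned above:

```python
# Illustrative only: DRAM bandwidth per core for a hypothetical 512-core
# Aurora part. DDR5-6400 is an assumed speed, not an Ampere spec:
# 8 bytes per transfer * 6.4 GT/s = 51.2 GB/s per channel.
CORES = 512
GBPS_PER_CHANNEL = 51.2  # assumed DDR5-6400

for channels in (16, 24):
    total = channels * GBPS_PER_CHANNEL
    print(f"{channels} channels: {total:.0f} GB/s total, "
          f"{total / CORES * 1024:.0f} MB/s per core")
```

On these assumptions, 16 channels gives each core about 1.6 GB/s of DRAM bandwidth and 24 channels about 2.4 GB/s, which is why 24 channels "sounds like a better balance" for a 512-core part.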
We also think that future A2+ and A3 cores (as we call them) will get more vector units, but probably not bigger ones. (AMD has decided to go with four 128-bit units in its “Genoa” Epyc 9004 series, rather than one 512-bit vector or two 256-bit vectors.) The AmpereOne and AmpereOne M have two 128-bit vectors per core, and we wouldn’t be surprised if the AmpereOne MX and AmpereOne Aurora have four 128-bit vectors per core. (See why we’re using codenames? There’s a lot of duplication there.) So there will be different levels of AI acceleration in these chips, and for Aurora we don’t think vector units will be taken away from the A3 cores to make room for more cores on the chip at a given process.
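The per-core arithmetic behind that guess is simple. The sketch below computes peak FP32 operations per cycle from vector width alone, assuming fused multiply-add (two flops per lane per cycle) on 128-bit units with four FP32 lanes each; these are generic NEON-style assumptions, not Ampere-confirmed microarchitecture figures:

```python
# Rough peak FP32 flops per core per cycle from vector resources alone.
# Assumes FMA (2 flops per lane per cycle); not Ampere-confirmed figures.
def peak_fp32_flops_per_cycle(vector_units, vector_bits=128, fma=True):
    lanes = vector_bits // 32          # FP32 lanes per 128-bit unit: 4
    return vector_units * lanes * (2 if fma else 1)

print(peak_fp32_flops_per_cycle(2))   # two 128-bit units (AmpereOne) -> 16
print(peak_fp32_flops_per_cycle(4))   # four 128-bit units (our guess) -> 32
```

Doubling the unit count doubles peak per-core throughput without widening the datapath, which matches the "more vector units, but probably not bigger ones" bet above.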
Perhaps most interestingly, even though there’s a large AI engine inside the package (likely as a separate chip), the Aurora chips will remain air-cooled.
“These solutions have to be air-cooled,” Wittich explained during a preview of the Polaris chip’s roadmap and microarchitecture, which we’ll cover separately. “These solutions need to be deployed in all existing data centers, not just a few operating in a few locations, and they have to be efficient. There’s no reason to give up on the industry’s climate goals. We can do both at the same time — we can build AI compute, and we can achieve our climate goals.”
As Wittich described in his briefing, it's a daunting challenge. According to the Uptime Institute 2023 Global Data Center Survey, 77 percent of data centers have a maximum power draw of less than 20 kilowatts per rack, and 50 percent are under 10 kilowatts per rack. Today, a rack of GPU-accelerated systems (which must be liquid cooled) consumes about 100 kilowatts, and there are discussions about increasing density to reduce latency between devices and nodes, which would push power consumption up to 200 kilowatts per rack. Wittich has even been in serious discussions with companies that want to put 1 megawatt into a rack and somehow cool it.
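A quick bit of arithmetic shows why those rack caps matter for air cooling. The per-server wattage here is our assumption (roughly 400 W for a dense air-cooled CPU server); the rack figures come from the text above:

```python
# Illustrative rack math: how many air-cooled CPU servers fit under
# common rack power caps. The ~400 W per-server figure is our assumption,
# not a number from Ampere or the Uptime Institute survey.
SERVER_W = 400  # assumed dense air-cooled CPU server

for rack_kw in (10, 100, 200):
    print(f"{rack_kw} kW rack: ~{rack_kw * 1000 // SERVER_W} servers")
```

Under these assumptions, a typical 10 kW rack holds a couple dozen air-cooled CPU servers, while the 100 to 200 kW figures discussed for GPU racks are an order of magnitude beyond what most existing facilities can deliver or cool.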
We live in interesting times. But most of the enterprise IT world will have none of this. As we have pointed out before, Ampere Computing may have started out trying to make the Arm server processors that hyperscalers and cloud builders would buy, but it may end up selling most of its processors as a second source alongside those homegrown parts, and to enterprises that can neither afford nor design their own chips. Ampere Computing could position itself as the safe Arm server chip choice for those who cannot make their own Arm chips but still want the benefits of a cloud-native Arm architecture.
For many companies that need cheaper computing and artificial intelligence, Aurora may be just the right chip.
One last thing: Ampere Computing has a lot of ex-Intel employees in its top ranks. Wouldn't it be interesting if Intel Foundry, rather than Taiwan Semiconductor Manufacturing Co, ended up etching the Aurora chips?
Reference Links
https://www.nextplatform.com/2024/07/31/ampere-arm-server-cpus-to-get-512-cores-ai-accelerator/