Known as "Little Nvidia", how did this company make a fortune in silence?
Source: Content from Information Equality, thank you.
With the large-scale rollout of generative AI, the prices of AI servers and their key components, including Nvidia's GPU chips and SK Hynix's HBM, have risen sharply in recent years. In another niche, however, a company has been quietly making a fortune: Astera Labs, which industry insiders call "Little Nvidia".
Astera Labs was founded in a garage in 2017, in typical Silicon Valley style. Co-founders Jitendra Mohan, Sanjay Gajendra and Casey Morrison worked in the high-speed interface business unit of Texas Instruments (TI). On March 20 this year, Astera Labs officially went public with an initial public offering price of $36. Today, its stock price is around $98, with a market value of $15.3 billion.
Astera Labs recently released its third-quarter financial report: revenue was $113 million, up 206% year-on-year and 47% quarter-on-quarter, beating Wall Street analysts' expectations by 15%. Gross margin reached 78%, slightly higher than Nvidia's 75%. Nor is this profitability a flash in the pan: Astera Labs expects fourth-quarter revenue of $126-130 million, which at the midpoint implies year-on-year growth of 153%, with a gross margin of 75%.
What exactly does Astera Labs sell? The company currently has three main product lines: Aries PCIe/CXL retimers, Taurus smart cable modules, and Leo CXL smart memory controllers. Of these, PCIe retimers are Astera Labs' "money printing machine". Chip giants such as Nvidia, AMD, and Intel, as well as technology giants such as Microsoft, Google, and Amazon, are all major customers.
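The growth percentages above can be turned back into implied earlier-period revenue as a quick sanity check. This is only a back-of-the-envelope sketch: the function name `implied_base` is made up for illustration, and it assumes that the 153% figure is year-on-year growth at the midpoint of the guidance range.

```python
def implied_base(current: float, growth_pct: float) -> float:
    """Earlier-period value implied by `current` after a `growth_pct` percent increase."""
    return current / (1 + growth_pct / 100)


q3_revenue = 113.0                              # $ million, from the report
q3_prior_year = implied_base(q3_revenue, 206)   # implied year-ago quarter
q2_revenue = implied_base(q3_revenue, 47)       # implied previous quarter

q4_low, q4_high = 126.0, 130.0                  # $ million, guidance range
q4_mid = (q4_low + q4_high) / 2
# Assumption: 153% is year-on-year growth measured at the guidance midpoint.
q4_prior_year = implied_base(q4_mid, 153)

print(f"Implied Q3 revenue a year earlier: ${q3_prior_year:.1f}M")
print(f"Implied Q2 revenue:                ${q2_revenue:.1f}M")
print(f"Q4 guidance midpoint:              ${q4_mid:.1f}M")
print(f"Implied Q4 revenue a year earlier: ${q4_prior_year:.1f}M")
```

The implied bases are derived purely from the percentages quoted in the article, not taken from the company's filings.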
How valuable are PCIe retimers? Even big manufacturers like Broadcom and Marvell want to get a piece of the pie.
Astera Labs’ Value in NVIDIA AI Servers
First, let's look at how many PCIe retimers and switches NVIDIA's current DGX server uses.
We know that a DGX server has a UBB (universal baseboard) carrying 8 GPGPUs and a CPU board (called the head node) carrying 2 CPUs. According to supply chain research, a standard DGX server is configured with 8 PCIe Gen5 retimers on the UBB (one per GPGPU) and 8 PCIe Gen5 retimers on the head node, matching the 8 on the UBB. Some MGX customers shorten the data transmission distance by changing the board layout so that only 4 retimers are needed on the head node, but the standard DGX design uses 8 + 8 retimers. In addition, a DGX server has two 144-lane PCIe Gen5 switches connecting the CPUs, GPUs, and CX7 network cards. Specifically, each PCIe switch connects to an Intel or AMD CPU, occupying 16 x 2 = 32 lanes; 2 CX7 network cards, occupying 16 x 2 = 32 lanes; and 4 GPGPU cards, occupying 16 x 4 = 64 lanes, for a total of 128 lanes. NVIDIA does not specify how the remaining 144 - 128 = 16 lanes are used, leaving that to customers and system manufacturers (see the figure below, which takes an AMD-CPU DGX as an example):
Of these parts, NVIDIA sources the PCIe Gen5 retimers from Astera Labs at a volume price of $30~35 each (depending on the customer's order quantity); the PCIe Gen5 switch is Broadcom's PEX89144, at a volume price of $400~450 each. With the DGX server covered, let's look at the PCIe topology diagram of NVIDIA's GB200 compute tray:
There may be a misunderstanding here. Because Astera Labs announced at the OCP conference that its Scorpio PCIe Gen6 switch will be used in GB200, some investors mistakenly believe that the blue PCIe fanout switch in the picture above is that Gen6 switch. In fact, it is just a PCIe Gen3 switch (16 uplinks connected to the Grace CPU + 18 downlinks connected to USB/BMC/boot/debug networks) that manages miscellaneous peripheral devices in the compute tray, and it is supplied by Diodes, an American analog chip company. The standard version of NVIDIA's GB200 reference design contains no PCIe Gen6 switch; only hyperscaler customers who use network cards other than NVIDIA's CX8 and/or CPUs other than NVIDIA's Grace need to install one in the GB200 compute tray.
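The lane accounting for one DGX PCIe switch can be tallied in a few lines. This is a minimal sketch that simply restates the figures quoted above; the variable names are illustrative, and treating every connection as an x16 link follows the article's own "16 x n" arithmetic rather than any NVIDIA specification.

```python
SWITCH_LANES = 144      # lanes on one PCIe Gen5 switch
LANES_PER_LINK = 16     # each connection is counted as an x16 link

# Number of x16 links per switch, as listed in the text.
links_per_switch = {"CPU": 2, "CX7 NIC": 2, "GPGPU": 4}

lanes_used = {dev: n * LANES_PER_LINK for dev, n in links_per_switch.items()}
total_used = sum(lanes_used.values())
spare_lanes = SWITCH_LANES - total_used

# Retimers in the standard layout: 8 on the UBB plus 8 on the head node.
retimers_per_server = 8 + 8

for dev, lanes in lanes_used.items():
    print(f"{dev:8s} {lanes:3d} lanes")
print(f"used {total_used} of {SWITCH_LANES} lanes, {spare_lanes} spare")
print(f"retimers per DGX server: {retimers_per_server}")
```

Changing `links_per_switch` is enough to explore a customer-specific layout that puts the spare lanes to use.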
We know that Astera Labs first launched a 64-lane PCIe Gen6 switch this year, which connects the CPU/GPU/NIC/NVMe in the compute tray. According to supply chain research, the author learned that one GB200 card uses two Astera Labs 64-lane PCIe switches. Each switch connects to one CPU, occupying 17 lanes; one NIC, occupying 16 lanes; one GPGPU, occupying 16 lanes; and two SSDs (i.e., NVMe), occupying 2 x 4 = 8 lanes, for a total of 57 lanes. The remaining 64 - 57 = 7 lanes are idle for now, and different customers can configure them according to their needs (see the figure below):
A GB200 compute tray has two GB200 cards, so 2 x 2 = 4 64-lane PCIe switches are required. In addition, the standard GB200 compute tray needs no PCIe retimers, because the CPU and GPU sit very close together and are connected via NVLink C2C. However, if hyperscaler customers use self-developed FPGA-based NICs and place the NIC and NVMe on an expansion board off the motherboard, 4 PCIe retimers (one per NIC) are still required.
Based on the PCIe topologies of DGX and GB200 above, we can calculate Astera Labs' content value in GB200 and then look at which GB200 customers it has won so far.
As mentioned above, Astera Labs' content in an Nvidia DGX server is roughly $30~35 ASP x 16 PCIe Gen5 retimers = $480~560 per server, or $60~70 per GPU. Looking only at retimers, Astera Labs' dollar content in GB200 does drop significantly: $45~50 ASP x 4 PCIe Gen6 retimers = $180~200 per compute tray, or $45~50 per GPU.
It should be noted that although GB200 uses far fewer PCIe retimers, PCIe Gen6 retimers are a significant technical step up from PCIe Gen5 retimers, so the ASP rises by ~50%.
But this counts only the PCIe retimer content in DGX vs. GB200. Once the company's Scorpio PCIe Gen6 switch is added, Astera Labs' content in GB200 actually increases significantly. PCIe Gen6 switch chips are still in the sampling stage, so the exact volume price is unknown, but the author roughly estimates that a 64-lane PCIe Gen6 switch should cost between $200 and $250. A GB200 compute tray requires 4 of them. Adding the 4 PCIe Gen6 retimers, Astera Labs' dollar content in NVIDIA GB200 is roughly $1000~1200 per compute tray, or $250~300 per GPU.
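The per-switch lane budget and per-tray part counts for the customized GB200 design can be laid out as follows. This is a hedged sketch built only from the supply-chain figures quoted above; the `Port` class and all variable names are hypothetical, and the 17-lane CPU figure is taken as given from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    device: str
    count: int
    lanes_each: int

    @property
    def lanes(self) -> int:
        return self.count * self.lanes_each


SWITCH_LANES = 64  # one 64-lane PCIe Gen6 switch

# Connections on one switch, as quoted in the text.
ports = [
    Port("CPU", 1, 17),
    Port("NIC", 1, 16),
    Port("GPGPU", 1, 16),
    Port("NVMe SSD", 2, 4),
]

lanes_used = sum(p.lanes for p in ports)
idle_lanes = SWITCH_LANES - lanes_used

CARDS_PER_TRAY = 2       # GB200 cards in one compute tray
SWITCHES_PER_CARD = 2
switches_per_tray = CARDS_PER_TRAY * SWITCHES_PER_CARD
retimers_per_tray = 4    # only in the off-board NIC/NVMe variant, one per NIC

print(f"lanes used per switch: {lanes_used}, idle: {idle_lanes}")
print(f"switches per tray: {switches_per_tray}, retimers per tray: {retimers_per_tray}")
```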
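The retimer-only comparison can be reproduced directly from the quoted ASPs and unit counts. The sketch below is illustrative: `retimer_content` is a made-up helper, and the figure of 4 GPUs per GB200 compute tray is an assumption inferred from the article's own per-GPU numbers ($180~200 per tray against $45~50 per GPU).

```python
def retimer_content(asp_low, asp_high, units, gpus):
    """Return ((low, high) per system, (low, high) per GPU) in USD."""
    low, high = asp_low * units, asp_high * units
    return (low, high), (low / gpus, high / gpus)


# DGX: 16 PCIe Gen5 retimers at $30~35, 8 GPUs per server.
dgx_total, dgx_per_gpu = retimer_content(30, 35, units=16, gpus=8)

# GB200: 4 PCIe Gen6 retimers at $45~50; 4 GPUs per tray is an inferred assumption.
gb200_total, gb200_per_gpu = retimer_content(45, 50, units=4, gpus=4)

# ASP uplift from Gen5 to Gen6, at the low and high ends of the quoted ranges.
uplift_low_end = 45 / 30 - 1
uplift_high_end = 50 / 35 - 1

print("DGX   retimers:", dgx_total, "per GPU:", dgx_per_gpu)
print("GB200 retimers:", gb200_total, "per GPU:", gb200_per_gpu)
print(f"ASP uplift: {uplift_high_end:.0%} to {uplift_low_end:.0%}")
```

The uplift works out to somewhat under 50% at the high end of the price range and exactly 50% at the low end, consistent with the "~50%" in the text.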
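Adding the switches to the retimers gives the per-tray estimate. This is a rough sketch under the article's assumptions: the $200~250 switch price is the author's own estimate for a part that is still sampling, `TRAY_BOM` is a hypothetical name, and 4 GPUs per compute tray is inferred from the per-GPU figures in the text.

```python
# part: (units per tray, ASP low, ASP high) in USD
TRAY_BOM = {
    "PCIe Gen6 switch (64-lane)": (4, 200, 250),  # author's rough price estimate
    "PCIe Gen6 retimer": (4, 45, 50),
}
GPUS_PER_TRAY = 4  # assumption inferred from the article's per-GPU figures

tray_low = sum(units * low for units, low, _ in TRAY_BOM.values())
tray_high = sum(units * high for units, _, high in TRAY_BOM.values())
per_gpu_low = tray_low / GPUS_PER_TRAY
per_gpu_high = tray_high / GPUS_PER_TRAY

print(f"content per tray: ${tray_low}~{tray_high}")
print(f"content per GPU:  ${per_gpu_low:.0f}~{per_gpu_high:.0f}")
```

The low end comes out at $980 per tray and $245 per GPU, which the article rounds to $1000 and $250.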
Scorpio Fabric switches add fuel to the fire
At the beginning of the fourth quarter of this year, Astera Labs launched a new Scorpio Smart Fabric switch product portfolio designed for cloud-level AI infrastructure, which is also the company's fourth product line. Many people are optimistic about this new product, claiming that it will help Astera Labs significantly increase its value.
Astera Labs also pointed out in its third-quarter financial report: "Our Scorpio intelligent fabric switch series goes beyond our current market footprint of PCI Express and Ethernet Retimer-class products and controller-class devices for CXL memory, providing meaningfully higher functionality and value to our AI and cloud infrastructure customers. We estimate that by 2028, Scorpio will expand the total market opportunity for our four product lines to more than $12 billion."
The Scorpio Smart Fabric switch family comprises two application-specific product lines: the P-Series for GPU-to-CPU/NIC/SSD PCIe Gen 6 connectivity and the X-Series for platform-specific back-end GPU clusters.
The Scorpio P-Series fabric switches are the industry's first PCIe 6-capable switches, architected for hybrid traffic head-node connectivity and data ingestion across a diverse ecosystem of PCIe hosts and endpoints.
Scorpio X-Series fabric switches are designed to deliver the highest back-end GPU-to-GPU bandwidth and support platform-specific customization through their software-defined architecture. Innovations in protocol enhancements, bandwidth and latency tuning, and expanded telemetry capabilities provide optimizations to reliably scale homogeneous GPU or accelerator fabrics, deliver the best direct user experience for real-time insights, and maximize uptime to improve ROI for large-scale AI training and inference builds.
The company's Scorpio PCIe Gen6 switch has now been designed into AWS's and Google's customized GB200 racks (i.e., the compute tray uses self-developed NICs instead of NVIDIA's CX network cards). How many orders has it received? How much will its wafer starts at TSMC increase next year? I asked a friend at TSMC for some reliable production schedule figures. I can only say that, valued according to the deduction above, the stock is indeed still expensive... But the head of an interconnect chip startup in an industry chat group praised the company's strong products, said that many more product lines are coming, and is very optimistic about its long-term prospects. In short, opinions are diverging more and more.
*Disclaimer: This article is originally written by the author. The content of the article is the author's personal opinion. Semiconductor Industry Observer reprints it only to convey a different point of view. It does not mean that Semiconductor Industry Observer agrees or supports this point of view. If you have any objections, please contact Semiconductor Industry Observer.