AMD launches another attack on Intel server chips

Latest update time:2021-03-16

Source: Compiled from "thenextplatform" by Semiconductor Industry Observer (ID: icbank), with thanks.


With each passing year, as AMD first talked about its plans to re-enter the server processor space and give Intel some real, much-needed, and very direct competition, and then delivered time and again on its processor roadmap, AMD has gradually proven that it is serious about taking on the Intel-dominated X86 computing space.

It gets easier with the launch of the third-generation "Milan" Epyc 7003 processors, although customers would have preferred that AMD had delivered this processor years ago.

But make no mistake. Easier does not mean easy: Intel's latest quarterly financial results for its data center division show more clearly than ever that the Epyc comeback is not as easy as the Opteron offensive of a decade and a half ago.

Enthusiasm for AMD's X86 server processors is influenced by many factors, not the least of which is that Intel is much stronger in computing, networking and storage in 2021 than it was when AMD launched the Opterons in the early 2000s.

Intel has fumbled its roadmap and manufacturing over the past few years, but that is nowhere near as bad as its decision to make Itanium, a chip that was not really compatible with Xeon.

So it was no surprise that AMD was able, back then, to capture 20% or more market share in certain segments of the X86 server space fairly quickly.

With three generations of Epyc now in the field, expectations are high for the fourth-generation “Genoa” Epyc 7004 series due for release in 2022. This time, however, AMD’s market share growth has been slower.

The new era sees about 50% more server shipments per quarter than the mid-2000s, and some of the buyers, such as hyperscalers and cloud builders, are absolutely huge. We believe that the Epyc server chip business is on a better and more sustainable growth path this time, which will bring Intel a lot of pain in the coming years. That is as it should be, because every IT customer deserves the benefits of fierce and direct competition, which Intel has not really faced in the server processor space for more than a decade. The gross profits of its data center group during that period of hegemony show just how much the lack of competition was worth to Intel.

Indirect competition from IBM's Power processors and a shifting cast of Arm vendors has not been enough to chip away at Intel's armor. AMD's reemergence with the Epyc processors makes things much tougher for Intel.

AMD is climbing fast under the guidance of president and CEO Lisa Su, and has been able to put some dents in Intel's armor while Intel stumbled on the castle steps because of its 10nm manufacturing missteps.

While the upcoming "Ice Lake" Xeon SP processors will help Intel fend off the Milan Epyc 7003 attack, which really began when AMD and Intel started shipping their chips to hyperscalers and cloud builders in the fourth quarter, the fact remains that Ice Lake should have been going up against the second-generation "Rome" Epyc 7002s, and it is not. Intel will be better off with Ice Lake and its follow-on, "Sapphire Rapids," which is based on an improved 10nm manufacturing process and is due later this year or early next year. But Intel would have been better off still if its fabs had delivered 10nm manufacturing on time, or even a little late, rather than severely delayed as it has been.

So be it. This is the chip business, and this is the way the chips fall sometimes. Everyone, and we mean everyone, is going to have issues with chip foundries that are plagued by manufacturing capacity constraints and by delays to future process leaps. Everyone is going to spend time in the penalty box sooner or later, especially as Moore’s Law goes from the slowdown of the past few years to something closer to a crawl. From what we’ve heard, 10nm and 7nm were tough for everyone, 5nm will be even tougher, and we don’t hold out much hope for anything easy in the 3nm cycle. Chiplets everywhere! And AMD already knows how to do that better than Intel.

Against this backdrop, we’ll take a look at AMD’s new Milan line of products and, as always, take a deep dive into the new processors in the Epyc 7003 series, including an overview of the new Milan chips and how they compare to previous-generation Opteron and Epyc processors, a deep dive into the architecture, the competitive position of these CPUs in the server space, and the competitive response from Intel and other vendors providing server CPUs and the OEMs and ODMs that consume them.

There is a feedback loop between the design of PCs and servers, which RISC/Unix server vendors could previously use to amortize design costs over a wider base, thereby extracting more profit from customers. But currently, only X86 server makers Intel and AMD and GPU makers Nvidia and AMD are still able to do this for their compute engines. One day, there may be an Arm vendor that does both client and server, and it may be Nvidia or it may be Apple. Intel also hopes to provide GPUs for both clients and servers. AMD's Ryzen chips for clients and Epyc chips for servers all have the same architecture, with the Milan server chip being based on the Zen3 core, a technology that has been used in PC CPUs for many months.

In the case of the Milan chips, the memory and I/O hub chip at the heart of the architecture remains essentially the same, save for some tweaks to support nested paging of main memory and to run the Infinity Fabric interconnect, which links the Zen 3 cores to the memory and I/O hub chip (and therefore to each other), at the same 1.6 GHz clock speed as the main memory clock (which is double-pumped, so main memory runs at an effective 3.2 GHz). In the past, the two clocks were not synchronized, and this synchronization is one factor in the improved performance of Milan over Rome. On applications that are sensitive to memory bandwidth and latency, the clock synchronization provides a 3% to 5% improvement over Rome processors, which did not run the two clocks at the same speed.

Here are the general feeds and speeds for the three generations of Epyc processors:


As you can see, the core and thread counts haven't changed much between the Rome and Milan generations, and both chips use the 7-nanometer process from Taiwan Semiconductor Manufacturing Co. AMD still offers simultaneous multithreading (SMT) support for two virtual threads per physical core, rather than pushing it to four or eight threads per core like IBM did with its Power8 and Power9 chips.

The memory and I/O systems are essentially the same, with eight memory controllers per Epyc socket and 128 lanes of PCI-Express 4.0 I/O per socket. The thermal design points of the processors are identical.
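
To put rough numbers on those feeds and speeds, here is a minimal Python sketch of the peak theoretical bandwidth per socket. The 1.6 GHz double-pumped memory clock, the eight controllers and the 128 lanes come from the article. The 64-bit (8-byte) data path per controller is our assumption about a standard DDR4 channel, and the 16 GT/s signalling rate with 128b/130b encoding is the published PCI-Express 4.0 line rate. These are ceilings, not measured throughput.

```python
# Peak theoretical bandwidth per Epyc socket (a back-of-envelope sketch).

MEMORY_CLOCK_GHZ = 1.6       # article: main memory clock
TRANSFERS_PER_CLOCK = 2      # article: the clock is double-pumped
MEMORY_CONTROLLERS = 8       # article: eight controllers per socket
BYTES_PER_TRANSFER = 8       # assumption: one 64-bit DDR4 channel per controller

PCIE_LANES = 128             # article: lanes of PCI-Express 4.0 per socket
PCIE4_GT_PER_S = 16.0        # PCI-Express 4.0 raw rate per lane
PCIE4_ENCODING = 128 / 130   # 128b/130b line encoding overhead


def memory_bandwidth_gb_s() -> float:
    """Peak DRAM bandwidth per socket in GB/s."""
    giga_transfers = MEMORY_CLOCK_GHZ * TRANSFERS_PER_CLOCK
    return giga_transfers * BYTES_PER_TRANSFER * MEMORY_CONTROLLERS


def pcie_bandwidth_gb_s() -> float:
    """Peak PCI-Express bandwidth per socket, per direction, in GB/s."""
    per_lane = PCIE4_GT_PER_S * PCIE4_ENCODING / 8  # gigabits to gigabytes
    return per_lane * PCIE_LANES


if __name__ == "__main__":
    print(f"Memory: {memory_bandwidth_gb_s():.1f} GB/s")
    print(f"PCIe 4.0: {pcie_bandwidth_gb_s():.1f} GB/s per direction")
```

Under these assumptions the memory side works out to about 204.8 GB/s and the I/O side to about 252 GB/s in each direction.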

There's good reason for that: The Milan chips had to maintain socket compatibility with the Rome chips, or else motherboard and system makers would make it extremely painful for AMD. It had to be a performance boost within all of those constraints, and that's exactly what AMD delivered with Milan, averaging 19 percent more instructions per clock (IPC) across a representative set of workloads compared to Rome.

A 19% IPC improvement is far better than the 5% to 10% IPC improvement per generation that Intel has shown, and frankly, it is probably much better than many people expected from AMD.

You can't do everything at once, or nothing ever gets done. In fact, Milan had to wait until the Ryzen PC chip market needed a fatter core complex before AMD could flatten the NUMA domains of the core complexes, all of which plug into the memory and I/O hub chip to create what looks (more or less) like a monolithic socket to the operating system and its applications.


Specifically, the Rome core complex has four Zen2 cores, each with its own L2 cache, hanging off a shared 16 MB L3 cache. Two of these complexes are etched onto a chiplet, which is essentially a small Ryzen PC chip, and eight of these chiplets are interconnected with the Infinity Fabric inside the socket to create the 64-core Rome chip. Incidentally, both Rome and Milan use Infinity Fabric Gen 2.0 (x-GMI-2 in the image above) to link the core complexes to the memory and I/O chip in the center of the package.

In the Milan design, the core hierarchy is unified: each of the eight Zen3 cores in a core complex has a dedicated L2 cache, all eight share a 32 MB L3 cache, and the complex is implemented as a single chiplet. Eight of these chiplets provide the same maximum of 64 cores, but the number of NUMA domains presented by the socket is cut in half, so the operating system and virtual machines see bigger blocks of compute and cache. In fact, a single core can be given the full 32 MB of L3 cache, and in some SKUs of the Milan product line, especially those aimed at very high performance, this is exactly the case.
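
The difference between the two layouts is easy to tabulate. The following Python sketch uses only the per-complex figures quoted above; the `SocketLayout` class and its field names are ours, for illustration, and "L3 domains" is used as a simple stand-in for the cache-sharing groups the article calls NUMA domains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocketLayout:
    """Hypothetical helper describing one fully populated Epyc socket."""
    name: str
    chiplets: int          # core chiplets per socket
    ccx_per_chiplet: int   # core complexes etched on each chiplet
    cores_per_ccx: int     # cores sharing one L3 cache
    l3_mb_per_ccx: int     # size of that shared L3 cache in MB

    @property
    def cores(self) -> int:
        return self.chiplets * self.ccx_per_chiplet * self.cores_per_ccx

    @property
    def l3_domains(self) -> int:
        # One domain per core complex, since each complex has its own L3.
        return self.chiplets * self.ccx_per_chiplet

    @property
    def total_l3_mb(self) -> int:
        return self.l3_domains * self.l3_mb_per_ccx


ROME = SocketLayout("Rome", chiplets=8, ccx_per_chiplet=2,
                    cores_per_ccx=4, l3_mb_per_ccx=16)
MILAN = SocketLayout("Milan", chiplets=8, ccx_per_chiplet=1,
                     cores_per_ccx=8, l3_mb_per_ccx=32)

if __name__ == "__main__":
    for layout in (ROME, MILAN):
        print(f"{layout.name}: {layout.cores} cores, "
              f"{layout.l3_domains} L3 domains, "
              f"{layout.total_l3_mb} MB L3 in total")
```

Both sockets end up with 64 cores and 256 MB of L3 cache in total, but Milan presents 8 cache domains of 32 MB each where Rome presents 16 domains of 16 MB each.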

So, for example, in the Epyc 75F3, only four of the eight cores on each chiplet are turned on, for a total of 32 cores, with each group of four cores sharing a full 32 MB of L3 cache and all eight DDR4 memory controllers activated for a maximum capacity of 4 TB per socket using 256 GB memory sticks. On the eight-core Epyc 72F3 chip (which is the extreme end of the Milan line), only one of the eight cores on each chiplet is activated, and it runs at 3.7 GHz, close to its 4 GHz turbo speed. Each core has 32 MB of L3 cache to itself, which is a lot, and that can contribute significantly more to performance in some applications than you might expect from the combination of core count, clock speed, and IPC boost compared to its Rome predecessor.
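
The cache-per-core and memory capacity figures follow from simple arithmetic, sketched below in Python. The eight chiplets, the 32 MB L3 per chiplet, the eight memory controllers and the 256 GB sticks are from the article. The two DIMMs per controller is our assumption, chosen because it is what makes eight controllers reach the stated 4 TB; the function names are illustrative.

```python
CHIPLETS = 8                 # core chiplets per Milan socket
L3_MB_PER_CHIPLET = 32       # shared L3 cache on each chiplet
MEMORY_CONTROLLERS = 8       # DDR4 controllers per socket
DIMMS_PER_CONTROLLER = 2     # assumption: two memory sticks per controller
DIMM_GB = 256                # article: 256 GB memory sticks


def total_cores(active_cores_per_chiplet: int) -> int:
    """Core count of a SKU with this many cores enabled on each chiplet."""
    return CHIPLETS * active_cores_per_chiplet


def l3_mb_per_core(active_cores_per_chiplet: int) -> float:
    """L3 cache available per active core, in MB."""
    return L3_MB_PER_CHIPLET / active_cores_per_chiplet


def max_memory_tb() -> float:
    """Maximum memory capacity per socket, in TB."""
    return MEMORY_CONTROLLERS * DIMMS_PER_CONTROLLER * DIMM_GB / 1024


if __name__ == "__main__":
    for sku, active in (("Epyc 75F3", 4), ("Epyc 72F3", 1)):
        print(f"{sku}: {total_cores(active)} cores, "
              f"{l3_mb_per_core(active):.0f} MB L3 per core")
    print(f"Max memory: {max_memory_tb():.0f} TB per socket")
```

With four cores active per chiplet the 75F3 gets 8 MB of L3 per core, while the 72F3, with one core per chiplet, gets the whole 32 MB per core. A full 64-core part, by the same arithmetic, has 4 MB per core.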

AMD offers a total of 19 Milan Epyc 7003 processors, which are divided into three major categories as follows:


As in the past, the F models are optimized for the highest core clock speeds on a relatively small number of cores, something that is only possible with fewer cores and that necessarily results in a higher ratio of L3 cache to cores. There are four of these models, with 8, 16, 24, and 32 cores. Another set of five Milan chips has very high core density, and therefore high thread counts, and they are aimed at server virtualization and database workloads, both of which like many cores and threads for increased throughput. Then there are ten Milan processors that are "balanced and optimized" to strike a balance between relatively high performance and low total cost of ownership. As with the Naples and Rome processors, some of the Epyc chips are marked with a P, designating single-processor parts.

As with the first two generations of Epyc chips, the third generation does not support NUMA machines with more than two sockets. AMD is ceding that market, with its four-socket and eight-socket machines, to Intel and IBM.

As we said, we'll be diving into the details of the Milan processors in subsequent stories. For now, we just wanted to give you the data on the new chips and how they compare to each other and to the previous generations of Opteron and Epyc processors. So without further ado, here are the Milan SKUs:


The high-performance F models are in bold italics and the P single-processor chips are highlighted in grey, as is our custom with Epyc tables. We calculated a raw performance metric from core count and clock speed within the Milan line, and then created a relative performance metric that also takes into account the improvement in IPC over time. That relative metric is normalized to a quad-core "Shanghai" Opteron 2387 running at 2.8 GHz, which has a relative performance of 1.0 and a cost of $873 per unit of relative performance. List price is the unit price for customers purchasing processors in 1,000-unit quantities, which is standard for Intel and AMD list prices.

Here are the feeds and speeds for the Naples and Rome Epyc chips, as well as the Shanghai Opteron 2300:
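
As a rough illustration of how such a metric can be built, here is a minimal Python sketch. It is our reading of the description above, not the exact formula behind the tables: raw performance is taken as cores times clock speed, and the cumulative IPC gain since Shanghai is passed in as a parameter (`ipc_factor`) because the article does not state it. Only the baseline chip's figures are from the article.

```python
# Baseline from the article: quad-core Shanghai Opteron 2387 at 2.8 GHz.
BASE_CORES = 4
BASE_GHZ = 2.8
BASE_PRICE_USD = 873.0


def raw_performance(cores: int, ghz: float) -> float:
    """Raw throughput proxy: core count times clock speed."""
    return cores * ghz


def relative_performance(cores: int, ghz: float, ipc_factor: float) -> float:
    """Performance relative to the Shanghai baseline (baseline = 1.0).

    ipc_factor is the cumulative IPC gain over the baseline core; it is
    an input here because the article does not give the per-generation values.
    """
    baseline = raw_performance(BASE_CORES, BASE_GHZ)
    return raw_performance(cores, ghz) * ipc_factor / baseline


def cost_per_unit(price_usd: float, rel_perf: float) -> float:
    """List price divided by relative performance, in dollars."""
    return price_usd / rel_perf


if __name__ == "__main__":
    base = relative_performance(BASE_CORES, BASE_GHZ, ipc_factor=1.0)
    print(f"Baseline relative performance: {base:.1f}")
    print(f"Baseline cost per unit: ${cost_per_unit(BASE_PRICE_USD, base):.0f}")
```

By construction the baseline chip scores 1.0 and costs $873 per unit of relative performance, matching the figures quoted for the Opteron 2387.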


The relative performance of the Milan chips ranges from less than 6 for the eight-core Epyc 72F3 to 31.6 for the Epyc 7763, and the cost per unit of relative performance ranges from a low of $94 to a high of $414. The 16-core Epyc 7313P and 24-core Epyc 7443P offer the best price/performance. Interestingly, even the low-core, high-clock, high-L3-cache eight-core Epyc 72F3, at $414 per unit, costs just under half as much per unit of performance as the Shanghai Opteron baseline from early 2009, while delivering more performance. That may seem crazy, but it just goes to show you that Dennard scaling really stopped a long time ago.

It’s hard to generalize across a product line where the SKUs don’t match up precisely between generations, but it looks like AMD is offering higher performance and more value for money overall in the jump from Rome to Milan, although certainly not in all cases. Take the 48-core Milan Epyc 7643 running at 2.3 GHz and match it against the 48-core Rome Epyc 7642 running at 2.3 GHz. That’s a 19% performance increase based on IPC improvements alone, but AMD has also increased the price from $4,775 for the Rome chip to $4,995 for the Milan chip, which is an apparent 10% improvement in price/performance.
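
The comparison can be reproduced from the quoted figures alone. This Python sketch assumes the only performance difference between the two 48-core parts is the 19% IPC uplift (core count and clock speed are identical) and uses the two list prices given above; the function names are ours.

```python
ROME_PRICE_USD = 4775.0    # Epyc 7642 list price
MILAN_PRICE_USD = 4995.0   # Epyc 7643 list price
IPC_UPLIFT = 1.19          # Milan performance relative to Rome at equal cores and clock


def price_increase() -> float:
    """Fractional list price increase from Rome to Milan."""
    return MILAN_PRICE_USD / ROME_PRICE_USD - 1.0


def cost_per_performance_change() -> float:
    """Fractional change in dollars per unit of performance (negative is better)."""
    rome = ROME_PRICE_USD / 1.0
    milan = MILAN_PRICE_USD / IPC_UPLIFT
    return milan / rome - 1.0


def performance_per_dollar_gain() -> float:
    """Fractional gain in performance per dollar."""
    return IPC_UPLIFT / (MILAN_PRICE_USD / ROME_PRICE_USD) - 1.0


if __name__ == "__main__":
    print(f"Price increase: {price_increase():.1%}")
    print(f"Cost per unit of performance: {cost_per_performance_change():.1%}")
    print(f"Performance per dollar: {performance_per_dollar_gain():+.1%}")
```

On these inputs the price rises by about 4.6%, the cost per unit of performance falls by about 12%, and performance per dollar rises by about 14%. That is in the same general range as the roughly 10% cited in the text; the relative performance values in the tables may give a slightly different result.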

It comes down to case studies, which is why we built the table above. You can compare to your heart's content.



*Disclaimer: This article is originally written by the author. The content of the article is the author's personal opinion. Semiconductor Industry Observer reprints it only to convey a different point of view. It does not mean that Semiconductor Industry Observer agrees or supports this point of view. If you have any objections, please contact Semiconductor Industry Observer.

