Samsung's abandoned self-developed CPU core: How does the M5 perform?
Latest update time: 2021-08-31 22:48
Source: Compiled by Semiconductor Industry Observer (ID: icbank) from WikiChip; author: David Schor.
Earlier this year, Samsung released the Exynos 990. The chip features a faster NPU, the latest Mali-G77 MP11 GPU, and LPDDR5 support. On the compute side, it has an eight-core configuration: four Cortex-A55 cores, two Cortex-A76 cores, and two of the company's latest custom CPU core design, the M5.
Earlier, Samsung announced that it would shut down its CPU R&D center in Austin. This is widely seen as a strong signal that Samsung will stop developing its own CPU cores and adopt Arm's stock core designs instead. For this reason, the capabilities of the M5 have drawn considerable attention. Yesterday, Samsung submitted an LLVM patch with a new compiler scheduler model, which reveals some details of the new core.
The M5 is reportedly Samsung's fifth-generation custom core developed by Samsung Austin Research and Development Center (SARC). Considering the recent wave of layoffs and internal restructuring, this should also be their last custom core.
Samsung says the M5 cores offer "up to 20% enhanced performance," so we can expect the average to be lower. While the LLVM scheduler model is too high-level to tell what smaller modifications have taken place, we can still see some larger changes. From the LLVM patches, it's hard to see whether most of the M5's performance gains come from IPC improvements, significantly improved prefetchers, branch predictors, or other similar hidden components.
In terms of instruction set, the M5 has the same Armv8.2-A as the M4. At a high level, the M5 is also very similar to the M4: the pipeline remains 6-wide decode, and the back end retains the same 228-entry deep reorder buffer. Samsung did increase the instruction queue slightly from 48 entries to 60. The bigger change is the misprediction penalty, which has been improved by 1 cycle, down to 15 cycles.
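To make the comparison concrete, here is a minimal Python sketch that records the figures quoted in the paragraph above and lists which ones changed. The M4 misprediction penalty of 16 cycles is inferred from "improved by 1 cycle, down to 15 cycles". The class and function names are illustrative only and do not come from the LLVM patch, and the branch misprediction rate used in the first-order cost estimate is a purely hypothetical value, not a measurement.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CoreParams:
    """High-level pipeline figures as quoted in the article (illustrative names)."""
    decode_width: int
    rob_entries: int
    instruction_queue_entries: int
    mispredict_penalty_cycles: int


# Values taken from the text; the M4 penalty (16) is inferred from
# "improved by 1 cycle, down to 15 cycles".
M4 = CoreParams(decode_width=6, rob_entries=228,
                instruction_queue_entries=48, mispredict_penalty_cycles=16)
M5 = CoreParams(decode_width=6, rob_entries=228,
                instruction_queue_entries=60, mispredict_penalty_cycles=15)


def changed_params(old: CoreParams, new: CoreParams) -> dict:
    """Return {field_name: (old_value, new_value)} for every field that differs."""
    return {
        f.name: (getattr(old, f.name), getattr(new, f.name))
        for f in fields(old)
        if getattr(old, f.name) != getattr(new, f.name)
    }


def mispredict_cpi(mpki: float, penalty_cycles: int) -> float:
    """First-order estimate of cycles per instruction lost to mispredictions.

    mpki is mispredictions per 1000 instructions; each one is assumed to
    cost the full penalty, which ignores any overlap with other stalls.
    """
    return mpki / 1000.0 * penalty_cycles


if __name__ == "__main__":
    print(changed_params(M4, M5))
    ASSUMED_MPKI = 5.0  # hypothetical workload value, not from the article
    for name, core in (("M4", M4), ("M5", M5)):
        cost = mispredict_cpi(ASSUMED_MPKI, core.mispredict_penalty_cycles)
        print(f"{name}: {cost:.3f} CPI lost to mispredictions")
```

Under this simple model, the benefit of the shorter penalty scales with how often a workload mispredicts, so branch-heavy code would gain the most from the one-cycle reduction.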
On the back end, Samsung added two new simple 32-bit integer ALU pipelines. This brings the total number of integer pipelines (including branches) to seven. The addition of two 32-bit ALU pipelines is interesting because it does not improve the throughput of typical simple ALU workloads.
In the floating-point cluster, Samsung has once again rebalanced the execution pipelines. The most notable change is the addition of a NEON multiply unit on each of the three FP pipelines. These dedicated NEON multiply units on all three floating-point pipelines also help explain the addition of the 32-bit integer ALUs.
In the pipeline diagram from the original article, Nxxx units are NEON (Advanced SIMD) units, HAD = horizontal vector arithmetic, MSC = miscellaneous, SHT = shift, SHF = shuffle, and CRY = cryptography.