Intel's new transistor technology delivers gains comparable to a full node transition! Is a new golden decade for computing architecture about to begin?
Author | Bao Yonggang
Leifeng.com reports that at Intel Architecture Day 2020, Intel Chief Architect Raja Koduri and several Intel Fellows and architects detailed the company's latest progress across its six technology pillars. They demonstrated Intel's new 10nm SuperFin technology and detailed the fully scalable Xe graphics architecture for the first time, and also covered the Willow Cove microarchitecture and the Tiger Lake SoC architecture for mobile clients, advanced packaging technology, hybrid architecture, and more.
Does the progress Intel showed on Architecture Day in process and packaging, XPU architecture, memory and storage, interconnect, security, and software indicate that Intel has entered a golden decade of computing architecture innovation?
1
New 10nm SuperFin technology is comparable to a full node transition
The gap between the rapidly growing demand for computing power in the era of intelligence and the pace of advanced process improvement keeps widening, and transistor scaling is running into more and more physical limits. The industry is tackling these challenges through technological innovation. Raja Koduri said that, after years of refining FinFET transistor technology, Intel is redefining it to achieve the largest single-node performance enhancement in the company's history, delivering performance improvements comparable to a full node transition.
Intel's new transistor technology, 10nm SuperFin technology, combines Intel's enhanced FinFET transistors with Super MIM (Metal-Insulator-Metal) capacitors.
SuperFin technology provides enhanced epitaxial source/drain, improved gate process and additional gate pitch, and enables higher performance through:
-
Enhanced epitaxial growth of the crystal structure on the source and drain increases strain and reduces resistance, allowing more current to flow through the channel.
-
An improved gate process achieves higher channel mobility, allowing charge carriers to move faster.
-
An additional gate pitch option provides higher drive current for chip functions that require the highest performance.
-
A new thin barrier reduces via resistance by 30%, improving interconnect performance.
-
The Super MIM capacitor delivers 5x the capacitance of the industry standard within the same footprint, which reduces voltage droop and significantly improves product performance. The technology is enabled by a new class of "Hi-K" dielectric materials that can be stacked in ultra-thin layers only a few angstroms thick to form a repeating "superlattice" structure.
“This is industry-leading technology that’s ahead of the current capabilities of other chipmakers,” Raja said.
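To see why more capacitance in the same footprint reduces voltage droop, here is a rough back-of-the-envelope sketch. It assumes an idealized lumped model in which the capacitor alone supplies a short current step, so the droop is dV = I * dt / C. The function name, the 100 nF baseline, and the 1 A / 1 ns load step are hypothetical values chosen for illustration, not Intel figures; only the 5x ratio comes from the article.

```python
def voltage_droop(current_step_a: float, duration_s: float, capacitance_f: float) -> float:
    """Droop of an ideal capacitor that alone supplies a current step: dV = I * dt / C."""
    return current_step_a * duration_s / capacitance_f


# Hypothetical baseline decoupling capacitance (not an Intel figure).
baseline_capacitance_f = 100e-9
# The article's claim: 5x the capacitance in the same footprint.
super_mim_capacitance_f = 5 * baseline_capacitance_f

# Hypothetical load step: 1 A drawn for 1 ns.
droop_baseline_v = voltage_droop(1.0, 1e-9, baseline_capacitance_f)
droop_super_mim_v = voltage_droop(1.0, 1e-9, super_mim_capacitance_f)

# In this simple model the droop scales as 1/C, so 5x capacitance gives one fifth the droop.
droop_ratio = droop_super_mim_v / droop_baseline_v
print(f"droop ratio (Super MIM / baseline): {droop_ratio:.2f}")
```

A real power delivery network also involves inductance, resistance, and the voltage regulator's response, so this only illustrates the direction and rough scale of the effect.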
It is reported that the 10nm SuperFin technology will be used in Intel's next-generation mobile processor code-named "Tiger Lake". Products equipped with Tiger Lake mobile processors will be available in the holiday season this year.
While transistor technology advances, architectural innovation may matter even more. Last year, John L. Hennessy and David A. Patterson, winners of the 2017 Turing Award, published a long article titled "A New Golden Age for Computer Architecture", which detailed the changes ushering in a new era in computer architecture and predicted that the next decade will be a "new golden decade" for the field.
Intel showcased the latest advances in multiple architectures from CPU to GPU at its 2020 Architecture Day.
2
Tiger Lake, the latest Willow Cove CPU, and the Xe GPU microarchitecture
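Tiger Lake is Intel's first SoC architecture to adopt the new Xe-LP graphics microarchitecture. It is built on the new Willow Cove CPU core, which brings a significant frequency increase. The GPU uses the new Xe graphics architecture, with significantly higher performance per watt. Combined with improvements in power management, fabric and memory, I/O, and display, Tiger Lake outperforms the previous-generation CPU and delivers a large leap in AI and graphics performance.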
Tiger Lake SoC architecture improvements are as follows:
-
New Willow Cove CPU core – Significantly higher frequency, built on the advances of 10nm SuperFin technology.
-
New Xe graphics architecture – Up to 96 execution units (EUs), with significantly higher performance per watt.
-
Power Management – Autonomous dynamic voltage frequency scaling (DVFS) in the coherent fabric, and improved fully integrated voltage regulator (FIVR) efficiency.
-
Fabric and Memory – 2x increase in coherent fabric bandwidth and approximately 86GB/s of memory bandwidth; validated LP4x-4267 and DDR4-3200 support, with architectural capability for LP5-5400.
-
Gaussian Network Accelerator (GNA) 2.0 – Dedicated IP for low-power neural inference computing that offloads the CPU. When running an audio noise suppression workload, CPU utilization is about 20% lower with inference on GNA than with inference on the CPU.
-
IO – Integrated TB4/USB4, integrated PCIe Gen 4 on the CPU for low latency, high bandwidth device access to memory.
-
Display – Up to 64GB/s of isochronous transfer bandwidth to support multiple high-resolution displays. Dedicated fabric paths to memory to maintain quality of service.
-
IPU6 – Up to 6 sensors with 4K 30fps video and 27MP images; architectural capability for up to 4K 90fps video and 42MP images.
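The roughly 86GB/s memory bandwidth in the Fabric and Memory item is consistent with a simple peak-bandwidth calculation for LP5-5400, as the sketch below shows. It assumes a 128-bit total memory bus, which the article does not state; the function name is likewise just illustrative.

```python
def peak_bandwidth_gb_per_s(transfers_mt_per_s: int, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth: transfers per second times bytes per transfer, in GB/s."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_mt_per_s * bytes_per_transfer / 1000  # MT/s * bytes = MB/s; /1000 = GB/s


# Assumption: a 128-bit total memory bus (not stated in the article).
ASSUMED_BUS_WIDTH_BITS = 128

# Memory speeds named in the article, in mega-transfers per second.
memory_speeds = {"LP5-5400": 5400, "LP4x-4267": 4267, "DDR4-3200": 3200}

for name, speed in memory_speeds.items():
    bandwidth = peak_bandwidth_gb_per_s(speed, ASSUMED_BUS_WIDTH_BITS)
    print(f"{name}: {bandwidth:.1f} GB/s peak")
```

Under that assumption LP5-5400 works out to 86.4 GB/s, which matches the quoted figure; these are theoretical peaks, and sustained bandwidth is lower in practice.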
Willow Cove
Willow Cove is Intel's latest-generation CPU microarchitecture, built on the company's latest process advance, 10nm SuperFin technology. Building on the Sunny Cove architecture, Willow Cove greatly improves frequency and power efficiency, delivering more than a generational increase in CPU performance.
In addition, Willow Cove introduces a redesigned cache architecture with a larger, non-inclusive 1.25MB mid-level cache (MLC) and enhances security through Intel Control-Flow Enforcement Technology.
Xe Graphics Architecture
Those who follow Intel are no strangers to the Xe architecture; at CES, Intel revealed its first discrete graphics card for PCs, the DG1. At this Architecture Day, Intel detailed the fully scalable Xe graphics architecture for the first time. The architecture previously comprised three microarchitecture families: Xe-LP, Xe-HP, and Xe-HPC. With the newly announced variant, Xe-HPG, there are now four.
-
Xe-LP is the most efficient architecture for PC and mobile computing platforms. It has up to 96 EUs and a new architecture design, including asynchronous compute, view instancing, sampler feedback, an updated media engine with AV1 support, and an updated display engine.
-
Xe-LP will enable new end-user features such as Instant Game Tuning, capture and streaming, and image sharpening. In terms of software optimization, Xe-LP will improve the driver through a new DX11 path and an optimized compiler.
-
Xe-HP is the industry's first multi-tiled, highly scalable, high-performance architecture that delivers data center-class, rack-level media performance with scalability and AI optimization. Xe-HP covers a dynamic range of computing from one tile to two and four tiles, and its functionality is similar to a multi-core GPU.
At Architecture Day, Intel demonstrated Xe-HP transcoding 10 full streams of high-quality 4K video at 60 FPS on a single tile. It also demonstrated the compute scalability of Xe-HP across multiple tiles.
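To put the transcoding demonstration in perspective, the aggregate pixel rate can be estimated with simple arithmetic. The sketch below assumes that "4K" means 3840x2160 (UHD), which the article does not specify; the function name is illustrative.

```python
def aggregate_pixel_rate(streams: int, width: int, height: int, fps: int) -> int:
    """Total pixels per second processed across all video streams."""
    return streams * width * height * fps


# Assumption: "4K" is UHD, 3840x2160 (not specified in the article).
UHD_WIDTH, UHD_HEIGHT = 3840, 2160

# Figures from the demonstration: 10 streams at 60 frames per second.
pixels_per_second = aggregate_pixel_rate(streams=10, width=UHD_WIDTH, height=UHD_HEIGHT, fps=60)
print(f"{pixels_per_second / 1e9:.2f} gigapixels per second")
```

Under that assumption, the demonstration amounts to just under 5 gigapixels per second of decoded and re-encoded video.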
Leifeng.com learned that the first Xe-HP chip has completed power-on testing in the lab. Intel is now testing Xe-HP with key customers and plans to make Xe-HP available to developers through Intel DevCloud. Xe-HP products will launch next year.
Xe-HPG is Intel's latest Xe microarchitecture variant and is optimized for gaming. It combines the performance-per-watt building blocks of Xe-LP, the scalability of Xe-HP for larger configurations, and the compute frequency optimizations of Xe-HPC.
Xe-HPG also adds a new GDDR6-based memory subsystem to improve performance per dollar, and will support accelerated ray tracing. Xe-HPG is expected to start shipping in 2021.
Xe architecture products will be launched in the next two years. Intel said that the first Xe architecture product DG1 has been put into production and is expected to start delivery in 2020 as planned. DG1 is now available for early access users on Intel DevCloud.
In addition to DG1, Xe products that will soon go into production and ship later this year include the Server GPU (SG1) discrete graphics card for data centers. SG1 aggregates four DG1s to boost performance to data center levels in a very small size, enabling low-latency, high-density Android cloud gaming and video streaming.
Data Center Architecture
Ice Lake
Intel's Ice Lake is expected to launch at the end of this year. It is the first Intel Xeon Scalable processor based on 10nm and brings a series of technologies, including total memory encryption, PCIe Gen 4, 8 memory channels, and enhanced instruction sets that speed up cryptographic operations, providing strong throughput and responsiveness across workloads.
Intel revealed that the Ice Lake series will also have variants for network storage and the Internet of Things.
Sapphire Rapids
Intel will also launch Sapphire Rapids, its next-generation Xeon Scalable processor, based on enhanced SuperFin technology. It will be the CPU used in the Aurora exascale supercomputer system at Argonne National Laboratory in the United States.
Sapphire Rapids provides leading industry-standard technologies, including DDR5, PCIe Gen 5, and Compute Express Link 1.1. It also continues Intel's built-in AI acceleration strategy with a new accelerator called Advanced Matrix Extensions (AMX).
Intel expects Sapphire Rapids to begin initial production shipments in the second half of 2021.
In addition to CPUs and GPUs, Intel's FPGAs are also continuing to evolve.
3
Hybrid Architecture
It is particularly worth mentioning that on Architecture Day, Intel also introduced Alder Lake, the next-generation client product with a hybrid architecture.
Alder Lake will combine Intel's two upcoming architectures, Golden Cove and Gracemont, and will be optimized to deliver excellent performance per watt.
To usher in a new golden decade of computing architecture, in addition to transistor technology and architectural innovation, we also need more advanced packaging technology and a unified software platform to match it.
4
Packaging and software progress
In terms of packaging technology, Intel's test chip using "hybrid bonding" technology was taped out in the second quarter of 2020.
Hybrid bonding is an alternative to the traditional "thermocompression bonding" technology used in most packaging technologies today. This new technology can accelerate the realization of bump pitches of 10 microns and below, providing higher interconnect density, bandwidth and lower power.
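The link between bump pitch and interconnect density can be sketched with simple geometry. The example below assumes bumps laid out on a uniform square grid, and the 50-micron comparison pitch is a hypothetical reference point chosen for illustration, not a figure from the article.

```python
def bumps_per_mm2(pitch_um: float) -> float:
    """Bump density for a uniform square grid with the given pitch in microns."""
    bumps_per_mm_side = 1000 / pitch_um  # 1 mm = 1000 microns
    return bumps_per_mm_side ** 2


# 10 microns is the pitch named in the article; 50 microns is a hypothetical comparison point.
hybrid_bonding_density = bumps_per_mm2(10)
comparison_density = bumps_per_mm2(50)

print(f"10 um pitch: {hybrid_bonding_density:.0f} bumps per mm^2")
print(f"50 um pitch: {comparison_density:.0f} bumps per mm^2")
print(f"density gain: {hybrid_bonding_density / comparison_density:.0f}x")
```

Because density scales with the inverse square of the pitch, shrinking the pitch by 5x yields 25x as many connections per unit area in this simplified model.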
In terms of storage technology, Intel also has a comprehensive range of products to meet different needs.
Among transceiver products, Intel has the world's first next-generation 224G-PAM4 TX transceiver.
In terms of software, Intel released the eighth beta version of oneAPI in July this year, with new features and improvements spanning distributed data analytics, rendering performance, performance profiling, and video and threading libraries.
At Architecture Day, Intel announced that the oneAPI Gold version will be available later this year. It gives developers a solution with production-level quality and performance across scalar, vector, matrix, and spatial architectures.
5
Leifeng.com Summary
Recent news about Intel has raised concerns, including the delay of its 7nm process and Nvidia overtaking it in market value. For a company whose core competitiveness is technology, presenting its latest progress in transistors, architecture, software, and security in detail at Architecture Day 2020 is both a good opportunity to show the outside world its technical strength and the best way to respond to outside doubts.
Raja said in an interview with Leifeng.com last year: "I 100% agree that the next decade will be the new golden decade for computing architecture. In the next 10 years, we will see much more architectural optimization and improvement than in the past 50 years. Through the combination of software and hardware, we can increase Moore's Law tenfold."
Competition in the chip industry has escalated. Can Intel continue to lead?