At its recent Architecture Day 2021 event, Intel unveiled a series of processor architecture innovations. Together with the process roadmap announced at its recent process technology event, including the Intel 7 and Intel 4 nodes, this suggests that under new CEO Pat Gelsinger, the Intel people have been hoping for is "back."
Intel says it is accelerating its pace of innovation. Rather than proceeding step by step under the old tick-tock cadence, it is now innovating on several fronts at once.
With x86 competitor AMD on one side and emerging rivals such as GPUs and AI accelerators on the other, Intel has little choice but to speed up.
At Architecture Day 2021, Intel introduced more than 10 architectural innovations spanning data center, HPC-AI and client scenarios, aiming to meet future workload and computing challenges through products such as CPUs, GPUs and IPUs.
Raja Koduri, senior vice president and general manager of Intel’s Accelerated Computing Systems and Graphics Group, highlighted the importance of architectural advancements in meeting this demand: “Architecture is the alchemy of hardware and software. It combines the advanced transistors required for specific computing engines, connects them through leading-edge packaging technology, integrates high-bandwidth and low-power caches, and equips hybrid computing clusters with high-capacity, high-bandwidth memory and low-latency, scalable interconnects in the package, while ensuring that all software is seamlessly accelerated. As workloads from the desktop to the data center become more dense, complex, and diverse than ever before, these new breakthroughs announced this year also demonstrate how architecture will meet the urgent need for higher computing performance.”
Song Jiqiang, vice president of Intel Research and director of Intel China Research Institute, summarized the major Architecture Day updates along three main lines.
Major innovation in core architecture
To meet the differing performance and power requirements of client devices, Intel has introduced two cores, one optimized for energy efficiency and one for performance: the energy-efficient core and the performance core. This pair of new core microarchitectures is considered Intel's biggest innovation of the past decade, and it will benefit not only client devices but also servers, high-performance computing and other fields.
According to Raja Koduri, the energy-efficient core is a highly scalable x86 microarchitecture that addresses the full range of customer computing needs, from low-power mobile applications to many-core microservices. Compared with Skylake, Intel's most prolific CPU microarchitecture to date, the energy-efficient core delivers 40% more single-thread performance at the same power, or the same performance at less than 40% of Skylake's power. For throughput, four energy-efficient cores deliver 80% more performance than two Skylake cores running four threads while still consuming less power, or the same throughput at 80% less power.
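Each of these comparisons can be reduced to a performance-per-watt ratio relative to Skylake. The short Python sketch below does that arithmetic using only the figures quoted above. The function name is illustrative, the "less than 40%" figure is treated as exactly 40% (so the result is a lower bound), and the outputs are implied ratios, not measured data.

```python
def perf_per_watt_gain(relative_perf: float, relative_power: float) -> float:
    """Performance per watt relative to a baseline of 1.0 performance at 1.0 power."""
    return relative_perf / relative_power


# Single thread: one energy-efficient core vs. one Skylake core.
single_iso_power = perf_per_watt_gain(1.40, 1.00)  # 40% more performance, same power
single_iso_perf = perf_per_watt_gain(1.00, 0.40)   # same performance, 40% of the power (lower bound)

# Throughput: four energy-efficient cores vs. two Skylake cores running four threads.
# The article only says "less power" for the +80% case, so 1.00 is a conservative stand-in.
multi_iso_power = perf_per_watt_gain(1.80, 1.00)
multi_iso_perf = perf_per_watt_gain(1.00, 0.20)    # same throughput, 80% less power

for label, gain in [
    ("single-thread, same power", single_iso_power),
    ("single-thread, same performance", single_iso_perf),
    ("throughput, same (or lower) power", multi_iso_power),
    ("throughput, same performance", multi_iso_perf),
]:
    print(f"{label}: {gain:.2f}x performance per watt")
```

Read this way, the same-performance claims imply a larger efficiency gain (at least 2.5x and 5x) than the same-power claims (1.4x and at least 1.8x), so the two statements in each pair describe different operating points rather than one number stated twice.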
The performance core is not only Intel's highest-performing CPU core to date but also a step-function improvement in CPU architecture performance, one intended to drive the next decade of computing. It is a wider, deeper and smarter architecture that exposes more parallelism, increases execution parallelism, reduces latency and raises general-purpose performance. It also helps support applications with large data sets and large code footprints. At the same frequency, the performance core averages about 19% better performance than the 11th-generation Core architecture (Cypress Cove core) across a range of workloads.
In response to data center and machine learning trends, the performance core adds dedicated hardware, including the new Intel Advanced Matrix Extensions (AMX) for matrix multiplication, which delivers roughly an order-of-magnitude performance gain: AI acceleration improves by about 8 times. AMX is designed for software ease of use and builds on the x86 programming model.
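To make the matrix-multiplication idea concrete, the sketch below shows a tiled multiply-accumulate in plain Python: the output is computed one small tile at a time, and each output tile accumulates products of input tiles. This only illustrates the general blocking pattern that matrix engines are commonly described as using. It is not AMX code, the tile size and function names are assumptions chosen for readability, and it says nothing about real AMX instructions or performance.

```python
TILE = 2  # illustrative tile edge, not the hardware tile size


def get_tile(matrix, row, col, size):
    """Return the sub-block starting at (row, col), clipped at the matrix edge."""
    return [r[col:col + size] for r in matrix[row:row + size]]


def tile_multiply_accumulate(acc, a_tile, b_tile):
    """acc += a_tile @ b_tile for one triple of small tiles."""
    for i in range(len(a_tile)):
        for j in range(len(b_tile[0])):
            acc[i][j] += sum(a_tile[i][k] * b_tile[k][j] for k in range(len(b_tile)))


def tiled_matmul(a, b, tile=TILE):
    """Multiply a (n x k) by b (k x m), one output tile at a time."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            # Accumulator for one output tile; edge tiles may be smaller.
            acc = [[0] * min(tile, m - j) for _ in range(min(tile, n - i))]
            for p in range(0, k, tile):
                tile_multiply_accumulate(
                    acc, get_tile(a, i, p, tile), get_tile(b, p, j, tile)
                )
            for r, row in enumerate(acc):
                c[i + r][j:j + len(row)] = row
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
result = tiled_matmul(a, b)
print(result)
```

The inner accumulate step is the part a dedicated matrix unit would perform in hardware. The surrounding loops, which decide which tiles to load and in what order, stay in software.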
Client innovation - from processor SoC to graphics card
Song Jiqiang emphasized that the two cores are optimized in different directions. Although their focuses differ, they work in synergy.
To make the performance cores and energy-efficient cores work seamlessly with the operating system, Intel developed a hardware thread scheduler, which it calls Intel Thread Director. Built directly into the hardware, it provides low-level telemetry on core state and on each thread's instruction mix, allowing the operating system to place the right thread on the right core at the right time. It is dynamic and adaptive, adjusting its scheduling guidance to real-time computing needs rather than following simple, static rules. Whereas earlier optimizations focused only on battery efficiency, the hardware thread scheduler can also optimize for performance.
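The division of labor described here, with hardware supplying hints and the operating system choosing the core, can be sketched as a toy placement routine. Everything in the Python example below is hypothetical: the telemetry fields, the 0.5 threshold and the fallback order are assumptions made for illustration and do not reflect Intel's actual telemetry format or any real operating system's policy.

```python
from dataclasses import dataclass


@dataclass
class ThreadHint:
    """Hypothetical per-thread telemetry; the field names are assumptions."""
    name: str
    vector_ratio: float  # share of recent instructions that are vector-heavy, 0..1
    foreground: bool     # whether the OS treats the thread as latency-sensitive


@dataclass
class Core:
    name: str
    kind: str            # "P" for performance core, "E" for energy-efficient core
    busy: bool = False


def place(thread, cores):
    """Pick an idle core, preferring the kind that the thread's hint suggests."""
    wants_performance = thread.foreground or thread.vector_ratio > 0.5
    preferred = "P" if wants_performance else "E"
    fallback = "E" if preferred == "P" else "P"
    for kind in (preferred, fallback):
        for core in cores:
            if core.kind == kind and not core.busy:
                core.busy = True
                return core
    return None  # every core is busy; a real OS would queue or preempt


cores = [Core("P0", "P"), Core("E0", "E"), Core("E1", "E")]
threads = [
    ThreadHint("game-render", vector_ratio=0.2, foreground=True),
    ThreadHint("file-indexer", vector_ratio=0.1, foreground=False),
    ThreadHint("video-encode", vector_ratio=0.8, foreground=False),
]
placements = {t.name: place(t, cores).name for t in threads}
print(placements)
```

In this run the foreground thread takes the only performance core, the background indexer goes to an efficient core, and the vector-heavy encoder falls back to the remaining efficient core because its preferred core type is occupied. A dynamic scheduler would revisit such decisions as the hints change, which this one-shot sketch does not attempt.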
These innovations will reach the market soon. Intel combines the hardware thread scheduler, energy-efficient cores and performance cores in a client SoC code-named Alder Lake, which is also Intel's first performance hybrid architecture processor and is built on the Intel 7 process. Alder Lake thus brings a new process and a new architecture together in one product, rather than advancing them in alternating steps as under tick-tock.
Alder Lake will support all client devices from ultra-portable notebooks to enthusiast and commercial desktops, using a single, highly scalable SoC architecture.
Notably, to meet the challenges of such a scalable architecture, Intel designed three independent internal fabrics, each governed by real-time, demand-based heuristics.
These are a compute fabric that supports up to 1000 GBps; an I/O fabric that supports up to 64 GBps; and a memory fabric that delivers up to 204 GBps and can dynamically scale its bus width and speed to suit high-bandwidth, low-latency or low-power operating points.
In addition to CPUs, Intel also announced innovations in the field of desktop graphics cards.
First came the Xe HPG microarchitecture, a new discrete graphics microarchitecture designed to deliver enthusiast-class performance for gaming and content creation workloads. Xe HPG powers the Alchemist family of SoCs, and the first products will arrive in the first quarter of 2022 under the new Intel® Arc™ brand. The microarchitecture is built on the new Xe core, a compute-focused, programmable and scalable building block.
At the same time, Intel announced XeSS, a new upscaling technology that uses XMX AI acceleration to deliver both high performance and high-fidelity visuals. It uses deep learning to synthesize images that come very close to native high-resolution rendering. With XeSS, games that would otherwise be playable only at low quality settings or low resolutions can run smoothly at higher settings and resolutions.
"XeSS embodies Intel's software-first strategy, which is also the core of Intel graphics card design," said Song Jiqiang.
Intel now maintains the drivers for its integrated and discrete graphics products in a unified code base, and it has re-architected the core graphics driver components, in particular the memory manager and compiler. The result is a 15% improvement (up to 80%) in throughput for compute-bound games and a 25% reduction in game load times.
Data center innovation - next-generation Xeon, IPU, and more
The next generation of Intel Xeon Scalable processors is code-named "Sapphire Rapids". At its heart is a tiled, modular SoC architecture that uses Intel's Embedded Multi-die Interconnect Bridge (EMIB) packaging technology, which gives it significant scalability while preserving the benefits of a monolithic CPU interface. Sapphire Rapids provides a single, balanced unified memory access architecture in which every thread has full access to all resources on all tiles, including cache, memory and I/O, yielding consistently low latency and high cross-sectional bandwidth across the entire SoC.
Sapphire Rapids features performance cores and a host of data center-relevant accelerators to boost performance across a variety of customer workloads and use cases. New built-in accelerator engines include:
Intel Accelerator Interfacing Architecture (AIA) – Supports efficient dispatch, synchronization and signaling to accelerators and devices.
Intel Advanced Matrix Extensions (AMX) – A new acceleration engine introduced in Sapphire Rapids that significantly speeds up the tensor processing at the heart of deep learning algorithms.
Intel Data Streaming Accelerator (DSA) – Designed to offload the most common data movement tasks that cause overhead in data center-scale deployments.
The processor is designed to drive industry technology transformation through advanced memory and next-generation I/O, including PCIe 5.0, CXL 1.1, DDR5 and HBM technologies.
As for the Infrastructure Processing Unit (IPU), Intel's dedicated infrastructure processor, the company believes that "a single product cannot meet all needs". It has therefore explored its IPU architecture in greater depth and introduced a family of IPU products designed to cope with the complexity of diverse data centers.
Intel's IPU architecture offers the following advantages. Strong separation of infrastructure functions from customer workloads gives customers full control of the CPU. Cloud operators can offload infrastructure tasks to the IPU, maximizing CPU utilization and revenue. The IPU can also manage storage traffic, which reduces latency and makes efficient use of storage capacity through a diskless server architecture. With the IPU, customers can make better use of resources through a secure, programmable and stable solution that lets them balance processing and storage.