NVIDIA DRIVE AGX is a scalable, open computing platform that serves as the brain of an autonomous vehicle, providing high-performance, energy-efficient computing for functionally safe AI-based driving. On the hardware side, the NVIDIA DRIVE embedded supercomputing platform processes data from camera, radar, and lidar sensors to perceive the surrounding environment, localize the car on the map, and then plan and execute a safe driving route. On the software side, NVIDIA DRIVE AGX is scalable and software-defined, giving autonomous vehicles the performance needed to process large amounts of sensor data and make real-time driving decisions. The open NVIDIA DRIVE software stack helps developers use redundant and diverse deep neural networks (DNNs) to build perception, mapping, planning, and driver monitoring functions, and the platform becomes more capable over time through continuous iteration and over-the-air updates. The open NVIDIA DRIVE SDK provides developers with the building blocks and algorithm stacks required for autonomous driving, helping them build and deploy applications more efficiently, including perception, localization and mapping, planning and control, driver monitoring, and natural language processing. This article is divided into several chapters and takes Orin-X, the most widely used mainstream NVIDIA chip, as an example to explain development and application from both the hardware and software perspectives.
1. NVIDIA internal architecture design
Taking Orin-X as an example, the CPU consists of a main CPU complex based on Arm Cortex-A78AE, which provides general-purpose high-speed computing, and a functional safety island (FSI) based on Arm Cortex-R52, which provides isolated on-chip computing resources and reduces the need for an external ASIL D functional safety CPU.
The GPU is an NVIDIA Ampere GPU, which provides parallel computing for CUDA and supports tools such as TensorRT, a deep learning inference optimizer and runtime that delivers low latency and high throughput. Ampere also provides state-of-the-art graphics capabilities, including real-time ray tracing. Domain-specific hardware accelerators (DSAs) are a set of dedicated hardware engines designed to offload computing tasks from the general-purpose compute engines and run them with high throughput and high energy efficiency.
The following diagram shows the high-level architecture of the SoC, divided into three main processing complexes: CPU, GPU, and hardware accelerators.
The chip's internal architecture is organized into functional blocks: the underlying QNX BSP (clock sources and system restart; CAN, SPI, I2C, GPIO, and UART controllers; configuration registers; system configuration); the QNX real-time operating system (RTOS); the Nv multimedia processing module (sensor processing MCU (R5), PVA, DLA, audio processor, and an R5 MCU that configures real-time camera input); the classic AUTOSAR processing module (for the lock-step R52 cores of the safety island); the safety services (Arm Cortex-A78AE CPU complex, coherent CPU switch fabric, and the PSC for information security); and the neural network processing module (CUDA and TensorRT).
2. Typical architecture design of autonomous driving based on NVIDIA chips
A conventional system architecture usually pairs the SoC with an MCU in a dual-chip, or even triple-chip, design. Because of its advantage in computing performance, the SoC is generally better suited than the MCU to compute-heavy tasks such as front-end perception and planning.
The MCU, thanks to its high functional safety level, can verify the outputs used for control execution. The industry has long been divided on whether NVIDIA chips can work as standalone super-heterogeneous chips, like TDA4, and take on all tasks independently. In principle, both the Xavier and Orin series offer rich AI and CPU computing capabilities, which are sufficient for the entire solution design of autonomous driving systems at L2+ and above.
So, is the industry actually adopting such a single-chip design? The answer is no.
According to the latest safety requirements in NVIDIA's datasheet, the recommended architecture for the Orin series still requires a dedicated MCU for failure analysis and risk assessment, so that serious system failures can be located promptly and the safety integrity requirements for autonomous driving defined by ISO 26262 are met (this will be explained separately later). An external MCU also greatly improves power management for the whole domain controller, including entering and exiting sleep mode.
An MCU set up in this way can be called a Safe MCU (SMCU). System development requires an MCU with a high safety level (generally ASIL D), such as the Infineon AURIX TC series or the Renesas RH850 series, which can serve as the SMCU connected to Orin. Such an SMCU handles power control and helps avoid serious failures for the entire system.
The figure above shows the three-layer fail-safe framework of a system architecture based on NVIDIA chips. The architecture implements three levels of fail-safe protection spanning the basic services of the SoC layer, the operating system, the virtual machine, the runtime environment, and the real-time operating environment of the MCU. The SoC layer and the MCU layer perform independent health and watchdog monitoring at the NvIVC, NvIPC, and SPI/error pin levels, respectively. The SoC itself provides part of the lockstep safety checking through the lockstep FSI and runs the hypervisor on the main CPU complex, CCPLEX (the Carmel CPU complex that runs the capture stack and applications). The CPU cores run the QNX operating system, which has a high functional safety level, to schedule resources for the application software watchdog, middleware, application layer software, and driver software. The real-time environment still runs on classic AUTOSAR.
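The watchdog monitoring described above can be illustrated with a minimal sketch. This is not NVIDIA code: the class `HeartbeatWatchdog`, the 0.1 s timeout, and the use of the names NvIVC, NvIPC, and SPI as plain channel labels are assumptions made for illustration. It only shows the general idea of a supervisor that flags a monitored channel once its heartbeat has been missing for longer than a deadline.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatWatchdog:
    """Hypothetical supervisor that tracks the last heartbeat per channel."""
    timeout_s: float
    last_seen: dict = field(default_factory=dict)

    def kick(self, channel: str, now: float) -> None:
        # Record a heartbeat from a monitored channel at time `now` (seconds).
        self.last_seen[channel] = now

    def expired(self, now: float) -> list:
        # Return the channels whose last heartbeat is older than the timeout.
        return sorted(
            ch for ch, t in self.last_seen.items() if now - t > self.timeout_s
        )


# Illustrative channel labels only; they mirror the levels named in the text.
wd = HeartbeatWatchdog(timeout_s=0.1)
for channel in ("NvIVC", "NvIPC", "SPI"):
    wd.kick(channel, now=0.00)
wd.kick("NvIVC", now=0.08)
wd.kick("SPI", now=0.09)

# NvIPC has been silent since t = 0, so it is the only expired channel.
late = wd.expired(now=0.15)
print(late)
```

In a real system the reaction to an expired channel (for example, asserting an error pin toward the MCU) would be defined by the safety concept; the sketch stops at detection.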
The security architecture shown in the figure below illustrates how the external MCU supports the boot data flow on the SoC and how secure boot is enforced through a standard error reporting and propagation data flow. The program and data boot loading process has three levels: Boot L1 CCPLEX, Boot L2 FSI, and Boot L3 External MCU.
Boot chain design
During Level 1 boot, the lowest-level stage uses the Boot and Power Management Processor (BPMP), a small Arm core at the heart of the system, to load the low-level boot program onto the BPMP server; the hypervisor or the safety OS then calls the corresponding boot program files. The Cortex-R5 in the BPMP provides:
1. Lock-step core pairing
2. Armv7-R ISA
3. Vector interrupt support: Based on daisy-chain Arm PL192 vector interrupt controller (AVIC)
4. TCM interface for local SRAM
5. Full instruction and data caches (32 KB I-cache and 32 KB D-cache)
6. Arm processor revision
At the same time, the underlying iGPU core is driven by the RM integrated server. Finally, the first-level boot stage, L1 CCPLEX (NVIDIA's name for the CPU complex, built from high-performance 64-bit Arm cores), completes tasks such as operating system task scheduling, boot manager loading, and driving the GPU core through the RM server.
Level 2 mainly involves the functional safety island (FSI) mentioned earlier, which is explained separately later in this article.
Finally, the external SMCU can provide an additional layer of security protection and boot management configuration, so that the entire chip can be fully driven from a security perspective.
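The three boot levels and the error reporting flow can be sketched as a simple staged check. The following Python model is a hedged illustration, not the actual boot code: the function `run_boot_chain`, the placeholder check functions, and the assumption that the levels are evaluated strictly in the order L1, L2, L3 are all hypothetical. It shows one common pattern, stopping at the first failing stage and propagating a report.

```python
from typing import Callable, List, Tuple

# Stage names follow the three levels named in the text; the check functions
# are placeholders, not real boot code.
Stage = Tuple[str, Callable[[], bool]]


def run_boot_chain(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order; stop at the first failure and report it."""
    log: List[str] = []
    for name, check in stages:
        if not check():
            log.append(f"{name}: FAILED")
            return False, log
        log.append(f"{name}: ok")
    return True, log


healthy = [
    ("Boot L1 CCPLEX", lambda: True),
    ("Boot L2 FSI", lambda: True),
    ("Boot L3 External MCU", lambda: True),
]
ok, log = run_boot_chain(healthy)

# Simulate a failure in the FSI stage: later stages are never reached.
broken = [
    ("Boot L1 CCPLEX", lambda: True),
    ("Boot L2 FSI", lambda: False),
    ("Boot L3 External MCU", lambda: True),
]
bad_ok, bad_log = run_boot_chain(broken)
print(log)
print(bad_log)
```

Stopping at the first failure keeps the report unambiguous: the last log entry always names the stage that needs attention.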
3. Functional Safety Island Design Principles
Figure 2 shows how the FSI and the boot programs of the related low-level module drivers are loaded in NVIDIA chips. For functional safety, the Orin series targets ASIL D for systematic capability and ASIL B/D for random fault management. This includes decomposing ASIL requirements from the SoC hardware to each core, ensuring that inter-core design consistency meets ASIL D requirements, applying the standard ASIL D development process to the entire functional safety design, and carrying out bottom-up safety design for the safety process, DRIVE AGX, the DRIVE OS operating system, DriveWorks, sensors, redundant architecture, and safety strategies.
The functional safety island (FSI) of NVIDIA chips is a processor cluster containing Cortex-R52 and Cortex-R5F real-time processor cores with dedicated I/O controllers. For example, the FSI in Orin-X has its own voltage rail, oscillator, PLL, and SRAM, which minimizes interaction with other modules inside the SoC and ensures freedom from interference between them.
Orin-X series FSI features include:
The Cortex-R52 processor, also known as the safety CPU, has 4 cores in DCLS (dual-core lockstep) mode (8 physical cores in total). It can run the classic AUTOSAR operating system, implement error handling, system fault handling, and other customer workloads, and delivers a combined performance of approximately 10K DMIPS.
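The dual-core lockstep idea can be shown with a small software analogy. In hardware, the comparison is done by dedicated logic on the outputs of two cores executing the same instruction stream; the Python sketch below only mimics that principle at function level. The names `run_in_lockstep`, `LockstepMismatch`, `square`, and `faulty_square` are hypothetical and chosen for illustration.

```python
from typing import Callable


class LockstepMismatch(Exception):
    """Raised when the two redundant executions disagree."""


def run_in_lockstep(primary: Callable[[int], int],
                    shadow: Callable[[int], int],
                    value: int) -> int:
    # Execute the same input on both "cores" and compare the results,
    # loosely mirroring what a hardware comparator does.
    a = primary(value)
    b = shadow(value)
    if a != b:
        raise LockstepMismatch(f"primary={a}, shadow={b}")
    return a


def square(x: int) -> int:
    return x * x


def faulty_square(x: int) -> int:
    # Simulates a single bit flip in the result of the shadow core.
    return (x * x) ^ 1


# Each lockstep pair behaves as one logical core: 8 physical -> 4 logical.
PHYSICAL_CORES = 8
LOGICAL_CORES = PHYSICAL_CORES // 2

result = run_in_lockstep(square, square, 12)
try:
    run_in_lockstep(square, faulty_square, 12)
    fault_detected = False
except LockstepMismatch:
    fault_detected = True
print(result, LOGICAL_CORES, fault_detected)
```

The trade-off is visible in the core count: half of the physical cores are spent on redundancy, in exchange for detecting faults that would otherwise silently corrupt a result.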