Just 100 milliseconds after power is applied to a standard PCI Express® (PCIe) system, the system's root component begins scanning the bus to understand the topology and, in the process, initiates configuration. If a PCIe device is not ready to respond to configuration requests, the root component cannot find the PCIe device and assumes it does not exist. The device cannot join the PCIe bus system. [1]
The situation in automotive applications is similar. In a CAN-based network, ECUs enter a sleep mode in which they stop running and are disconnected from power. Only a small portion of the circuitry stays powered to detect a wake-up signal. Once a wake-up event occurs, the ECU reconnects power and begins booting. Some messages may be missed in the first 100 milliseconds after the wake-up event, but after this time every ECU on the network must be fully operational.
Intensive R&D work between Xilinx Automotive, Xilinx Research Labs, and the Karlsruhe Institute of Technology in Germany is addressing this issue with a two-step configuration method for FPGAs.
Technology trends in the semiconductor industry have enabled FPGA vendors to significantly increase the resources in their devices. However, bitstream sizes have increased proportionally, and so has the time required to configure the device. As a result, even medium-sized FPGAs cannot meet stringent startup timing requirements using low-cost configuration schemes. Figure 1 shows the configuration times for different Xilinx® Spartan®-6 FPGA devices using the low-cost SPI/Quad-SPI configuration interface. Even with the fastest of these schemes (Quad-SPI running at a 40 MHz configuration clock), only small FPGA devices can meet the 100 ms startup requirement. The requirement is even harder to meet for Xilinx Virtex®-6 devices, because these devices offer more FPGA resources and therefore have larger bitstreams.
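The relationship between bitstream size, bus width, and configuration clock can be sketched with a simple lower-bound estimate. This is a minimal sketch, not a vendor formula: the function name and the 1.5 MB example size are hypothetical, and the estimate ignores power-on initialization and any flash read overhead, so measured times will be longer.

```python
def ideal_config_time_ms(bitstream_bytes: int, bus_width_bits: int, clock_hz: float) -> float:
    """Lower bound on configuration time: total bits / (bits per clock * clock rate)."""
    if bitstream_bytes < 0 or bus_width_bits <= 0 or clock_hz <= 0:
        raise ValueError("size must be >= 0; bus width and clock must be > 0")
    total_bits = bitstream_bytes * 8
    seconds = total_bits / (bus_width_bits * clock_hz)
    return seconds * 1e3


# Hypothetical 1.5 MB bitstream at a 40 MHz configuration clock.
# Bus widths of 1, 2 and 4 bits correspond to SPI, Dual-SPI and Quad-SPI.
EXAMPLE_SIZE_BYTES = 1_500_000  # assumption for illustration only
for width in (1, 2, 4):
    t = ideal_config_time_ms(EXAMPLE_SIZE_BYTES, width, 40e6)
    print(f"x{width}: {t:.1f} ms")
```

Under these assumptions, only the 4-bit bus gets the hypothetical bitstream under 100 ms, and then only barely. This is why reducing the initial bitstream size is attractive.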
To overcome this challenge, Fast Startup configures the FPGA device in two steps instead of a single step (whole chip) full device configuration. Following this novel approach, our strategy is to load only timing-critical modules at power-up using the highest priority bitstream, followed by non-timing-critical modules. This approach minimizes the initial configuration data, thereby minimizing the boot time of the FPGA device for timing-critical designs.
Fast Startup vs. Partial Reconfiguration
Fast Startup allows the FPGA design to start up the critical modules of the design as quickly as possible, much faster than the standard full configuration method [2]. Although, in essence, Fast Startup utilizes partial reconfiguration, it is different from the traditional concept of this method. The original intention of partial reconfiguration is to use the complete design as an initial configuration that can be modified at run time. In contrast, Fast Startup already uses an initial partial bitstream to configure only a specific (small) area of the FPGA device at power-up. The first configuration contains only those parts of the complete FPGA design that must be configured and run quickly. The rest is configured later, at run time, using partial reconfiguration. Figure 2 illustrates this sequential concept.
Tool Flow Overview
The Fast Startup tool flow relies on a design preservation flow to create partial bitstreams for both timing-critical and non-timing-critical subsystems.
The design preservation flow partitions the FPGA design into logical modules (called “partitions”). Partitions form hierarchical boundaries that isolate the internal modules from other components in the design. Once a partition is implemented (i.e., placement and routing is complete), it can be imported by other implementation runs to implement the partitioned modules in exactly the same way in each instance [3].
Therefore, the first step using the Fast Startup methodology is to partition the complete FPGA design into two parts: a high-priority partition containing the timing-critical subsystems and a low-priority partition for the remaining components.
Figure 2 – Fast Startup concept: sequential configuration
There are some general design considerations to get the smallest possible partial bitstream for a high priority partition. First, the partition must contain only components that are either timing critical or that are needed by the system to perform partial reconfiguration of a low priority portion (such as an ICAP). The key to getting a small initial partial bitstream is to implement the high priority partition using the smallest possible area. That is, you must confine the partition to an appropriate region in the FPGA.
Choose a physical location in the FPGA where this region provides the resources the design requires. Accessing resources outside of this region is possible but not encouraged, although it is generally unavoidable for I/O pins. When choosing the region, also keep in mind that it may block resources needed by the non-timing-critical portions of the FPGA design.
Once you have partitioned the FPGA and have found appropriate regions for these partitions, the next step is to implement the high priority partition using an empty (black box) low priority partition. The resulting bitstream contains many configuration frames for unused resources. You can remove these frames to get a valid partial bitstream for the initial configuration of the high priority partition. [4]
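The frame-removal step can be illustrated with a toy model. This is only a conceptual sketch: the `Frame` type, the linear `index`, and the contiguous range filter are hypothetical stand-ins for "frames that configure used resources". A real partial bitstream also needs frame-address commands and CRC handling, which are omitted here.

```python
from typing import List, NamedTuple


class Frame(NamedTuple):
    index: int   # hypothetical linear frame index, not a real frame address
    data: bytes  # configuration data carried by this frame


def keep_region_frames(frames: List[Frame], first_index: int, last_index: int) -> List[Frame]:
    """Keep only frames inside the high-priority region [first_index, last_index]."""
    return [f for f in frames if first_index <= f.index <= last_index]


# Toy full-device bitstream: 10 frames of 4 bytes each.
full = [Frame(i, bytes([i]) * 4) for i in range(10)]

# Assume the high-priority partition occupies frames 3 to 5.
partial = keep_region_frames(full, 3, 5)

full_size = sum(len(f.data) for f in full)
partial_size = sum(len(f.data) for f in partial)
print(f"{len(full)} frames ({full_size} bytes) -> {len(partial)} frames ({partial_size} bytes)")
```

The smaller the region that holds the high-priority partition, the fewer frames survive the filter, which is why the area constraint matters for the size of the initial bitstream.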
Implementation of Low Priority Partition
To create the partial bitstream for the low priority partition, first, you create an implementation of the complete FPGA design with both partitions, the high priority partition and the low priority partition. Import the high priority partition from the previous implementation so that its implementation is the same as the original one.
For Virtex-6 devices, the partial reconfiguration (PR) flow can be used for all of the above implementations. It automatically generates the partial bitstream for the low priority partition. Since the Spartan-6 device family does not support the PR flow, we used the BitGen option for difference-based partial reconfiguration to obtain the partial bitstream for the low priority partition when implementing Fast Startup for Spartan-6 designs. [5] Figure 3 gives a high-level overview of the tool flow.
Figure 3 – Fast Startup tool flow
To verify the Fast Startup configuration method in hardware, our research group implemented this method on a Virtex-6 ML605 board and a Spartan-6 SP605 board.
The Virtex-6 implementation is a video application. When users switch on a video system, they expect it to respond immediately rather than after several seconds. Therefore, in the system shown in Figure 4, a high-priority subsystem equipped with a TFT controller lights up the TFT screen quickly. For the remaining low-priority functions, the second design provides control of and access to the Ethernet core, UART, and hardware timers.
To facilitate expansion and cleanly isolate the two partitions, an AXI-to-AXI bridge was used. This also minimized the number of nets that cross the boundary between the two design partitions. The low-priority partition shares the system clock with the high-priority partition.
Table 1 shows the FPGA resource utilization, and Table 2 shows the configuration time for the traditional boot method, the boot method with only the compressed bitstream of the high-priority partition [6], and the Fast Startup configuration method. Each method uses the BPIx16 configuration interface, and the configuration rate (this option determines the target configuration clock frequency) is 2 MHz and 10 MHz. We measured this data using an oscilloscope to capture the FPGA’s “init” and “done” signals. The “Compressed” column in Table 2 shows the compressed bitstream of only the high priority partition. The compressed bitstream for the complete FPGA design with two partitions would be 3.1 Mbytes.
Table 1 – FPGA resource utilization per partition

| Resource type | High priority | % | Low priority | % |
| --- | --- | --- | --- | --- |
| Flip-flops | 8,849 | 2.9 | 1,968 | 0.7 |
| Lookup tables | 7,039 | 4.7 | 2,197 | 1.5 |
| I/O | 135 | 22.5 | 20 | 3.3 |
| RAMB36s | 34 | 8.2 | 2 | 0.5 |
Table 2 – Configuration times for the XC6VLX240T by configuration method

| Configuration interface | Traditional (8.9 MB) | Compressed (2.0 MB) | Fast Startup (1.4 MB) |
| --- | --- | --- | --- |
| BPIx16 CR2 | 1,740 ms | 389 ms | 278 ms |
| BPIx16 CR10 | 450 ms | 112 ms | 84.4 ms |
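The savings implied by the Table 2 measurements can be worked out directly. The short sketch below copies the measured times from the table. The dictionary layout and function name are illustrative choices, not part of the original tool flow.

```python
# Measured configuration times in ms, copied from Table 2 (XC6VLX240T, BPIx16).
TABLE2_MS = {
    "CR2": {"traditional": 1740.0, "compressed": 389.0, "fast_startup": 278.0},
    "CR10": {"traditional": 450.0, "compressed": 112.0, "fast_startup": 84.4},
}


def reduction_percent(baseline_ms: float, new_ms: float) -> float:
    """Relative reduction of new_ms compared with baseline_ms, in percent."""
    return (baseline_ms - new_ms) / baseline_ms * 100.0


for rate, times in TABLE2_MS.items():
    base = times["traditional"]
    for method in ("compressed", "fast_startup"):
        pct = reduction_percent(base, times[method])
        print(f"{rate} {method}: {pct:.1f}% faster than traditional")
```

For the CR2 setting, Fast Startup cuts the configuration time by about 84% relative to the traditional full bitstream. At CR10 the reduction is about 81%, and only Fast Startup falls below 100 ms.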
To validate the Fast Startup approach for Spartan-6, we chose an ECU application scenario in the automotive field. Whenever you see an FPGA device in an automotive electronic control unit, it is generally used only by the main application processing unit of the ECU (see Figure 5). Our goal was to implement a design that puts the system processor into the FPGA. This way we can avoid the need for an external processor, thereby reducing the cost, complexity, space and power consumption of the entire system.
For this scenario, system partitioning is obvious. We split our ECU design into a system processor part as a high priority partition and an application processing part as a low priority partition.
This design has many similarities to the Virtex-6 design. The differences are that we use SPI instead of BPI as the interface to the external flash memory, and that a CAN controller replaces the TFT controller. After power-up, the system controller has only a limited time to boot and be ready to handle the first communication data. Since the ECU uses the CAN bus for communication, this boot time is typically limited to 100 milliseconds. With traditional configuration methods, it is difficult to meet such a tight timing requirement using a large Spartan-6 with a low-cost configuration interface such as SPI or Quad-SPI. Using a faster and more expensive configuration interface is unacceptable in the automotive field.
Measurement Setup
For the SP605 automotive ECU demonstration, we performed measurements using the lab setup shown in Figure 6. On the left side of the figure is a Spartan-3-based X1500 automotive platform that implements a network packet generator for the CAN bus. The generator can send and receive CAN messages and uses a hardware timer to measure the time between CAN messages. On the right is the target platform, which is not connected to the CAN bus directly but through a CAN transceiver on an additional custom board. In addition to providing the CAN PHY, this custom board also controls the power supply of the target board.
Since no receiver acknowledges the message sent by the network sender, the message is repeated immediately until the FPGA has completed its configuration and has configured the CAN core with a valid baud rate. Once the CAN core of the Spartan-6 design acknowledges the message, the CAN core of the network sender triggers an interrupt, which stops the hardware timer. The timer then holds the boot time of the SP605 design. Measurements with an additional hardware timer inside the SP605 design show that the software startup time is negligible when the software that configures the CAN core executes from built-in BRAM memory.
Table 3 shows the FPGA resource consumption for each partition. The percentages are relative to the total resources available in the XC6SLX45T device used.
Table 3 – FPGA resource consumption per partition

| Resource type | High priority | % | Low priority | % |
| --- | --- | --- | --- | --- |
| Flip-flops | 3,480 | 6% | 1,941 | 4% |
| Lookup tables | 3,507 | 13% | 1,843 | 7% |
| I/O | 58 | 20% | 20 | 7% |
| RAMB | 12 | 10% | 2 | 2% |
Table 4 – Configuration times for the Spartan-6 design by configuration interface and method

| Configuration interface | Traditional (1,450 KB) | Compressed (920 KB) | Fast Startup (314 KB) |
| --- | --- | --- | --- |
| SPIx1 CR2 | 5,297 ms | 3,382 ms | 1,157 ms |
| SPIx1 CR26 | 292 ms | 196 ms | 85 ms |
| SPIx2 CR2 | 2,671 ms | 1,699 ms | 596 ms |
| SPIx2 CR26 | 161 ms | 113 ms | 58 ms |
| SPIx4 CR2 | 1,348 ms | 872 ms | 311 ms |
| SPIx4 CR26 | 97 ms | 73 ms | 45 ms |
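To see which combinations satisfy the 100 ms CAN startup budget mentioned earlier, the measured Spartan-6 times above can be filtered as follows. This is a small sketch with illustrative names. The values are copied from the table, and the 100 ms threshold is the budget stated in the text.

```python
BUDGET_MS = 100.0  # CAN startup budget stated in the article

METHODS = ("traditional", "compressed", "fast_startup")

# Measured times in ms per (bus, configuration rate), in METHODS column order.
SPARTAN6_MS = {
    ("SPIx1", "CR2"): (5297, 3382, 1157),
    ("SPIx1", "CR26"): (292, 196, 85),
    ("SPIx2", "CR2"): (2671, 1699, 596),
    ("SPIx2", "CR26"): (161, 113, 58),
    ("SPIx4", "CR2"): (1348, 872, 311),
    ("SPIx4", "CR26"): (97, 73, 45),
}


def interfaces_within_budget(method: str, budget_ms: float = BUDGET_MS) -> list:
    """Return the interface settings whose measured time meets the budget."""
    column = METHODS.index(method)
    return [
        f"{bus} {rate}"
        for (bus, rate), times in SPARTAN6_MS.items()
        if times[column] <= budget_ms
    ]


for method in METHODS:
    print(method, "->", interfaces_within_budget(method))
```

With the traditional and compressed bitstreams, only Quad-SPI at the CR26 setting meets the budget. With Fast Startup, even single-bit SPI at CR26 does.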
in Hardware
The advanced configuration method we developed can be called prioritized FPGA startup because it configures the device in two steps. This method is not only essential to address the challenge of increasing configuration time in modern FPGAs, but it can also be used in many modern applications such as PCI Express or CAN-based automotive systems.
In addition to proposing a high-priority initial configuration method, we also validated this method in hardware. We used and tested the tool flow and methodology for Fast Startup to implement a CAN-based automotive ECU on a Spartan-6 evaluation board (SP605) and a video design on a Virtex-6 prototype board. This approach reduces the initial bitstream size and cuts configuration time by up to 84% compared with the standard full configuration scheme.
Xilinx will support the Fast Startup concept for PCI Express applications in software for 7 series FPGAs and simplify its use with an optimized implementation. In 7 series, the new two-step bitstream approach is the simplest and lowest cost approach to implement. When designing an FPGA, the user can implement a two-stage bitstream with a simple software switch. The first stage of this bitstream contains only the configuration frames required to configure the timing critical blocks. During configuration, an FPGA STARTUP sequence is generated and the critical blocks become active, making the 100 ms timing requirement easily met. While the timing critical blocks are running (e.g., the PCI Express enumeration/configuration system process is in progress), the rest of the FPGA configuration is loaded. The two-stage bitstream approach enables the use of inexpensive Flash devices to store the bitstream.