Designing Reliable Electronic Systems for Harsh Environments: Lessons from the Space Environment
By Warren Miller, Mouser Electronics
Even without the scary aliens, space is one of the most hostile environments for humans. The lack of pressure, extreme high and low temperatures, energetic particles, and radiation all pose challenges. Space is also one of the most hostile environments for electronic system design. Space electronics must withstand a near-perfect vacuum and operate at extreme temperatures. Power is at a premium in space exploration projects (there is no such thing as a long extension cord), so low-power operation is often required. Energetic particles can also "flip" the logic state of data stored in SRAM, whether it holds MCU code or FPGA configuration logic, causing the system to malfunction, perhaps by firing an explosive bolt by mistake or by moving an arm too fast and damaging a robot. In short, space is a friendly environment for neither humans nor electronic systems.
Figure 1: Space is a harsh environment for both humans and electronic systems. (Mouser Electronics)
Design of electronic systems in harsh environments
At first glance, it seems unlikely that an electronic system could be designed to perform billions of operations correctly in such a harsh environment. For example, what do you do when a single event upset (SEU) changes the state of an SRAM cell in an MCU or FPGA? Fortunately, such upsets are uncommon, and the circuits perform correctly most of the time. Still, designers need to seriously consider several techniques to mitigate the potentially catastrophic consequences of SEU-induced failures.
Mitigating SEUs in SRAM
A simple approach is to reduce the use of vulnerable components in the device. For example, if the MCU has cache memory, it may be better to bypass it. Caches are usually designed for maximum speed and minimum size, so their small cells hold little charge and are more easily upset by a particle strike. Designers can also reduce the amount of SRAM the application uses. Registers, accumulators, and peripheral buffer memories are sometimes implemented with latches, which have better SEU resistance. Understanding which components in a device are vulnerable can often lead to designs with significantly improved resistance to SEUs.
However, designers do need some SRAM in their designs, so techniques exist to determine whether the memory has been hit by an SEU. Some MCUs support SRAM parity checking, which can detect single-bit errors. Even better is error detection and correction built into the SRAM. By adding a few check bits to each word, single-bit errors can be corrected and double-bit errors detected. It may seem counterintuitive to add SRAM bits in order to improve reliability, but error correction codes have been shown to improve reliability significantly without noticeably affecting memory access time.
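The difference between parity checking and single-error-correcting, double-error-detecting (SECDED) codes can be illustrated with a small sketch. The example below is only an illustration under stated assumptions: it protects a 4-bit value with an extended Hamming (8,4) code, whereas real SRAM ECC works on wider words and is implemented in hardware. The function names `parity`, `secded_encode`, and `secded_decode` are hypothetical and do not come from any vendor library.

```python
def parity(word: int) -> int:
    """Return the even-parity bit for a word (1 if it has an odd number of set bits)."""
    return bin(word).count("1") & 1


def secded_encode(nibble: int) -> int:
    """Encode a 4-bit value into an 8-bit extended Hamming (SECDED) codeword.

    Bit positions 1..7 hold a Hamming(7,4) codeword: check bits at
    positions 1, 2, 4 and data bits at positions 3, 5, 6, 7.
    Bit position 0 holds an overall parity bit.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    bits = [0] * 8
    bits[3], bits[5], bits[6], bits[7] = d
    bits[1] = bits[3] ^ bits[5] ^ bits[7]   # covers positions 1, 3, 5, 7
    bits[2] = bits[3] ^ bits[6] ^ bits[7]   # covers positions 2, 3, 6, 7
    bits[4] = bits[5] ^ bits[6] ^ bits[7]   # covers positions 4, 5, 6, 7
    bits[0] = sum(bits[1:]) & 1             # makes the total number of ones even
    return sum(b << i for i, b in enumerate(bits))


def secded_decode(codeword: int):
    """Return (nibble, status); status is 'ok', 'corrected', or 'double_error'."""
    bits = [(codeword >> i) & 1 for i in range(8)]
    # The syndrome is the XOR of the positions of all set bits in 1..7.
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos]:
            syndrome ^= pos
    overall = sum(bits) & 1  # 0 when the overall parity is consistent
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # An odd number of flips is treated as a single error at position
        # `syndrome` (position 0 is the overall parity bit itself).
        bits[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a nonzero syndrome: detectable, not correctable.
        return None, "double_error"
    nibble = bits[3] | (bits[5] << 1) | (bits[6] << 2) | (bits[7] << 3)
    return nibble, status
```

With this sketch, `secded_decode(secded_encode(0b1011) ^ (1 << 5))` recovers the original value and reports `"corrected"`, while flipping two bits yields `"double_error"`. A lone parity bit, by contrast, only reveals that an odd number of bits changed and cannot say which one.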
Another way to mitigate SEUs in SRAM is block-based coding, which is also commonly used in noisy communication channels. It requires fewer additional bits, but the computation time is significantly higher. If the application has idle cycles and ample power margin, designers can add a "scrubbing" (cleaning) operation that periodically scans the SRAM to detect whether any bits have flipped. This method does not provide real-time multi-bit error detection and correction, but if the data is accessed infrequently (perhaps it is only transferred between caches) and does not need real-time access, it can significantly increase reliability.
Mitigating SEUs in FPGAs
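The periodic scan described above might look roughly like the following sketch. It is a simplified model, not a description of any particular MCU: the memory is a Python `bytearray`, each block is protected by a CRC-32 computed with the standard `zlib` module, and the block size and the names `compute_checksums` and `scrub` are assumptions made for illustration. The sketch only detects corrupted blocks; recovery (for example, reloading from a backup copy) is left out.

```python
import zlib

BLOCK_SIZE = 64  # bytes per protected block (illustrative value, an assumption)


def compute_checksums(memory: bytearray) -> list:
    """Record a CRC-32 for each block while the contents are known to be good."""
    return [zlib.crc32(memory[i:i + BLOCK_SIZE])
            for i in range(0, len(memory), BLOCK_SIZE)]


def scrub(memory: bytearray, checksums: list) -> list:
    """Scan every block and return the indices of blocks whose CRC no longer matches."""
    corrupted = []
    for index, expected in enumerate(checksums):
        start = index * BLOCK_SIZE
        if zlib.crc32(memory[start:start + BLOCK_SIZE]) != expected:
            corrupted.append(index)
    return corrupted


# Usage: simulate a single-bit upset in byte 130, which lies in block 2.
memory = bytearray(256)
checksums = compute_checksums(memory)
memory[130] ^= 0x08
upset_blocks = scrub(memory, checksums)
```

One 32-bit checksum per 64-byte block costs far fewer extra bits than per-word check bits, but every scan has to read and process the whole block, which reflects the trade-off between storage overhead and computation time.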
In MCUs, designers can mitigate SEUs in SRAM by minimizing the use of SRAM and by using error detection and correction techniques. In SRAM-based FPGAs, however, SRAM cells are distributed throughout the FPGA fabric to configure logic and routing resources. This makes it difficult to avoid SRAM, and the cost of adding error detection and correction circuits is high. SRAM-based FPGAs offer some support for "scrubbing" (cleaning) the configuration SRAM by periodically comparing it with an external non-volatile configuration store and flagging any changes. However, this consumes considerable time, bandwidth, and power, so it is impractical for many applications.
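The comparison against an external configuration store can be sketched as follows. This is a conceptual model only: configuration frames are represented as Python `bytes` objects, the function name `scrub_configuration` is hypothetical, and the in-place repair step is an added assumption. Real devices perform readback through vendor-specific interfaces that are not modeled here.

```python
def scrub_configuration(readback_frames: list, golden_frames: list) -> list:
    """Compare each frame read back from the FPGA with the golden copy.

    Mismatching frames are rewritten in place from the golden copy (an
    assumption of this sketch), and their indices are returned so the
    system can log or react to the upsets.
    """
    if len(readback_frames) != len(golden_frames):
        raise ValueError("frame count mismatch")
    upset = []
    for index, (current, golden) in enumerate(zip(readback_frames, golden_frames)):
        if current != golden:
            readback_frames[index] = golden  # restore the frame from the golden copy
            upset.append(index)
    return upset
```

Every pass touches the entire configuration memory and the external store, which is why this kind of scrubbing costs time, bandwidth, and power.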
Another approach uses FPGAs that do not utilize SRAM configuration memory. For example, Microsemi's SmartFusion2 SoC FPGA uses flash distributed throughout the FPGA fabric to configure logic and routing. Flash memory is not susceptible to SEU failures caused by alpha or neutron radiation, which significantly improves the reliability of flash-based FPGAs in environments where SEU effects are a concern.
Figure 2: Microsemi's SmartFusion2 is a Flash-based FPGA that is not susceptible to alpha/neutron glitches, whereas SRAM-based FPGAs are. (Source: Microsemi)
SmartFusion2 SoC FPGAs also contain large blocks of SRAM, which are used for data storage or as large FIFOs for complex peripheral processing. Error detection and correction is applied to these SRAMs as well, to mitigate SEU effects in critical memory blocks. Small memories in simple peripherals use latches instead of SRAM, making them less susceptible to SEUs. The off-chip DDR memory controller also supports single error correction and double error detection, which mitigates SEU effects in large external memories.
System interconnection in harsh environments
Once the MCU or FPGA-based design has been optimized for harsh environments, it still needs to be connected to other subsystems. For the interconnect system, temperature, pressure, and radiation must be considered, and vibration and electromagnetic noise are just as important. This calls for special rugged connectors that can carry signals and power at the same time. Phoenix Contact's Heavycon EVO-D series uses a special polyamide plastic material that provides high reliability in high-vibration environments.
Figure 3: Amphenol’s 10G Ethernet fiber-to-copper connectors and media converters. (Source: Amphenol)
Electromechanical interfaces in harsh environments
At some point, a system needs to control a servo or motor. The sensor used to determine position is perhaps the most important component of an electromechanical control loop, so high reliability in harsh environments is critical. Design engineers may prefer to avoid position sensors that rely on mechanical contacts, because the contacts can spark or wear excessively, which shortens sensor life. Hall effect sensors are a contactless technology that uses magnetic effects to determine rotary position. For example, the Vishay 34 Servo Hall sensor has a life of 50 million cycles and a linearity of 0.5%. Its widely used SPI output interface makes it easy to connect to an MCU or FPGA.
Figure 4: Vishay Hall-effect sensors for contactless electromechanical sensing. (Source: Vishay)
Conclusion
Space is a harsh environment for humans and electronic systems alike. When designing systems for use in space or in the atmosphere, it is critical to mitigate the effects of SEUs caused by high-energy particles, which can be achieved by selecting space-grade components and using redundancy techniques. Extreme temperatures, pressures, vibration, and electromagnetic radiation also characterize harsh environments, whether in space, on Earth, or underground. Electronic components and interconnect systems must be designed to withstand these conditions in order to avoid system failures. You definitely don't want to be called in to repair them.