All SoCs today use scan structures to detect manufacturing defects in the design. Scan chains are built for test and connect the sequential elements of the chip serially. Because there is little or no combinational logic between scan elements, these chains are prone to hold failures. In addition, in technologies below 90 nm, OCV (on-chip variation) has a large impact on timing margin. Therefore, unless the design achieves timing sign-off at multiple corners, there is a high probability of hold failures, especially on hold-critical paths such as scan chains. These hold failures make the chip unusable in practice, even if it is fully operational in functional mode. If they occur on silicon, they reduce yield and production volume, resulting in large financial losses for the design company. A robust scan structure is therefore needed to address these issues.
In this article, we first briefly review the basic concepts of latch and flip-flop timing. We then introduce scan chains and the timing closure issues associated with them, and explain how to combine latches and flip-flops in scan chains to create a robust scan structure that avoids timing failures in technologies below 90 nm. Finally, we present the most suitable solution for each possible combination of sequential elements in the scan chain.
Setup/Hold Timing Overview
Flip-flops and latches are the two basic building blocks of sequential circuits. A flip-flop changes state on the active edge (positive or negative) of the applied clock and simply holds its output in the absence of an active clock edge. A latch, on the other hand, is a level-sensitive device: while its enable signal is at the active level (high or low), it continuously samples its input and its output follows. A flip-flop is built as a master-slave pair of latches in cascade, transparent on opposite clock levels. The area of a flip-flop is therefore almost twice that of a latch.
To implement a synchronous design, we need to ensure that the output of a flip-flop or latch never becomes metastable. This is ensured by meeting the setup and hold checks in the design.
Figure 1
For a pair of flip-flops clocked on the same edge, 1-1 is the hold check and 1-3 is the setup check for single-cycle operation (Figure 1). We need to ensure that the data launched by flip-flop 1 is captured by flip-flop 2 before the next active edge. At the same time, we need to ensure that the data launched by flip-flop 1 is not captured by flip-flop 2 on the same active edge.
Figure 2
When the second flip-flop is triggered by the negative edge, the setup check becomes 1-2 and the hold check is made against the preceding negative edge (see Figure 2). This means that the data launched by flip-flop 1 must not be captured by flip-flop 2 on the previous falling edge. Unless the clock skew exceeds half a cycle, this check cannot fail in practice.
So, in a positive-positive or negative-negative flip-flop pair, the setup check defaults to one cycle and the hold check to zero cycles, whereas in a positive-negative or negative-positive flip-flop pair, the setup check defaults to half a cycle and the hold check to half a cycle back. Timing checks for latches are covered later, together with the lockup latch waveform (Figure 4).
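The default checks summarized above can be written as a small lookup. This is only an illustrative sketch: the function name `default_checks` and the "pos"/"neg" labels are our own, and the returned values are cycle offsets measured from the launch edge (a negative hold value means the check is made against an earlier capture edge).

```python
def default_checks(launch_edge: str, capture_edge: str) -> tuple[float, float]:
    """Return (setup_cycles, hold_cycles) relative to the launch edge.

    Edges are labelled "pos" or "neg" (hypothetical labels for this sketch).
    """
    valid = ("pos", "neg")
    if launch_edge not in valid or capture_edge not in valid:
        raise ValueError("edges must be 'pos' or 'neg'")
    if launch_edge == capture_edge:
        # Same-edge pair: one-cycle setup check, zero-cycle hold check.
        return (1.0, 0.0)
    # Opposite-edge pair: half-cycle setup check, hold check half a cycle back.
    return (0.5, -0.5)


print(default_checks("pos", "pos"))  # (1.0, 0.0)
print(default_checks("pos", "neg"))  # (0.5, -0.5)
```

The half-cycle-back hold check of the opposite-edge pair is what makes it tolerant of skew, and the lockup latch discussed later exploits the same property.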
Scan Chains
Scan chains are used to test SoCs. All registers in the design are connected serially, stimulus is applied from outside the chip, and the outputs of the chains are read back to check for stuck-at and transition faults. Today's SoCs are very complex and have multiple clock domains on a single chip. Scan stitching is performed after logic synthesis, and care is generally taken to stitch flip-flops with the same clock structure into the same scan chain. However, since the number of top-level ports available for scan is limited, mixing registers from different clock domains is unavoidable. Leaving chains with unbalanced lengths is not a good alternative either, as it increases overall test time. This structure can therefore lead to timing closure issues in later design stages. Since scan shifting is done at a low frequency and there is minimal, if any, logic between flip-flop pairs, setup closure is not an issue. However, these paths are hold-critical precisely because of the minimal logic and the skew between the flip-flop pairs. Because flip-flops from different domains are mixed in the scan chain, large skews occur between the launch and capture flip-flops in many cases. In the later stages of design, many hold violations can appear due to noise, forcing hold buffers to be added to a design that is otherwise stable or already closed, which disrupts it.
In the worst case, the derating margin may not be sufficient and hold failures may only be found on silicon. This can happen if the uncommon clock path is very long and the actual skew on silicon is higher than the expected skew. As we move to CMOS technologies below 90 nm, the effect of this variation becomes more and more dominant and can result in many hold violations on silicon. Hold failures in scan shift paths have severe consequences. Extensive debug is required, and it takes a lot of time to identify the faulty chain on silicon. The situation becomes worse when scan compression logic is also present. Even once the faulty chain is identified, it has to be masked, which reduces test coverage.
In summary, the risk of hold failures in the scan chain is high and a sufficiently robust design must be implemented to handle these uncertainties.
There are several solutions, such as reordering the scan chains according to the physical location of the registers. These techniques are readily available and worth exploring, but as discussed earlier, scan chains that cross between clock domains cannot always be avoided.
A more effective way to address this problem is to deal with it up front, when the scan chain is built during logic synthesis. All flip-flops driven from the same clock gating logic should be stitched together, and a lockup latch can be inserted at the end of each such group of flip-flops to avoid hold failures from the last flip-flop in one domain to the first flip-flop in the next clock domain.
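The stitching rule above can be sketched in a few lines. This is a toy model, not the behavior of any particular DFT tool: the function `stitch_with_lockups`, the `(name, clock_domain)` tuples, and the `lockup_latch[...]` marker strings are all assumptions made for illustration.

```python
from itertools import groupby


def stitch_with_lockups(flops):
    """Group flops by clock domain and mark a lockup latch at each boundary.

    flops: list of (name, clock_domain) tuples (hypothetical representation).
    Returns the chain order as a list of names; each lockup latch marker is
    tagged with the launch-side domain, whose clock it would share.
    """
    # sorted() is stable, so flops keep their relative order within a domain.
    ordered = sorted(flops, key=lambda f: f[1])
    chain = []
    prev_domain = None
    for domain, group in groupby(ordered, key=lambda f: f[1]):
        if chain:
            # Domain crossing: place the lockup latch after the last flop
            # of the previous (launching) domain.
            chain.append(f"lockup_latch[{prev_domain}]")
        chain.extend(name for name, _ in group)
        prev_domain = domain
    return chain


print(stitch_with_lockups([("f1", "clkA"), ("f2", "clkB"), ("f3", "clkA")]))
# ['f1', 'f3', 'lockup_latch[clkA]', 'f2']
```

A real flow would also balance chain lengths and take placement into account; the sketch only shows where the lockup latch falls relative to the domain boundary.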
The example shown in Figure 3 will help us understand this concept.
Figure 3
If the clock period is 50 ns and the skew is 5 ns, we must insert hold buffers, including derating margin, worth more than 5 ns of delay between flip-flop 3 and flip-flop 4 at a later stage of the design. As discussed earlier, because of OCV in designs below 90 nm, the standard derating may become insufficient once the uncommon clock path exceeds a certain length. For example, for a capture path with 10 additional clock buffers, each contributing only 5 ps of skew over and above the derating value, the result is a 50 ps deviation. With OCV added on top, the skew may exceed 5 ns, and the margin may no longer be sufficient.
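The arithmetic in this example can be checked with a short sketch. The function names and the 5020 ps hold buffer delay are hypothetical values chosen for illustration; only the 5 ns skew, the 10 buffers, and the 5 ps per buffer come from the example above.

```python
def uncommon_path_skew_ps(extra_buffers: int, skew_per_buffer_ps: int) -> int:
    """Extra skew accumulated over the uncommon part of the capture clock path."""
    return extra_buffers * skew_per_buffer_ps


def hold_buffer_is_sufficient(buffer_delay_ps: int,
                              nominal_skew_ps: int,
                              extra_skew_ps: int) -> bool:
    """The inserted hold delay must exceed the total skew seen on silicon."""
    return buffer_delay_ps > nominal_skew_ps + extra_skew_ps


extra = uncommon_path_skew_ps(extra_buffers=10, skew_per_buffer_ps=5)
print(extra)  # 50

# Hypothetical hold buffering sized for 5 ns of skew plus 20 ps of margin.
print(hold_buffer_is_sufficient(5020, 5000, 0))      # True
print(hold_buffer_is_sufficient(5020, 5000, extra))  # False
```

In this toy case, a fix that passes with the expected skew fails once the unmodeled 50 ps is added, which is the situation described above.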
The solution to this problem is to insert a lockup latch at the output of flip-flop 3, with the lockup latch having the same clock latency as flip-flop 3.
Figure annotations: lockup latch; clock gating; zero-cycle hold check, easy to meet; shifting of data from flop 3 to flop 4 is still done in one shift cycle; hold check is now half a cycle back, much relaxed.
Figure 4
As shown in the waveform above (Figure 4), when we insert a lockup latch between flip-flop 3 and flip-flop 4, the timing path is divided into two stages.
1. From flip-flop 3 to the lockup latch
The hold check is 1-1. It is still a zero-cycle check, but it is very easy to meet because there is no skew between the two elements. The default setup check is 1-2.
2. From the lockup latch to flip-flop 4
The hold check is 2-1. This is the main advantage of, and motivation for, inserting a lockup latch. The hold check is moved back by half a cycle, so we have enough margin even if the clock skew is as large as half the shift clock cycle. This ensures that there will be no hold violation in this case.
The setup check is 2-3. The latch is transparent from edge 2 to edge 3, so any data arriving in this phase passes through to flip-flop 4, provided it arrives before edge 3 minus the setup time of the flip-flop. The setup check from flip-flop 3 to the lockup latch is also easy to meet: 1-2 is the default check, but because the latch is transparent for the entire half cycle, the setup check can ideally be moved to edge 3. (This concept is called time borrowing.)
Another important point is that the lockup latch should be clocked by the launch flip-flop clock, not the capture flip-flop clock. As we saw above, the hold check from flip-flop 3 to the latch is still 1-1 (a zero-cycle check). If the lockup latch were clocked by the capture flip-flop clock, we would gain nothing. The ideal solution is therefore to drive both the launch flip-flop and the lockup latch from the same clock buffer in the clock tree.
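To make the half-cycle benefit concrete, the sketch below compares the hold slack of the direct flop-to-flop path with the latch-to-flop stage, using the 50 ns period and 5 ns skew from the earlier example. The model is deliberately simplified, and the clock-to-Q delay (100 ps), latch delay (100 ps), and hold time (50 ps) are assumed values, not figures from any library.

```python
def hold_slack_direct_ps(clk_to_q_ps: int, hold_ps: int, skew_ps: int) -> int:
    """Zero-cycle hold check from the launch flop straight to the capture flop.

    skew_ps is how much later the capture clock arrives than the launch clock.
    """
    return clk_to_q_ps - hold_ps - skew_ps


def hold_slack_with_lockup_ps(period_ps: int, latch_delay_ps: int,
                              hold_ps: int, skew_ps: int) -> int:
    """Hold check from the lockup latch to the capture flop.

    The latch shares the launch clock and only opens half a cycle after the
    launch edge, so new data reaches the capture flop half a cycle later.
    """
    return period_ps // 2 + latch_delay_ps - hold_ps - skew_ps


PERIOD_PS = 50_000  # 50 ns shift clock, as in the example
SKEW_PS = 5_000     # 5 ns skew, as in the example

print(hold_slack_direct_ps(100, 50, SKEW_PS))                  # -4950
print(hold_slack_with_lockup_ps(PERIOD_PS, 100, 50, SKEW_PS))  # 20050
```

Under these assumptions the direct path violates hold by almost 5 ns, while the latch-to-flop stage has about 20 ns of slack and stays positive for any skew up to roughly half the shift period.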
The example above shows that lockup latches can effectively fix hold violations in the scan shift path. One might ask whether the same violations could be fixed by inserting hold buffers or delay cells. Comparing the area of hold buffers, delay cells, and latches shows that hold buffers are suitable for fixing small hold violations, but for larger violations latches have an advantage over buffers in both area and delay. With delay cells, there is always a significant risk of delay variation across operating conditions, so these cells should be used selectively and with care. A latch, on the other hand, always provides a half-cycle delay under any operating condition.
In the last section, we consider the various cases and identify the most suitable way to fix hold failures when there is a large clock skew between the launch and capture flip-flops of the scan chain.
Different Scenarios
Case 1: Between positive-positive edge triggered flip-flops
This case is covered by the example above; a negative-level latch can be used.
Case 2: Between negative-negative edge triggered flip-flops
By the same analysis as above, a positive-level latch can be used.
Case 3: Between negative-positive edge triggered flip-flops
As we have seen, hold is very easy to meet here because the hold check is already half a cycle back. No lockup element is needed.
Case 4: Between positive-negative edge triggered flip-flops
This is a very interesting case. From a timing point of view it does not cause a problem, but it is an illegal connection in scan shift. In ATPG the clock is treated as a return-to-zero waveform (the clock returns low after the shift completes). If we allow this crossing, then after scan shift every such positive-negative pair will hold the same value after the clock pulse. This reduces test coverage, because not all flip-flops are independently controllable. This situation should be avoided during stitching, but sometimes it cannot be avoided because of compression logic or hard macros.
We can insert a positive-level lockup latch between the positive-edge and negative-edge flip-flops. This solves the ATPG problem, but it introduces timing problems, because the hold check becomes a zero-cycle check again, both from the flip-flop to the lockup latch and from the latch to the negative-edge flip-flop.
Another solution is to insert a dummy flip-flop, triggered on either the positive or the negative edge, between these flip-flops. Note that after the shift the dummy flip-flop will still hold the same value as the first or the second flip-flop, depending on whether it is triggered on the positive or the negative edge. This does not cause any problem, since it is not a functional flip-flop and is never used to capture data. If we insert a positive-edge dummy flip-flop, its clock latency should match that of the launch flip-flop, because that path is a zero-cycle hold check, while the path from the dummy flip-flop to the next flip-flop is a half-cycle hold check. Similarly, if we insert a negative-edge dummy flip-flop, its clock latency should match that of the capture flip-flop.
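The four cases can be collected into a single lookup, which may be handy as a checklist when reviewing chain crossings. The table below restates the recommendations of this section only; the function name, the "pos"/"neg" labels, and the wording of the returned strings are our own.

```python
# Keys are (launch_edge, capture_edge); values restate Cases 1 to 4.
LOCKUP_ELEMENT = {
    ("pos", "pos"): "negative-level lockup latch on the launch clock",
    ("neg", "neg"): "positive-level lockup latch on the launch clock",
    ("neg", "pos"): "none (hold check is already half a cycle back)",
    ("pos", "neg"): "dummy flip-flop (illegal crossing for ATPG otherwise)",
}


def lockup_element(launch_edge: str, capture_edge: str) -> str:
    """Return the suggested fix for a launch/capture edge pair."""
    return LOCKUP_ELEMENT[(launch_edge, capture_edge)]


for pair in LOCKUP_ELEMENT:
    print(pair, "->", lockup_element(*pair))
```

An unknown edge label raises `KeyError`, which is acceptable for a checklist-style helper like this one.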
These are the four flip-flop combinations that can exist in a design, but sometimes they are not immediately obvious. For example, special attention is needed when scan-stitching a design that contains pre-stitched hard macros. Often we do not have the netlist, SPEF, or timing constraints for the hard macros, so we recommend inserting lockup latches before these hard macros in case their owners have not done so. Another example is burn-in mode, where the scan chains in the design are concatenated so that all flip-flops toggle at the same time. Here the last element of one chain and the first element of the next may form a hold-critical path or an illegal positive-negative crossing. Ideally this case should be handled in the RTL itself, because the designer has a better understanding of the order of the scan elements when connecting these chains together. If it has not been handled there, the best practice is to insert the appropriate lockup latch at the end of each chain.
By following the tips and guidelines above, designers can implement a robust scan structure on their chip. In the presence of setup failures, a design can still run at a lower frequency, but in the presence of any major hold failure, the behavior of the logic is unpredictable. Hold failures in scan shift are very serious and can significantly reduce test coverage. We therefore need a robust scan structure that addresses the potential scan shift failures discussed above. The appropriate latch-type element handles these issues well, because it guarantees a half-cycle delay under any operating condition.