A thorough digital circuit test can drive dynamic power consumption far beyond the device's specified level. If the power consumption is high enough, it will cause failures at wafer sort or pre-burn-in package test that take considerable time and effort to debug. The problem is especially pronounced when very large SoCs are tested under corner conditions, and it can even cause unnecessary yield loss on the production line, ultimately reducing the manufacturer's gross margin. The best way to avoid test power problems is to incorporate power-aware test techniques into the design-for-test (DFT) process. This article first examines the relationship between dynamic power and test to show why power management is more urgent now than ever, then introduces two distinct DFT techniques that use ATPG technology to automatically generate low-power manufacturing tests.
Test power
Scan ATPG algorithms are optimized to minimize pattern count, which means each pattern maximizes fault coverage. The bits in a scan pattern used to set up and propagate the targeted faults are called care bits; the remaining bits are filled with random values to fortuitously detect faults not explicitly targeted by the care bits. The care bits and random-fill bits in each scan vector cause logic-state transitions that charge and discharge the device's parasitic capacitances, raising dynamic power consumption above what the circuit draws under normal operating conditions.
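To make this concrete, the short Python sketch below (illustrative only: the chain length, care-bit positions, and values are invented for this example) builds a randomly filled scan vector and counts adjacent-bit transitions, a simple proxy for the flip-flop toggling caused as the pattern shifts through the chain.

```python
import random

def random_fill(care_bits: dict, length: int) -> list:
    """Build a scan vector: care bits are fixed by ATPG, all remaining
    bits get random values (the default fill in standard ATPG)."""
    return [care_bits.get(i, random.randint(0, 1)) for i in range(length)]

def shift_transitions(vector: list) -> int:
    """Count adjacent-bit transitions in the scan-in stream -- a simple
    proxy for flip-flop toggling while the pattern shifts through."""
    return sum(a != b for a, b in zip(vector, vector[1:]))

# Hypothetical 32-bit chain with only 3 care bits ({position: value}).
care = {4: 1, 17: 0, 29: 1}
vec = random_fill(care, 32)
print(vec, shift_transitions(vec))  # random fill toggles roughly half the pairs
```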
Two types of dynamic power consumption affect device testing: peak power and average power. Peak power, sometimes called instantaneous power, is the total power consumed over a very short interval, such as the small fraction of a clock cycle immediately following a rising or falling edge of the system clock. Peak power reflects the level of node-switching activity in the device: the more nodes that switch logic state at the same moment, the higher the peak power.
Scan testing can raise a device's peak power to as much as 20 times the level consumed by mission-mode vectors. Sufficiently large switching currents can collapse the power rails badly enough that bits shifting along the scan chains are lost, producing pattern mismatches on the tester. Usually the switching currents are not that severe, but the IR drop they cause along the power rails still introduces circuit delays; in some cases scan data fails to reach the next stage of the scan chain in time, and the test program fails. Rail collapse in shift mode can usually be resolved by lowering the scan shift frequency enough to give the scan signals time to meet shift-cycle timing under corner conditions. However, a lower shift frequency lengthens tester time, raising test cost in volume production.
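As a rough illustration of that cost tradeoff (all numbers below are hypothetical, not taken from any real device), scan time scales inversely with shift frequency:

```python
# Hypothetical pattern set: halving the shift clock doubles scan time.
patterns     = 5_000      # scan patterns in the test set
chain_length = 10_000     # flip-flops in the longest scan chain
for f_shift_mhz in (50, 25):
    shift_cycles = patterns * chain_length
    seconds = shift_cycles / (f_shift_mhz * 1e6)
    print(f"{f_shift_mhz} MHz shift -> {seconds:.1f} s of scan time per die")
```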
Even if a pattern scans in successfully, peak power during the launch/capture timing (hereafter "capture mode") can cause enough IR-drop delay that logic values fail to transition correctly within the capture window, and the device fails on that pattern. Although this problem affects both stuck-at and transition-delay testing, it is more common in delay-sensitive at-speed test patterns. IR-drop problems in capture mode and rail-droop problems in shift mode can be solved by over-designing the power-rail system to accommodate the increased switching activity of scan test. However, widening the power and ground rails increases die area, so this approach is best avoided if there is a better way to control peak test power.
Average power is the power consumption averaged over many clock cycles, such as the tens of thousands of cycles required to scan a stimulus vector into a design while simultaneously scanning out the previous vector's response. Scan test can raise a device's average power to two to five times that of mission-mode vectors. Excessive average test power causes thermal problems such as "hot spots" on the die that can damage the device. Because average power is directly proportional to frequency, it can be controlled during scan shift by selecting a sufficiently low shift frequency; as noted above, however, reducing the shift frequency also raises test cost.
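That proportionality is simply the textbook dynamic-power relation (a general CMOS approximation, not specific to any tool or device):

$$P_{avg} \approx \alpha \, C_{eff} \, V_{DD}^{2} \, f$$

where $\alpha$ is the switching-activity factor, $C_{eff}$ is the effective switched capacitance, $V_{DD}$ is the supply voltage, and $f$ is the clock frequency. Halving the shift frequency halves average power, at the cost of doubled shift time.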
Average test power is relatively easy to manage on the tester, so most power-related test problems today stem from excessive peak power. Methods that reduce both peak and average power during test have become a research focus for the semiconductor and design-automation industries.
Figure 1: Flip-flop switching activity
The importance of power management
Power management during test is becoming increasingly important because the latest manufacturing processes yield designs containing hundreds of thousands or even millions of scan flip-flops. Most of these flip-flops switch simultaneously during scan test, which raises peak power and dramatically worsens the IR-drop delay described above.
Additionally, yields have fallen as defect densities have risen at 65nm and below. To compensate for the yield shortfall and maintain acceptable quality levels, manufacturers are turning to higher-resolution at-speed testing to detect tiny delay defects. Enhanced-timing-resolution testing using small-delay-defect ATPG has proven effective at catching nanometer-scale defects that standard transition-delay testing cannot detect. However, this technique demands tighter control of the incidental delays caused by peak test currents than standard at-speed test methods do.
In summary, as more nanometer-scale defects emerge, large SoCs must rely on advanced at-speed ATPG technology to maintain high test quality, and this trend is driving the adoption of power-aware test techniques in the DFT process.
Representing the power budget
Detailed case studies of the IR-drop behavior of manufactured devices show that flip-flop switching activity correlates strongly with overall node switching activity, and dynamic power consumption reflects that node activity. An effective way to avoid test-induced power failures, therefore, is to sufficiently reduce flip-flop switching activity during scan testing. The goal of power-reduction techniques is to lower flip-flop switching activity enough that a good device passes all scan ATPG tests under corner conditions. Note that we do not need to minimize switching activity, only reduce it to a level comparable to the switching rates observed when mission-mode vectors are applied.
For illustration, suppose a large number of mission-mode vectors are applied to a design and the peak flip-flop switching activity is found to be 26% of the total flip-flop count. If we then generate scan ATPG vectors and plot how many vectors exhibit each switching rate, we might observe something like the gray distribution in Figure 1. Because the peak and average switching rates exceed 26%, scan testing introduces IR-drop delay beyond that of normal device operation.
However, if we apply power-reduction techniques during test, we can effectively shift this distribution to the left. In the blue low-power distribution overlaid in Figure 1, the peak switching activity of the scan ATPG vectors no longer exceeds the power budget, reducing the risk of power problems during manufacturing test.
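The budget check sketched in Figure 1 can be expressed in a few lines of Python (a conceptual sketch only; the 26% budget comes from the example above, and the state lists are placeholders):

```python
def switching_rate(prev_state: list, next_state: list) -> float:
    """Fraction of flip-flops toggling between two successive states."""
    toggles = sum(a != b for a, b in zip(prev_state, next_state))
    return toggles / len(prev_state)

def over_budget(rates: list, budget: float = 0.26) -> list:
    """Indices of patterns whose flip-flop switching rate exceeds the
    mission-mode-derived budget (26% in the example above)."""
    return [i for i, r in enumerate(rates) if r > budget]
```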
The following sections describe two methods for achieving such a low-power vector distribution; they differ fundamentally in how the power budget is specified.
Reflecting power budgets through design partitioning
Suppose a design has a clock that drives so many flip-flops that their peak switching activity exceeds the design's overall power budget. Since we do not want the test logic to modify any clocks, we instead partition the design into N blocks, each with its own scan-enable pin and its own scan compression logic and scan chains (Figure 2). The number and composition of the blocks must be chosen carefully so that the flip-flop switching of no single block, including the one with the most flip-flops, exceeds the total power budget. In this sense, partitioning hard-wires the power budget into the design.
Pattern generation is constrained so that only one scan-enable pin is deasserted at a time, and ATPG processes one block at a time. The tool targets the faults inside the active (SE=0) block along with block-to-block faults, while faults in all other blocks are classed as "ATPG untestable." The process repeats for each block in turn; a single command resets a block's faults from "ATPG untestable" to "not detected" before patterns are generated for it.
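In pseudo-flow form, the per-block loop looks roughly like the sketch below. This is generic Python pseudocode: the `atpg` object and its methods are placeholders for illustration, not TetraMAX command syntax.

```python
def generate_low_power_patterns(blocks, atpg):
    """Conceptual per-block ATPG loop (placeholder API, not tool syntax)."""
    patterns = []
    for active in blocks:
        # Allow capture only in the active block; the other blocks'
        # scan enables are constrained so they cannot capture.
        atpg.constrain_scan_enables(capture_block=active)
        # Reset the active block's faults from "ATPG untestable" back
        # to "not detected" so they can be targeted in this pass.
        atpg.reset_fault_status(active, to="not_detected")
        # Target faults inside the block plus block-to-block faults.
        patterns += atpg.run(fault_scope=[active, "inter-block"])
    return patterns
```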
Confining all switching activity to the block under test effectively reduces peak power in capture mode. Note, however, that the only way to eliminate switching in the other blocks during capture is to ensure that their logic state does not change between the final shift cycle and the launch phase of the capture procedure. That could be achieved by scanning all 1s or all 0s into the blocks not under test, but doing so sacrifices fault coverage and requires more complex fault-list processing and additional pattern generation to compensate. Even though only one block is tested at a time, we would like to load care bits into all blocks simultaneously so that block-to-block faults can be targeted.
The solution to this dilemma is the low-power fill feature of Synopsys' TetraMAX ATPG tool. TetraMAX typically needs fewer than 10% of the bits in a scan vector to set up and propagate fault effects, so instead of filling the remaining bits randomly, it repeats the value of each care bit into the following bits of the scan chain until it reaches the next care bit of opposite value (Figure 3).
Repeating the care-bit values reduces logic-state changes in the stimulus by more than 90%. In the blocks not under test, the reduction approaches 99% (only a few care bits are needed to target block-to-block faults), which is enough to ensure almost no logic-state transitions between the last shift of the input vector and the launch cycle that follows.
Figure 2: Partitioning the design into N blocks to specify the power budget.
Figure 3: TetraMAX ATPG tool’s “Low Power Fill”
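The fill rule described above is simple enough to sketch in a few lines of Python. This is an illustrative reimplementation of the behavior just described, not TetraMAX's actual code; the care bits reuse the hypothetical example from earlier, and bits before the first care bit simply take a default value.

```python
def low_power_fill(care_bits: dict, length: int, default: int = 0) -> list:
    """Repeat-fill: extend each care-bit value into the following bits
    until the next care bit, so non-care bits add no transitions."""
    vector, current = [], default
    for i in range(length):
        if i in care_bits:
            current = care_bits[i]   # care bits are never altered
        vector.append(current)       # non-care bits copy the last care bit
    return vector

# Same hypothetical care bits as before: only 3 adjacent-bit transitions
# remain in the 32-bit vector, versus roughly 16 with random fill.
print(low_power_fill({4: 1, 17: 0, 29: 1}, 32))
```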
Low-power fill patterns still detect some additional faults fortuitously, but fewer than standard ATPG patterns do, because the pseudo-random bits are eliminated from each low-power stimulus. Consequently, low-power fill ATPG generally needs more patterns than standard ATPG to reach the same fault coverage. Even so, the technique described in this section responds well to compression, as Figure 4 shows: as more compression is applied, the test-cycle count stays only slightly above the base case (no per-block scan-enable control, no low-power fill). The figure also plots the peak capture-mode switching activity of the full pattern set against the compression ratio; the reduction in peak switching activity is nearly independent of compression ratio.
Figure 4: Test-cycle count and peak switching activity versus compression ratio.
Low-power fill ATPG also reduces average power during scan shift, saving tester time and cost. In general, repeating the care-bit values reduces logic-state transitions by more than 90% in the stimulus vector and by 10-50% in the response vector. Because stimulus and response are shifted simultaneously, the net average reduction in flip-flop switching is about 50%. The technique described in this section achieves even greater reductions because the blocks not under test contain very few care bits.
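The roughly 50% net figure is simple averaging: stimulus shift-in and response shift-out toggle flip-flops concurrently, so (using the article's numbers, with the response at the low end of its range) the two reductions carry equal weight:

```python
stimulus_reduction = 0.90   # >90% fewer transitions in the stimulus
response_reduction = 0.10   # 10-50% fewer in the response (low end)
# Stimulus and response shift simultaneously, so weight them equally.
net = (stimulus_reduction + response_reduction) / 2
print(f"net flip-flop switching reduction: about {net:.0%}")  # about 50%
```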
Once the low-power fill feature is understood, it is easy to see why each block has its own compression circuitry. If compression were "flat" (a single decompressor/compressor wrapped around all the blocks rather than embedded inside each one), the decompressor outputs would feed the scan chains of every block, and the care bits intended for the block under test would be shifted into all the other blocks as well, causing a large number of logic-state transitions there. Embedding the compression circuitry inside each block instead confines each block's scan-chain inputs and outputs, creating a boundary that care bits cannot cross during shift operations. Embedding the compression logic in the design's physical blocks has the further benefit of reducing routing congestion, which ultimately lowers the area overhead of compression.
Reflecting power budgets by clock domain
While embedding compression within physical blocks helps reduce routing congestion, the technique described in this section does not require partitioning the design to reflect a power budget. Instead, the flip-flop switching budget is specified as an ATPG constraint using unique capabilities in TetraMAX.
This approach assumes the design has enough clocks that no single clock controls enough circuitry to exceed the power budget. The tool attempts to meet the power constraint by pulsing only certain clocks in capture mode. The remaining clocks are held inactive during capture and retain their end-of-shift states, so their regions (both logic and clock networks) exhibit no switching activity, and the benefit of low-power fill is limited to reducing average power during scan shift. Note that ATPG must have full control of all clocks, whether external clocks or PLL-generated clocks managed by one or more on-chip clock controllers.
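Conceptually, the clock selection amounts to choosing, for each capture, a subset of clock domains whose combined flip-flop count stays within the switching budget. The greedy sketch below illustrates that idea only (the domain sizes and budget are hypothetical, and this is not TetraMAX's actual algorithm):

```python
def group_clocks(flops_per_clock: dict, budget_flops: int) -> list:
    """Greedily pack clock domains into capture groups so that each
    group's total flip-flop count stays within the switching budget."""
    groups, current, count = [], [], 0
    for clk, n in sorted(flops_per_clock.items(), key=lambda kv: -kv[1]):
        if count + n > budget_flops and current:
            groups.append(current)        # close the full group
            current, count = [], 0
        current.append(clk)
        count += n
    if current:
        groups.append(current)
    return groups

# Hypothetical seven-domain design (cf. Figure 5) with a 100k-flop budget.
domains = {"clk1": 60_000, "clk2": 45_000, "clk3": 30_000, "clk4": 25_000,
           "clk5": 20_000, "clk6": 12_000, "clk7": 8_000}
print(group_clocks(domains, budget_flops=100_000))
# -> [['clk1'], ['clk2', 'clk3', 'clk4'], ['clk5', 'clk6', 'clk7']]
```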
The design shown in Figure 5 has seven clock domains under ATPG control. Note that the physical-block partitioning used for compression need not align with the clock domains to achieve low-power operation during test. All flip-flops in the design share the same scan enable, allowing ATPG to target all faults at once, including inter-domain faults. This simple, highly automated flow produces a compact, low-power pattern set.
Figure 5: Design with 7 clock domains.
Conclusion
This article has described how dynamic power drawn during manufacturing test can degrade the behavior of the device under test. Excessive peak power during test adds delay and leads to unpredictable test results, while excessive average power causes thermal problems that can damage the device. Both issues raise manufacturers' costs if not handled correctly, and large SoCs built on the most advanced processes are especially susceptible to them.
These designs not only contain enormous numbers of flip-flops but also require at-speed testing with finer timing resolution to detect small delay defects. To address these issues, designers are combining advances in test automation with DFT methods to create low-power manufacturing tests. This article has highlighted two innovative techniques that reduce switching activity to levels comparable to mission-mode operation; the main difference between them is how designers incorporate the power budget into the DFT process.