Brief Introduction to Failure Analysis - Failure Analysis Process
Failure analysis accompanies the entire chip industry chain. A problem in any link of this complex chain can lead to chip failure, and chips face failure risks all the way from process to application. The author regularly takes part in failure analysis, and this issue gives a systematic overview of it. The author's ability is limited, and failure analysis is complicated and tedious, so I can only try my best to summarize some of the knowledge involved; there will certainly be deficiencies and omissions.
1. Definition of failure:
There are many reasons for failure, and its manifestations are also complicated. Before conducting failure analysis, we need to determine what counts as a failure.
1. Abnormal performance:
This situation is quite common: the chip functions normally, but some performance figures are not up to standard. In this case, we should start from the design side and locate the cause by combining test data with the design specifications. Whether the cause is a layout problem, a circuit design problem, or something else, it has to be tracked down very carefully during failure analysis.
2. Abnormal function:
Some functions of the chip malfunction, or the chip cannot even start. There are three main reasons for this:
2.1 Circuit design:
The chip fully meets the circuit design requirements, but the circuit design itself is defective and causes the abnormal function. In this case, many failure analysis methods are powerless: because the problem lies on the design side, all chips show the same problem, so there is no reference sample. Without a baseline, it is impossible to tell "good" from "bad". The only way is to cut or connect each module with FIB and measure the electrical performance of each module with a nano-probe, slowly inferring the problem. This involves a huge workload.
2.2 Chip reliability design:
There is no problem with the circuit design itself, but the chip lacks adequate physical reliability design, which leaves reliability defects in the chip. The chip is then damaged by external stresses such as EOS, ESD, EMI, stress, and temperature, and malfunctions as a result. The strength of the reliability design also directly determines the yield of the chip in mass production.
2.3 Process:
This situation is more obvious in advanced processes, where problems in the fab process lead to chip failure. Currently, the final yield of many chips on advanced processes reaches only 50% to 60%, largely because of process problems (the author's analysis mainly concerns mature, stabilized processes).
The most notable feature of circuit function failure is universality: the same problem occurs across the whole batch, and no chip is immune. Most failures caused by physical reliability show a certain degree of randomness: either the degree of failure varies, or certain trigger conditions are required. A physical reliability defect may also show up only as abnormal performance, so failure analysis should still be based on actual conditions, with each problem analyzed on its own terms. (The author has also seen very serious physical reliability defects that caused all chips in the same batch to fail.)
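The universality-versus-randomness distinction can be turned into a rough first-pass triage rule. The following is a minimal sketch, assuming that pass/fail counts for one batch are available; the function name `triage_batch` and the returned hint strings are hypothetical, and the rule is only a starting point, since a severe reliability defect can also fail an entire batch.

```python
def triage_batch(failed: int, total: int) -> str:
    """Return a first-pass hint about where to look, based on how
    widespread a failure is within one batch (illustrative only)."""
    if total <= 0 or not 0 <= failed <= total:
        raise ValueError("need total > 0 and 0 <= failed <= total")
    if failed == 0:
        return "no failure observed"
    if failed == total:
        # Universal failure: every chip in the batch shows the problem.
        return "suspect circuit design first (severe reliability defect also possible)"
    # Partial failure: some randomness or a trigger condition is involved.
    return "suspect physical reliability or process; look for trigger conditions"


print(triage_batch(10, 10))
print(triage_batch(3, 10))
```

The hint only narrows the search; the actual cause still has to be confirmed by the analysis steps described later.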
2. Failure level:
The failure of semiconductor devices can be divided into three levels according to the stage of failure:
Chip (bare die) level: this is currently the level at which failure is most likely, because the complexity of chip technology and design, process deviation, inadequate design and other factors can cause chips to fail during manufacturing, transportation and other stages.
Packaging level: Bonding failure, excessive wire bonding, adhesion failure, excessive voids and other factors during the packaging process will cause failures at the packaging level. As packaging technology becomes more advanced, the risk of failure in the packaging process is also increasing.
Application level: failures caused by downstream customers at the chip application end. Failures such as unreasonable PCB design and application scenarios beyond the specified limits are not included.
3. Failure analysis process:
Record the failure symptoms: record the "symptoms" of the chip failure, such as short circuit, open circuit, leakage, abnormal performance, abnormal function, or intermittent behavior. Many failure causes have similar "symptoms". Leakage, for example, can be caused by physical damage, by inadequate isolation, or by latch-up.
Locate the failure level: determine whether the failure occurs at the application end after delivery, after packaging, or because of a problem with the bare die itself. The three levels call for different failure analysis approaches.
Failure trigger conditions: failure in normal functional testing; failure in high and low temperature testing; failure in ATE; ESD failure; failure caused by packaging. (If the test engineer strictly follows electrostatic protection requirements when testing and when designing the test board, the risk of the chip being exposed to ESD/EOS during testing is very low.)
Failure probability statistics: determine whether a single chip fails randomly, multiple chips fail at a certain ratio, or all chips fail.
Reproduce the failure conditions: reproduce or trace the failure and confirm the conditions under which the chip failed, which helps to infer the cause of the failure.
Determine the failure type: make an inference about the failure type in order to set the direction of the failure analysis. If the failure type cannot be inferred from the failure result, it can only be inferred from subsequent test results.
Plan the experiment: once you have a rough inference about the cause of failure, you need experiments to find supporting data. The clearer the inference, the easier it is for the experiment to locate the failure.
Summarize the improvement measures: after drawing conclusions on the causes of failure, corresponding improvement measures need to be formulated and recorded. Every failure conclusion is a lesson paid for with money, and it is also a process that product companies must go through.
Figure 1. Schematic diagram of the failure analysis process.
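The information gathered in the first steps of the process can be kept in a small structured record so that later experiments refer to the same facts. This is only a sketch: the class `FailureRecord`, its field names, and the `CANDIDATE_CAUSES` table are hypothetical, and the table contains only the leakage example given in the text.

```python
from dataclasses import dataclass, field

# Candidate causes per symptom; only the leakage example from the text is filled in.
CANDIDATE_CAUSES = {
    "leakage": ["physical damage", "inadequate isolation", "latch-up"],
}


@dataclass
class FailureRecord:
    symptom: str   # e.g. "short circuit", "open circuit", "leakage"
    level: str     # "die", "package", or "application"
    trigger: str   # e.g. "functional test", "high/low temperature test", "ATE", "ESD"
    failed: int    # number of failing chips
    total: int     # number of chips examined
    reproduced: bool = False
    notes: list = field(default_factory=list)

    def __post_init__(self):
        if self.total <= 0 or not 0 <= self.failed <= self.total:
            raise ValueError("need total > 0 and 0 <= failed <= total")

    def distribution(self) -> str:
        """Describe the failure statistics: single chip, a ratio, or all chips."""
        if self.failed == self.total:
            return "all chips fail"
        if self.failed == 1:
            return "single chip fails"
        return f"{self.failed / self.total:.0%} of chips fail"

    def candidate_causes(self) -> list:
        """Look up recorded causes for the symptom; empty list if none are known."""
        return CANDIDATE_CAUSES.get(self.symptom, [])


record = FailureRecord("leakage", "die", "ATE", failed=3, total=20)
print(record.distribution(), record.candidate_causes())
```

Keeping the symptom separate from the inferred cause reflects the point made above: one symptom can map to several causes, and the experiment plan has to discriminate between them.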
4. Failure analysis methods
At present, there are many professional teams in China doing failure analysis and testing. They have not only professional instruments and equipment but also specialists who provide technical support for failure analysis. However, the author believes that IC design companies must still have a certain failure analysis capability of their own, because the design and circuit specifications of every module of the chip are set by the design company, and the layout/back-end is also done by the design company. The design company therefore understands the whole chip more clearly. Third-party companies can assist in locating failure points and provide technical support, but their familiarity with the chip is far less than that of the design company. Design companies should take the lead in failure analysis. Here is an introduction to several common failure analysis methods:
4.1. Non-destructive analysis:
4.1.1. OM (Optical Microscope):
Use a high-power microscope to visually inspect the chip or package surface. Figure 2 shows OM results: dark-field imaging can reveal surface scratches and contamination, and Nomarski imaging can reveal cracks and etch pits.
Fig. 2. OM observation results.
4.1.2. SAT/SAM (Scanning Acoustic Tomography/Scanning Acoustic Microscopy):
SAT/SAM uses the different reflection coefficients of ultrasound at the boundaries between media to image the internal structure of the package. This technique can be used to check samples for voids, cracks and delamination. The accuracy and resolution of SAM are better than those of SAT.
Fig. 3. SAT/SAM observation results.
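The reflection coefficient mentioned for SAT/SAM follows the standard normal-incidence relation R = (Z2 - Z1) / (Z2 + Z1), where Z1 and Z2 are the acoustic impedances of the two media. The sketch below evaluates it for a bonded interface and for an air gap. The impedance values are rough, illustrative assumptions (real mold compounds vary with filler content), and the function and constant names are hypothetical.

```python
def reflection_coefficient(z1: float, z2: float) -> float:
    """Pressure reflection coefficient at normal incidence for a wave
    travelling from a medium of acoustic impedance z1 into one of z2."""
    return (z2 - z1) / (z2 + z1)


# Approximate acoustic impedances in MRayl (assumed, illustrative values).
Z_MOLD = 6.5      # epoxy mold compound
Z_SILICON = 19.8  # silicon die
Z_AIR = 0.0004    # air gap, such as a void or delamination

r_bonded = reflection_coefficient(Z_MOLD, Z_SILICON)  # partial reflection, positive sign
r_delam = reflection_coefficient(Z_MOLD, Z_AIR)       # near-total reflection, negative sign

print(f"mold -> silicon: {r_bonded:+.2f}")
print(f"mold -> air:     {r_delam:+.2f}")
```

Under these assumptions, an air gap reflects almost all of the incident wave with inverted phase, which is why voids and delamination stand out so clearly in acoustic images.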
4.1.3. X-Ray / Computed Tomography (CT) X-Ray:
The chip is imaged using X-rays and CT scans to obtain a view of its internal structure.
If a more detailed internal structure is required, the sample can be imaged through a full 360 degrees, and a 3D model of the chip can then be reconstructed using image processing.
Fig. 4. 3D X-Ray results.
4.1.4. Decapsulation:
Most failures caused by packaging can be detected by the methods above, but if the die itself needs to be tested, it must be exposed by removing the package (decapping). There are currently two methods of decapsulation: 1. Chemical method: use sulfuric acid and nitric acid to etch the package open. 2. Laser method: use a laser to melt the package away.
Figure 5. Schematic diagram of Decapsulation results.
4.2. EFA (Electrical Failure Analysis):
4.2.1. Electrical Testing:
Using a probe station, a semiconductor analyzer and electrical test equipment, probes are used to sample signals from and apply stimulus to the inside of the chip, so that the electrical characteristics of a circuit module can be analyzed directly. This is the most common method of electrical failure analysis, but there are many restrictions on where the probe tip can land, and it is sometimes necessary to combine probing with FIB and metal stripping to measure the electrical performance of a specific module.
Figure 6. Electrical characteristics analysis.
4.2.2. Photo Emission Microscope (EMMI, InGaAs, OBIRCH):
Figure 7. EMMI, InGaAs, Thermal band range.
4.3. PFA (Physical Failure Analysis)
Physical failure analysis requires some physical processing of the chip, the most important being cross-sectioning (longitudinal planing) and delayering (metal stripping). Sample preparation for cross-sectioning includes cleaning, mounting, and then embedding the sample in polyester or epoxy resin.
Fig. 8. Longitudinal planing SEM image.
The de-layering process uses chemical solution/gas etching and mechanical polishing to slowly and precisely remove each layer of metal on the chip.
Figure 9. Schematic diagram of metal stripping.
4.3.1. FIB (Focused Ion Beam):
FIB is one of the failure analysis methods most commonly used by IC design companies, so I will not go into details here. See the separate article "A brief introduction to failure analysis - FIB focused ion beam processing technology".
4.3.2. SEM (Scanning Electron Microscope):
SEM is also one of the commonly used methods. Because of its high magnification, once the failure point has been located by other means, SEM can be used to observe it directly and confirm the cause of failure. SEM can observe the surface, and it can also be combined with cross-sectioning (longitudinal planing) to observe the cross section.
4.3.3. SCM (Scanning Capacitance Microscope):
A scanning capacitance microscope uses a probe to apply a signal to a semiconductor device and then measures the C-V curve to determine the doping type of the device.
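How a C-V curve indicates doping type can be illustrated with a small sketch. It assumes a high-frequency MOS-style C-V measurement with the voltage applied to the probe tip relative to the sample, in which case capacitance rises with voltage on n-type material and falls on p-type material. Real instruments may bias the sample instead, which flips the sign, so the convention here is an assumption, as are the function name and the sample data.

```python
def doping_type_from_cv(voltages, capacitances):
    """Guess the doping type from the sign of the average dC/dV slope.

    Assumes tip-referenced bias: positive slope -> n-type, negative -> p-type.
    """
    n = len(voltages)
    if n < 2 or n != len(capacitances):
        raise ValueError("need two or more matching V and C samples")
    v_mean = sum(voltages) / n
    c_mean = sum(capacitances) / n
    # Least-squares slope of C versus V.
    num = sum((v - v_mean) * (c - c_mean) for v, c in zip(voltages, capacitances))
    den = sum((v - v_mean) ** 2 for v in voltages)
    if den == 0:
        raise ValueError("voltages must not all be equal")
    slope = num / den
    if slope > 0:
        return "n-type"
    if slope < 0:
        return "p-type"
    return "undetermined"


# Made-up sample sweep: capacitance (arbitrary units) rising with tip voltage.
print(doping_type_from_cv([-2, -1, 0, 1, 2], [1.0, 1.1, 1.5, 1.9, 2.0]))
```

In practice the magnitude of dC/dV also carries information about carrier concentration, but only the sign is used in this sketch.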
Microscopic characterization methods such as AFM, TEM, EDX, XPS, and XRD are generally used when the fab conducts more in-depth failure analysis. A design house generally does not need to be involved in characterization at that depth.
Because failure analysis is highly case-specific, it is difficult to summarize a set of general rules, and the ways in which a chip can fail are countless. The author believes that failure analysis depends on accumulated experience: only after seeing and handling many cases can the patterns and common characteristics of failures be summarized. The author still has a long way to go and hopes that everyone can exchange ideas more. After all, one person's knowledge is always limited.
Original author: Tomato ESD Stack