Semiconductor Defect Management Separating The Vital Few From The Trivial Many
1. Semiconductor Defect Management:
Separating the Vital Few From the Trivial Many
Stuart L. Riley
slriley0207@gmail.com
slriley@valaddsoft.com
Member American Society for Quality
November 19, 2009 Stuart L. Riley 1
2. Copyright Statement
Original work by Stuart L. Riley: Copyright 2009
Rights reserved.
This document may be downloaded for personal use.
Users are forbidden to reproduce, republish, redistribute, or resell any
materials from this document as their original work.
All references to this document, any quotation, or figures should be made to
the author.
Questions or comments can be addressed to Stuart L. Riley, at
slriley@valaddsoft.com or slriley0207@gmail.com
3. Definition of Quality Control
Quality control (QC) is a procedure or set of procedures intended to ensure that a manufactured product
or performed service adheres to a defined set of quality criteria or meets the requirements of the
client or customer. QC is similar to, but not identical with, quality assurance (QA). QA is defined as a
procedure or set of procedures intended to ensure that a product or service under development (before
work is complete, as opposed to afterwards) meets specified requirements. QA is sometimes expressed
together with QC as a single expression, quality assurance and control (QA/QC).
In order to implement an effective QC program, an enterprise must first decide which specific
standards the product or service must meet. Then the extent of QC actions must be determined (for
example, the percentage of units to be tested from each lot). Next, real-world data must be collected
(for example, the percentage of units that fail) and the results reported to management personnel. After
this, corrective action must be decided upon and taken (for example, defective units must be repaired or
rejected and poor service repeated at no charge until the customer is satisfied). If too many unit failures
or instances of poor service occur, a plan must be devised to improve the production or service process
and then that plan must be put into action. Finally, the QC process must be ongoing to ensure that
remedial efforts, if required, have produced satisfactory results and to immediately detect recurrences or
new instances of trouble.
>>> Keep highlighted passages in mind, as you read the rest of this document. <<<
Source: http://whatis.techtarget.com/definition/0,,sid9_gci1127382,00.html
4. Introduction
• Semiconductor fabs use in-line inspections to
– Detect what are commonly called “defects”
– Use “defect” count / density charts to monitor / control the “fab quality”
– But these charts are not focused on the unit product: the individual circuit, or the “die”
• This strategy is misleading at best and catastrophic at worst
– In-line inspections detect anomalies (the “trivial many”)
– Defects (the “vital few”) are only a subset of all anomalies
– The noise of the “trivial many” anomalies can drive the chart trends
• Too much time wasted reacting to the trivial many
• Too easy to miss the vital few
5. Goal
• Apply a strategy that separates the vital few from the trivial many
• Defect component needs to be extracted from all anomalies
• Need to know potential faults – fraction of defects that are harmful
• Determine the probable effect of faults on each die (unit product)
• Need to apply a die-based, defect-limited yield strategy
6. Define the Unit Product
• Wafers
– Run in batches of up to 25 wafers per batch
– Contain die: 10s, 100s, 1000s of die per wafer
– Are carriers of the die only -- NOT the unit product
• Die
– Individual circuits sold to customers
– The unit product produced
7. Semiconductor In-Line Inspection
• Inspection – sample lots
– Sample of lots and wafers in lot
• Data Result: Anomalies – anything detected by inspection
– Inspection tool noise – false positives
– Cosmetic anomalies
• Color, grain, etc. from normal process variation
• No negative effect on yield
– Defects (the vital few)
• Abnormal and potentially harmful
• Particle or process-related
• Separate from other anomalies using classification (categorization)
• Data Result: Wafer maps – coordinate data of anomalies
8. Anomaly Counting
Inspection results: Wafer maps and anomaly density (counts).
[Figure: wafer map of random anomalies with clusters marked, alongside an anomaly density chart]
The high points on the chart could be random, affecting many die, or they could be clustered, affecting few die.
Anomaly counting cannot distinguish between high points caused by random or clustered anomalies.
No distribution or die information on chart.
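To make the limitation concrete, here is a minimal Python sketch. The die identifiers and the helper name `count_anomalies_and_die` are hypothetical, chosen only for illustration. Two wafers with the same anomaly count can affect very different numbers of die, and a count-based chart cannot tell them apart.

```python
def count_anomalies_and_die(anomaly_die_ids):
    """Return (anomaly count, number of distinct die affected)."""
    return len(anomaly_die_ids), len(set(anomaly_die_ids))


# Hypothetical wafers: each list entry is the die that one anomaly landed on.
random_wafer = [f"die{i}" for i in range(20)]        # 20 anomalies spread over 20 die
clustered_wafer = ["die0"] * 18 + ["die1", "die2"]   # 20 anomalies packed onto 3 die

print(count_anomalies_and_die(random_wafer))     # (20, 20)
print(count_anomalies_and_die(clustered_wafer))  # (20, 3)
```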
Note: All wafer maps were produced using the
“KlarfView” application, which can be found at:
http://www.valaddsoft.com/
9. Defect Counting
• Defects are a subset of all anomalies
• Requires categorization (classification) of anomalies to
separate defects from rest
• Requires selection process
– Automatic to reduce human bias
– Wafer-based random selection: no distribution / die information.
10. Random Sampling
Selected anomalies = dark spots
Wafer-based random sampling tends to
over-sample in the clustered regions. Many
randomly-distributed anomalies (on many
die) are not sampled (lighter spots).
This is ok, if the goal is to define number of
defects on the wafer.
But, it adds no information regarding the
number of die affected (distribution).
So this is not the correct sampling strategy
if we want to monitor the quality of the die.
Sampling was done using the “DBSample”
application.
11. Defect Classification
Examples of defects as seen during classification.
Some obviously impact the product, others aren’t as obvious.
So defects have a probability of affecting the die circuits.
12. Defect Count
After classification, the defect classification data can be used to extract
the number of defects from the overall population of anomalies.
Assume: Anomaly count = 1000
Classification data:
Type A: nA = 40 (assume this is a cosmetic anomaly)
Type B: nB = 10
Type C: nC = 20
Type D: nD = 30
$$\text{Defect Count} = \frac{10 + 20 + 30}{100} \times 1000 = \frac{60}{100} \times 1000 = 600$$
So it is estimated that 60% of the anomalies are defects that could
potentially harm the product. Type “A” is left out, because we assumed it
was a cosmetic anomaly.
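The arithmetic above can be sketched in Python. This is a minimal illustration rather than the author's tooling: the function name `estimate_defect_count` and the dictionary layout are assumptions made for this example.

```python
def estimate_defect_count(anomaly_count, classified, cosmetic_types):
    """Scale the classified defect fraction up to the full anomaly count.

    classified: {type name: number classified}; cosmetic_types are excluded.
    """
    total_classified = sum(classified.values())
    defects_classified = sum(
        n for kind, n in classified.items() if kind not in cosmetic_types
    )
    return anomaly_count * defects_classified / total_classified


classified = {"A": 40, "B": 10, "C": 20, "D": 30}
defect_count = estimate_defect_count(1000, classified, cosmetic_types={"A"})
print(defect_count)  # 600.0
```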
13. Defect Density Chart
The noise level is reduced and now reflects the count of defects on
the wafer, with the rest of the anomalies removed. We’re now able to
see the vital few, but we need to consider the fact that all defects
don’t cause fails.
Anomaly density (gray)
Defect density (black)
14. Fault Count
The defect count data can be refined further.
We can apply the probability of failure, $p_i$, for each of the ith defect types to their respective
counts to find the overall fault count on the wafer.
The fault count is defined as the weighted-average kill ratio
multiplied by the number of anomalies.
Fault count:
$$F = K \times A$$
Weighted-average kill ratio, for M types and N classified anomalies:
$$K = \frac{\sum_{i=1}^{M} (p_i \times n_i)}{N}$$
For individual fault types:
$$F_i = K_i \times A, \qquad K_i = \frac{p_i \times n_i}{N}$$
Type A: nA = 40, pA = 0.0, fA = 0
Type B: nB = 10, pB = 1.0, fB = 10
Type C: nC = 20, pC = 0.5, fC = 10
Type D: nD = 30, pD = 0.8, fD = 24
Weighted-average kill ratio:
$$K = \frac{0 + 10 + 10 + 24}{100} = 0.44$$
Fault count:
$$F = 0.44 \times 1000 = 440$$
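The same fault-count calculation can be written as a short Python sketch. The function names and the `(count, kill ratio)` tuple layout are assumptions for illustration; the numbers are the ones used on this slide.

```python
def weighted_kill_ratio(classified):
    """K = sum(p_i * n_i) / N over all classified anomaly types.

    classified: {type name: (n_i, p_i)}.
    """
    n_total = sum(n for n, _ in classified.values())
    return sum(n * p for n, p in classified.values()) / n_total


def fault_count(anomaly_count, classified):
    """F = K * A."""
    return weighted_kill_ratio(classified) * anomaly_count


classified = {"A": (40, 0.0), "B": (10, 1.0), "C": (20, 0.5), "D": (30, 0.8)}
K = weighted_kill_ratio(classified)
F = fault_count(1000, classified)
print(round(K, 2), round(F))  # 0.44 440
```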
15. Fault Density Chart
By applying a probability of failure to each anomaly type, the noise
level is reduced even further. The chart now reflects the count of
faults on the wafer.
But, fluctuations in this chart can still be driven by clusters.
We still need to capture distribution information – the number of die
(unit product) affected.
Anomaly density (gray)
Fault density (black)
16. Defect-Limited Yield
• Yield is the fraction of all die that are good
• Yield can be affected by
– Process problems and fall-on particles – defects that cause faults
– Things that may or may not be caught using in-line inspections
• Defect-Limited Yield (DLY)
– Definition: The yield loss for each defect, or group of defects
– Other issues may cause yield loss
– The defect-limited yield will only cap the upper limit to potential yield loss due
to detected defects
– Actual yield may be lower, due to issues that are not detected from in-line
inspections
– So DLY cannot be relied on as a “yield predictor”, but only as a quality metric
to identify potential yield issues due to detected defects
17. DLY: General Form
If we assume all anomalies will cause faults, we can find the pct of die without anomalies (or pct clean
die) by dividing the number of die without anomalies, DO, by the number of die inspected, DI:
$$\text{Pct Clean Die} = \frac{D_O}{D_I}$$
Note: This is analogous to a “yield” number.
But if we assume that only a portion of the anomalies have a probability of causing faults, some of
the anomalous die have a probability of not failing (or die that can be recovered), D’A:
$$\mathrm{DLY} = \frac{D'_A + D_O}{D_I}$$
The number of anomalous die that may be recovered, D’A, can be expressed as a probability
density function applied to the number of anomalous die. Assuming all anomalies are random, we
can use the Poisson distribution function:
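$$D'_A = e^{-f} \times D_A$$
where
$$f = K \times a, \qquad K = \frac{\sum_{i=1}^{M} (p_i \times n_i)}{N}, \qquad a = \frac{A}{D_A}$$
Now the DLY can be expressed as:
$$\mathrm{DLY} = \frac{\left(e^{-f} \times D_A\right) + D_O}{D_I}$$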
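As a rough illustration of the general DLY expression, the following Python sketch evaluates it for one wafer. The function name, argument names, and the wafer numbers are hypothetical; only the formula comes from the slide.

```python
import math


def defect_limited_yield(anomaly_count, anomalous_die, clean_die, inspected_die, kill_ratio):
    """DLY = (exp(-f) * D_A + D_O) / D_I, with f = K * a and a = A / D_A."""
    if anomalous_die == 0:
        return clean_die / inspected_die  # no anomalous die to recover
    a = anomaly_count / anomalous_die     # average anomalies per anomalous die
    f = kill_ratio * a                    # average faults per anomalous die
    recovered_die = math.exp(-f) * anomalous_die
    return (recovered_die + clean_die) / inspected_die


# Hypothetical wafer: 1000 anomalies on 200 of 500 inspected die, K = 0.44.
dly = defect_limited_yield(1000, 200, 300, 500, 0.44)
print(round(dly, 3))  # 0.644
```

With these assumed numbers, about 22 of the 200 anomalous die are expected to survive, which is why the DLY sits just above the clean-die fraction of 0.6.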
18. DLY: General Form
DLY can also be expressed in terms of individual (or combined) anomaly
types. For the ith type:
$$\mathrm{DLY}_i = \frac{\left(e^{-f_i} \times D_A\right) + D_O}{D_I}$$
where
$$f_i = K_i \times a, \qquad K_i = \frac{p_i \times n_i}{N}, \qquad a = \frac{A}{D_A}$$
$f_i$: average number of faults for the ith anomaly type.
$K_i$: weighted kill ratio for the ith anomaly.
$a$: average number of anomalies on anomalous die.
$p_i$: kill ratio for the ith type.
$n_i$: number classified for the ith type.
$N$: total classified.
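A per-type DLY can be sketched the same way. The helper name `dly_per_type` and the wafer numbers are assumptions for illustration; the classification data reuses the Type A to D example from the fault-count slide.

```python
import math


def dly_per_type(classified, anomaly_count, anomalous_die, clean_die, inspected_die):
    """Return {type: DLY_i} using f_i = K_i * a and K_i = p_i * n_i / N."""
    n_total = sum(n for n, _ in classified.values())   # N, total classified
    a = anomaly_count / anomalous_die                   # a = A / D_A
    result = {}
    for kind, (n_i, p_i) in classified.items():
        k_i = p_i * n_i / n_total
        f_i = k_i * a
        result[kind] = (math.exp(-f_i) * anomalous_die + clean_die) / inspected_die
    return result


classified = {"A": (40, 0.0), "B": (10, 1.0), "C": (20, 0.5), "D": (30, 0.8)}
by_type = dly_per_type(classified, 1000, 200, 300, 500)
worst = min(by_type, key=by_type.get)
print(worst)  # D, the type with the largest p_i * n_i
```

Ranking types by their individual DLY is one way to see which defect type is driving the wafer-level number.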
19. Why Use the Poisson Distribution Function?
Sources:
C. Stapper, et al., “Integrated Circuit Yield Statistics”, Proceedings of the IEEE, Vol. 71, No. 4, pp. 453-470, April 1983.
C. Stapper, “On a Composite Model to the IC Yield Problem”, IEEE Journal of Solid State Circuits, Vol. SC-10, pp. 537-539, December 1975.
From the references: For random distributions, Poisson statistics can be applied.
Poisson Statistics (Random distributions)
$$Y = e^{-f}$$
Note: The reference uses λ instead of $f$, but the meaning is the same: it is the average number of faults per die.
The average number of faults per die is
$$f = A_C \times D$$
where $A_C$ is the critical area and $D$ is the defect density.
The critical area is the probability of failure, $P$, times the die area, $A$ (on this slide, $A$ is the die area, not the anomaly count):
$$A_C = P \times A$$
So, the average number of faults per die can be expressed as the probability of failure
times the average number of defects per die, $d = A \times D$:
$$f = P \times (A \times D) = P \times d$$
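The Poisson yield expression can be checked numerically with a small Monte Carlo sketch. This is an illustration under stated assumptions, not part of the cited references: defects are scattered uniformly at random over the die, each defect independently becomes a fault with probability P, and the function name, die count, and parameter values are hypothetical.

```python
import math
import random


def simulate_poisson_yield(n_die, defects_per_die, p_fail, seed=0):
    """Scatter defects at random over die; return the fraction of die with no faults."""
    rng = random.Random(seed)
    n_defects = round(n_die * defects_per_die)
    faulty = set()
    for _ in range(n_defects):
        die = rng.randrange(n_die)        # each defect lands on a random die
        if rng.random() < p_fail:         # and becomes a fault with probability P
            faulty.add(die)
    return 1 - len(faulty) / n_die


simulated = simulate_poisson_yield(n_die=5000, defects_per_die=2.0, p_fail=0.44)
predicted = math.exp(-0.44 * 2.0)   # Y = exp(-f), with f = P * d
print(round(simulated, 3), round(predicted, 3))
```

For randomly distributed defects the two numbers should agree closely. The agreement breaks down when defects cluster, which is the motivation for the mixed-distribution treatment later in the deck.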
20. Why Use the Poisson Distribution Function?
The average number of faults per die can be expressed as the probability of failure times
the average number of defects per die:
$$f = P \times d$$
For DLY, we can define the average number of faults per die as the product of the
weighted-average of the kill ratios for all classified anomalies (which is analogous to the
probability of failure), and the average number of anomalies per die:
$$f = K \times a, \quad \text{where} \quad K = \frac{\sum_{i=1}^{M} (p_i \times n_i)}{N} \quad \text{and} \quad a = \frac{A}{D_A}$$
Now we can apply the average number of faults to the Poisson distribution function to
find the “yield” of the anomalous die, that is, the number of anomalous die that can be
recovered.
$$\mathrm{DLY} = \frac{\left(e^{-f} \times D_A\right) + D_O}{D_I}$$
So, armed with nothing more than the data collected from in-line inspections, we can
estimate the impact of defects on yield: the defect-limited yield.
21. Why Use the Poisson Distribution Function?
• The Poisson function works only for random distributions
• Anomaly maps typically contain mixed distributions.
• How can we apply the Poisson function to mixed distributions?
– Separate die with random anomalies from die with clustered anomalies
– Treat each die group as random distributions
• A lower-density group for random die
• A higher-density group for clustered die
– Estimate the number of recovered random and clustered die separately
– Simply add the number of recovered die to the number of clean die to find the
total number of die that likely will not fail
22. DLY: Mixed-Distribution
If we assume the distributions of anomalies will always be random, the
DLY can be expressed as:
$$\mathrm{DLY} = \frac{D'_A + D_O}{D_I}$$
But as we can see from the wafer maps, we have mixed distributions:
random and clustered anomalies. So, we need to pull the 2 distributions
apart into their random and clustered die components:
$$D'_A = D'_R + D'_C$$
So the DLY can now be expressed as:
$$\mathrm{DLY} = \frac{D'_R + D'_C + D_O}{D_I}$$
[Figure: wafer map with random and cluster regions labeled]
23. DLY: Mixed-Distribution
In order to correctly apply the DLY to the random and clustered
distributions, we can express DLY as:
$$\mathrm{DLY} = \frac{D'_R + D'_C + D_O}{D_I}$$
For the random distribution:
$$D'_R = e^{-f_R} \times D_R$$
$$f_R = K_R \times a_R$$
Weighted-average kill ratio for random anomalies only:
$$K_R = \frac{\sum_{i=1}^{M_R} (p_i \times n_i)_R}{N_R}$$
The average number of random anomalies over random anomalous die:
$$a_R = \frac{A_R}{D_R}$$
For the clustered distribution:
$$D'_C = e^{-f_C} \times D_C$$
$$f_C = \{K_C \times (a_C - a_R)\} + (K_R \times a_R)$$
Weighted-average kill ratio for clustered anomalies only:
$$K_C = \frac{\sum_{i=1}^{M_C} (p_i \times n_i)_C}{N_C}$$
The average number of clustered anomalies over clustered anomalous die:
$$a_C = \frac{A_C}{D_C}$$
Note: If $K_R = K_C$, then $f_C = K_C \times a_C$.
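The mixed-distribution DLY can be sketched in Python as follows. The function signature and the wafer numbers are hypothetical; the formulas for the random and clustered fault averages are the ones given on this slide.

```python
import math


def mixed_dly(a_r, a_c, k_r, k_c, random_die, clustered_die, clean_die, inspected_die):
    """DLY = (D'_R + D'_C + D_O) / D_I for random and clustered die groups."""
    f_r = k_r * a_r                          # faults per random die
    f_c = k_c * (a_c - a_r) + k_r * a_r      # faults per clustered die
    recovered_random = math.exp(-f_r) * random_die
    recovered_clustered = math.exp(-f_c) * clustered_die
    return (recovered_random + recovered_clustered + clean_die) / inspected_die


# Hypothetical wafer: 180 random die (1.5 anomalies each on average),
# 20 clustered die (30 anomalies each), 300 clean die, 500 die inspected.
dly = mixed_dly(a_r=1.5, a_c=30.0, k_r=0.44, k_c=0.44,
                random_die=180, clustered_die=20, clean_die=300, inspected_die=500)
print(round(dly, 3))
```

In this example the clustered die contribute almost nothing to the recovered count, so the heavy clusters cost at most 20 die rather than dominating the metric the way they dominate an anomaly count.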
24. Average Number of Faults Per Clustered Die
If we assume the clustered and random anomalies are independent, we can treat
the 2 distributions separately on the clustered die.
$$f_C = \{K_C \times (a_C - a_R)\} + (K_R \times a_R)$$
Weighted-average kill ratio for the classified anomalies on just the clustered die:
$$K_C = \frac{\sum_{i=1}^{M_C} (p_i \times n_i)_C}{N_C}$$
Average number of clustered anomalies on clustered die:
$$a_C = \frac{A_C}{D_C}$$
Weighted-average kill ratio for the classified anomalies on just the random die:
$$K_R = \frac{\sum_{i=1}^{M_R} (p_i \times n_i)_R}{N_R}$$
Average number of random anomalies on random die:
$$a_R = \frac{A_R}{D_R}$$
But, if we want to assume $K_R = K_C$, then
$$f_C = K_C \times a_C$$
25. DLY: Random Only
At times, it is important to know what the DLY would be on a wafer, if there were no
clusters. This information can be used to plot the mixed-distribution data and the
“random only” data together on the same chart, to see which wafers are clustered and
which are not.
$$\mathrm{DLY}_{RandOnly} = \frac{D'_{AR} + D_O}{D_I}$$
$$D'_{AR} = e^{-f_{all}} \times D_A$$
The fault density is expressed in terms of the weighted-average kill ratios for all anomaly
types, applied to the average number of random anomalies per random die, applied to all
anomalous die. This assumes all die only contain random anomalies, and all have a
proportional probability of containing the same anomaly types.
$$f_{all} = K_{all} \times a_R, \qquad a_R = \frac{A_R}{D_R}, \qquad K_{all} = \frac{\sum_{i=1}^{M} (p_i \times n_i)}{N}$$
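A minimal Python sketch of the random-only DLY follows. The function name and the wafer numbers are assumptions for illustration; the calculation applies the random-die anomaly average to all anomalous die, as described above.

```python
import math


def random_only_dly(random_anomalies, random_die, anomalous_die,
                    clean_die, inspected_die, k_all):
    """DLY as if every anomalous die carried only the random anomaly density."""
    a_r = random_anomalies / random_die      # a_R = A_R / D_R
    f_all = k_all * a_r                      # f_all = K_all * a_R
    recovered = math.exp(-f_all) * anomalous_die
    return (recovered + clean_die) / inspected_die


# Hypothetical wafer: 270 random anomalies on 180 random die,
# 200 anomalous die in total, 300 clean die, 500 die inspected.
dly_rand = random_only_dly(270, 180, 200, 300, 500, k_all=0.44)
print(round(dly_rand, 3))
```

Plotting this value next to the mixed-distribution DLY shows a gap on clustered wafers and little or no gap on wafers without clusters.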
26. Die-Based Clustering
In order to use the mixed-distribution DLY,
we must separate the random die, DR, and
clustered die, DC.
We can do this by identifying clustered die
as die containing significantly more
anomalies, compared to the other
anomalous die.
Dark spots: anomalies in clustered die
Clustering was defined using the “DBCluster”
application.
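The slides do not describe the rule that “DBCluster” uses, so the following Python sketch shows only one possible way to flag clustered die: a die is treated as clustered when its anomaly count exceeds a multiple of the median count across anomalous die. The function name, the `factor` parameter, and the threshold rule are all assumptions made for this example.

```python
from collections import Counter
from statistics import median


def split_clustered_die(anomaly_die_ids, factor=3.0):
    """Split anomalous die into (random_die, clustered_die) sets.

    A die is flagged as clustered when its anomaly count is more than
    `factor` times the median count over all anomalous die (assumed rule).
    """
    counts = Counter(anomaly_die_ids)
    typical = median(counts.values())
    clustered = {die for die, n in counts.items() if n > factor * typical}
    random_die = set(counts) - clustered
    return random_die, clustered


# Hypothetical data: one entry per anomaly, giving the (column, row) of its die.
die_ids = [(0, 0), (1, 0), (1, 0), (2, 0)] + [(3, 1)] * 12
random_die, clustered_die = split_clustered_die(die_ids)
print(sorted(clustered_die))  # [(3, 1)]
```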
27. Die-Based Sampling
Now that we can separate the random
die from the clustered die, we can apply
die-based sampling to ensure we have a
fair selection of anomalies over as many
die as possible.
Compared to wafer-based random
sampling, die-based sampling forces a
fair sampling of more anomalous die,
while still ensuring we get a fair
sampling from clustered die.
[Figure: wafer maps comparing random sampling and die-based sampling]
Sampling was done using the “DBSample”
application.
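The slides do not give the selection rule used by “DBSample”. As a hedged illustration of the idea, the sketch below takes one anomaly from every anomalous die before taking a second from any die, so clusters cannot use up the classification budget. The function name, the round-robin rule, and the sample data are assumptions.

```python
import random
from collections import defaultdict


def die_based_sample(anomalies, budget, seed=0):
    """Pick up to `budget` anomaly ids, spreading picks across die.

    anomalies: list of (anomaly_id, die_id) pairs.
    """
    rng = random.Random(seed)
    by_die = defaultdict(list)
    for anomaly_id, die_id in anomalies:
        by_die[die_id].append(anomaly_id)
    for ids in by_die.values():
        rng.shuffle(ids)                     # random choice within each die
    selected = []
    # Round-robin: one anomaly per die per pass, until the budget is spent
    # or every die has been exhausted.
    while len(selected) < budget and any(by_die.values()):
        for die_id in list(by_die):
            if by_die[die_id] and len(selected) < budget:
                selected.append(by_die[die_id].pop())
    return selected


# Hypothetical wafer: 10 anomalies on one clustered die, 1 on each of 5 other die.
anomalies = [(f"c{i}", "cluster_die") for i in range(10)]
anomalies += [(f"r{i}", f"die{i}") for i in range(5)]
picked = die_based_sample(anomalies, budget=6)
print(sorted(picked))
```

With a budget of 6, every one of the 6 anomalous die is represented once, whereas purely random selection would tend to spend most of the budget on the clustered die.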
28. Die-Based Sampling
[Figure: wafer maps comparing random sampling and die-based sampling]
29. Mixing Multiple Products
• Many fabs run multiple products that have different numbers of die
• The “native” DLY is modulated by the number of die on the wafer
• Apply a standard set of die to the wafer map to estimate a “normalized” DLY
• Apply normalization after all other steps (inspection, classification,
clustering and native DLY est.) have been completed
• Normalization permits a better apples-apples monitoring of processes that
span multiple product types
• Normalization also permits application of DLY estimations to bare wafer
inspections
30. Normalize Die
A die-based strategy can be modulated by the number of die on a wafer. In order to
apply this strategy to fabs that are running multiple products with different die
layouts, we can normalize the die layout to a standard set of die. The normalization
allows us to plot the data on one chart for all products. This also allows us to apply
a die-based strategy on wafers that have no die (lower right).
[Figure: two pairs of wafer maps, each showing the Native Die layout next to the Normalized Die layout]
Note: Normalization was done using the “RDie” application.
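How “RDie” performs the normalization is not described in the slides. One simple way to picture it, sketched below, is to overlay a fixed grid of standard die on the wafer and assign each anomaly to the grid cell containing its coordinates. The coordinate convention (millimetres from the wafer centre), the 10 mm die size, and the function names are assumptions, and the sketch ignores die that fall off the wafer edge.

```python
import math


def standard_die_index(x_mm, y_mm, die_width_mm=10.0, die_height_mm=10.0):
    """Map a wafer coordinate to the (column, row) of a standard die grid."""
    return math.floor(x_mm / die_width_mm), math.floor(y_mm / die_height_mm)


def count_anomalous_die(anomaly_coords_mm, **grid):
    """Number of distinct standard die that contain at least one anomaly."""
    return len({standard_die_index(x, y, **grid) for x, y in anomaly_coords_mm})


# Hypothetical anomaly coordinates in mm, relative to the wafer centre.
coords = [(12.5, -3.0), (14.0, -7.5), (-0.5, 0.5)]
print(count_anomalous_die(coords))  # 2
```

Because the grid does not depend on the product layout, the same call works for bare wafers that have no native die at all.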
31. Example of DLY Response
Assuming the same die layout (or normalized die), DLY is modulated by:
> The number of die affected
> The probability that anomalies can cause a fault (the kill ratios)
> The density of anomalies per die
32. DLY Chart
Example of a DLY chart.
The average mixed-distribution DLY for
all wafers in a lot is plotted
along with the random-only DLY
(circles).
The bars indicate the high and
low values for each lot.
33. DLY Compared to Anomaly Density
The DLY data shows numerous low
points that were traced to a problem
with a process tool.
During this same time-period, the
anomaly density chart showed virtually
no correlation to the problem.
The DLY data proved to be a superior
indicator of the problem.
34. Level DLY and Defect DLY
Example of how one defect type drove the DLY for one level. The chart on the left shows the same level DLY as the previous chart.
The low points (excursions) correlated to a specific tool problem.
The same data is plotted on the chart on the right to show how much this defect drove the level DLY.
[Figure: left panel “Level DLY and Defect DLY”; right panel “Defect DLY vs. Level DLY”]
35. Cumulative DLY For All Levels
The cumulative yield, $\mathrm{DLY}_{Cum}$, is expressed as the product of the DLYs for each level.
For N levels, the cumulative DLY is:
$$\mathrm{DLY}_{Cum} = \prod_{i=1}^{N} \mathrm{DLY}_i$$
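In code, the cumulative DLY is a plain product over the inspected levels. The sketch below uses Python's `math.prod`; the function name and the per-level values are hypothetical.

```python
import math


def cumulative_dly(level_dlys):
    """Product of the per-level DLY values."""
    return math.prod(level_dlys)


# Hypothetical DLY values for three inspected levels.
levels = [0.99, 0.97, 0.90]
print(round(cumulative_dly(levels), 3))  # 0.864
```

Because the values multiply, a single level with an excursion pulls the cumulative number down even when every other level is near 1.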
36. Cumulative DLY Chart
[Chart regions: Baseline, Excursion, Recovered]
The cumulative DLY is
driven by more than one
level.
One level shown on this
chart clearly had an effect
on the cumulative DLY due
to an excursion (middle of
chart).
Because DLY is modulated
by the same factors that
can affect yield, there is a
good chance that the
issues pushing the DLY
down will affect final yield.
37. Implementation Steps
• Inspect the wafer to find the anomalies.
• Run die-based clustering to identify clustered die.
• Run die-based sampling to select anomalies to classify.
• Classify the anomalies.
• Apply a pre-defined set of kill ratios to each type of anomaly.
• Calculate the un-normalized DLY using the "native" die layout.
• Normalize the die.
• Re-run die-based clustering.
• Using the classification data already collected, calculate the
normalized DLY.
38. Implementation Steps for Bare Wafer
• Inspect the wafer to find the anomalies.
• Normalize the die to add die information to the data.
• Run die-based clustering to identify clustered die.
• Run die-based sampling to select anomalies to classify.
• Classify the anomalies.
• Apply a pre-defined set of kill ratios to each type of anomaly.
• Calculate the DLY.
39. Summary
• Wafer-based counting strategies
– Do not adequately monitor and control the unit product – the die
– Can be driven by noise – the high points on the chart
– Waste effort by focusing on the trivial many, while missing the vital few
• Die-based DLY strategy
– Removes a lot of the noise that can result in missed opportunities
– Focuses attention on factors that can drive issues affecting the die
• Number of die affected
• Probability of anomalies causing faults from extracted defects
• Number of anomalies (extracted defects) on anomalous die
– Manages the impact of clustered die on the data
– Emphasizes the vital few, while minimizing the trivial many
40. References
Menon, Venu B., "Chapter 27: Yield Management", "Handbook of Semiconductor Manufacturing
Technology", Marcel Dekker Inc., 2000, pp. 869-887.
Nurani, R.K., "Effective Defect Management Strategies For Emerging Fab Needs", Statistical
Methodology, IEEE International Workshop, 2001, pp. 33-37.
Riley, Stuart, "A Simplified Approach to Die-Based Yield Analysis", Semiconductor International, Vol.
30, No. 8, August 2007, pp. 47-51.
Riley, Stuart L., "Limitations to Estimating Yield Based on In-Line Defect Measurements", 1999
International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT), 1999, p. 46.
Riley, Stuart L., "Estimating the Impact of Defects on Yield from In-Line Defect Measurement Data",
Semiconductor International Web Exclusive, December 1999,
http://www.semiconductor.net/article/206973-Estimating_the_Impact_of_Defects_on_Yield_from_In_Line_Defect_Measurement_Data.php?rssid=20279
Riley, Stuart L., "Optical Inspection of Wafers Using Large Area Defect Detection and Sampling",
Proceedings IEEE International Workshop on VLSI Systems, November, 1992 (pp. 12-21).
Stapper, Charles, et al., "Integrated Circuit Yield Statistics", Proceedings of the IEEE, Vol. 71, No. 4,
April 1983, pp. 453-470.