1 Introduction

In the wafer fabrication process, a number of inspection and measurement stations are set up to monitor process parameters and to detect problems at an early stage [1]. Because the capacity for in-line wafer inspection is limited and costly, only certain wafers are inspected among a specific number of lots. Considering the high inspection cost [2], an effective sampling strategy for allocating this finite capacity plays a major role in yield management [3]. Meanwhile, given the characteristics of the semiconductor industry, such as short product life cycles, changing customer demand, keen market competition, and high manufacturing cost, a semiconductor company should seek to cut unnecessary inspection cost and production time to increase overall profit.

Although several studies have addressed IC sampling strategy [4] for defect/particle inspection, little research has addressed metrology sampling [1, 5] for the critical dimension or thin film. In-line metrology inspects the work in process (WIP) in real time. Even though virtual metrology has recently become popular [6], enterprises do not yet trust this technology because of its uncertainty. Currently, the number of metrology samples and the sampling frequency are still decided by the engineers’ experience and may vary from person to person.

The purpose of this study is to develop a full decision framework for statistically determining the optimal sampling strategy for in-line inspection in wafer fabrication. Moreover, we also explored how different sampling strategies could affect the cost of quality (COQ) and conducted an empirical analysis in a semiconductor factory. The constructed model and result may effectively help the engineers to decide the optimal sampling frequency in terms of product types and the cost of quality, which could enable full utilization of the machines and improve the product yield.

The remainder of this paper is organized as follows: Sect. 2 reviews the studies related to the fundamentals of the proposed framework. Section 3 presents the proposed method. Section 4 presents an empirical study. Section 5 concludes with a summary of the developed method and a discussion of further studies to deal with the complex in-line metrology sampling process.

2 Fundamental

The notations used in this research are defined as follows.

\( \theta_{w} \): state of nature of a wafer

\( \theta_{\text{L}} \): state of nature of a lot

\( \pi (\theta_{\text{wi}} ) \): prior probability of a wafer in state i

\( \pi (\theta_{\text{Li}} ) \): prior probability of a lot in state i

\( {\text{A}}_{\text{w}} \): set of actions for a wafer

\( {\text{A}}_{\text{L}} \): set of actions for a lot

\( N_{d} \): total number of dies in a wafer

\( N_{w} \): total number of wafers in a lot

\( n_{d} \): sample size for a wafer

\( n_{w} \): sample size for a lot

\( k_{d} \): number of out-of-spec dies among the sampled dies

\( y \): number of bad wafers among the sampled wafers

\( z \): number of rejected wafers among the sampled wafers

\( \delta_{1} (x) \): decision rule for an inspected wafer

\( \delta_{2} (z) \): decision rule for an inspected lot

\( c_{1} \): acceptance number of a wafer

\( c_{2} \): acceptance number of a lot

\( v \): sampling frequency

\( {\text{C}}_{\text{q}} \): quality loss cost per lot

\( {\text{C}}_{\text{s}} \): sampling cost per lot

In practice, it is prohibitively expensive to inspect every wafer in every lot. Without full inspection, engineers can only infer the true quality of the product from the prior probability, a subjective judgment of possibility, or from the latest evidence provided by sampled wafers and dies in a lot. Based on this evidence, engineers may revise the prior probability to a posterior probability and determine the proper action: reject or accept a lot. Bayes’ theorem expresses this revision of probability as

$$ P(H|E) \propto P(H) \cdot P(E|H) $$

A decision to take appropriate actions based on sample data and this probability revision is called “Bayesian decision analysis”.
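The revision of probability can be illustrated numerically. The following minimal sketch normalizes the product of prior and likelihood over the two lot states; the function name `posterior` and all numbers are hypothetical and serve only as an illustration, not as values from this study.

```python
def posterior(prior, likelihood):
    """Revise P(H) into P(H|E), using P(H|E) proportional to P(H) * P(E|H)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())  # P(E), the normalizing constant
    return {h: value / total for h, value in unnormalized.items()}


# Hypothetical inputs (assumptions, not data from the study)
prior = {"good": 0.9, "bad": 0.1}         # P(H): prior belief about the lot
likelihood = {"good": 0.05, "bad": 0.60}  # P(E|H): chance of the observed evidence

post = posterior(prior, likelihood)
print(post)
```

With evidence that is much more likely under a bad lot, the posterior probability of "bad" rises above its prior value even though the prior strongly favored "good".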

2.1 Bayesian Decision Analysis

Bayesian decision analysis addresses how to obtain extra information from an appropriate sampling method and how to use that information to update the expected loss of each feasible scheme, so that the scheme with the minimum expected loss can be selected. According to Bayes’ theorem, decision makers can revise the prior probability based on the sample information and reappraise the expected value of each alternative. Chien, Hsu, Peng and Wu [4] applied Bayesian decision analysis and proposed a heuristic framework for sampling particles or defects in wafer fabrication to provide the best sampling frequency and control limits. In addition, Chien, Wang and Wang [7] used Bayesian decision analysis to construct an IC final testing strategy for enhancing overall operational effectiveness. Figure 1 [8] shows a conceptual framework of Bayesian decision analysis.

Fig. 1. Bayesian decision analysis [8]

There are three basic decision elements in Bayesian decision analysis: parameter space, sample space and action space. The parameter space \( \Omega \) is composed of the possible states of nature \( \theta_{j} \), i.e., \( \Omega = \{ \theta_{j} \} \). We assume that there is a set of possible actions \( a_{i} \) jointly constituting the action space A (i.e., \( A = \{ a_{i} \} \)). The sample space X is composed of the sampled data.

When \( \theta \) is not exactly known, we can obtain a prior probability \( \pi (\theta_{j} ) \) of \( \theta \) based on some mixture of subjective judgments and objective evidence. In many circumstances we may have additional information provided by sample data x, with likelihood function \( p_{\theta } (x) \), obtained from an experiment whose outcomes depend on the value of \( \theta \). If we ignore prior information about \( \theta \), sample data alone can be used for choosing the action. Let the decision rule \( \delta (x) \) specify the action in A corresponding to the observed data x. That is, \( \delta (x) \) is a decision rule that maps X to A. Furthermore, there is a loss function \( L(a, \theta ) \) defined on the \( A \times \Omega \) space, where \( L(a, \theta ) \) measures the loss that arises if we take action \( a \) when the state of nature is \( \theta \).

Any decision rule \( \delta (\cdot) \) can be assessed in terms of its long-term expected loss, that is, the average loss over the different data that might arise. For any decision rule \( \delta (x) \), we consider the risk function as follows:

$$ R(\delta (x), \theta ) = \int_{x} {L(\delta (x),\theta )\;p_{\theta } (x)\;dx} $$
(1)

If prior information is available in addition to sample data, one may weight the risk function by \( \pi (\theta ) \) and compute a summary measure (e.g., the expected risk) as a basis for choosing between different decision rules. That is, the Bayes risk is defined as follows:

$$ r(\delta ,\pi ) = \int_{\Omega } {\int_{x} {L(\delta (x),\theta )\;p_{\theta } (x)\;\pi (\theta )\;dx\;d\theta } } $$
(2)

The best decision rule is the one that has the minimum mean risk with respect to variations in \( \theta \), that is, the minimum Bayes risk under the prior \( \pi \):

$$ \min_{\delta } r(\delta ,\pi ) $$
(3)
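For a discrete sample space the integrals in Eqs. (1) and (2) become sums, and Eq. (3) can be solved by enumerating every deterministic decision rule. The sketch below does this for a three-point sample space; the likelihood tables, prior, and loss values are hypothetical assumptions chosen only to make the computation concrete.

```python
from itertools import product

states = ["good", "bad"]
actions = ["accept", "reject"]

# Hypothetical inputs (assumptions, not data from the study)
prior = {"good": 0.8, "bad": 0.2}          # pi(theta)
likelihood = {                             # p_theta(x) for x = 0, 1, 2
    "good": [0.7, 0.2, 0.1],
    "bad": [0.1, 0.3, 0.6],
}
loss = {                                   # L(a, theta)
    ("accept", "good"): 0.0, ("reject", "good"): 1.0,
    ("accept", "bad"): 4.0, ("reject", "bad"): 0.0,
}


def risk(rule, theta):
    """Discrete Eq. (1): R(delta, theta) = sum_x L(delta(x), theta) * p_theta(x)."""
    return sum(loss[(rule[x], theta)] * p for x, p in enumerate(likelihood[theta]))


def bayes_risk(rule):
    """Discrete Eq. (2): r(delta, pi) = sum_theta pi(theta) * R(delta, theta)."""
    return sum(prior[theta] * risk(rule, theta) for theta in states)


# A rule is a tuple giving the action taken for x = 0, 1, 2.
rules = list(product(actions, repeat=3))
best_rule = min(rules, key=bayes_risk)     # Eq. (3)
print(best_rule, bayes_risk(best_rule))
```

Exhaustive enumeration is only practical for very small sample spaces; it is used here because it mirrors Eq. (3) directly.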

3 Approach

The proposed framework of sampling strategy is constructed based on Bayesian decision analysis as shown in Fig. 2. In particular, three Bayesian decision elements in the proposed sampling framework are defined as follows.

Fig. 2. Conceptual framework of Bayesian decisions for an inspected lot

The parameter space is the true quality of the population and comprises two states, \( \theta_{1} = {\text{good}} \) and \( \theta_{2} = {\text{bad}} \), since the true quality either meets or fails the quality requirement. The sample space X indicates the number of unqualified units in a sample of size n drawn from a population of size N. The action space contains two actions: \( a_{1} = {\text{accept}} \) and \( a_{2} = {\text{reject}} \). Depending on the decision rule (acceptance sampling plan) \( \delta (x) \), if the random variable x does not exceed a criterion determined in advance, we accept this population. Otherwise, we reject this population.

First, let us consider a sampling plan for a wafer. Suppose there are \( N_{d} \) dies in a wafer. The parameter space of a wafer \( \theta_{w} \) consists of two states, \( \theta_{{{\text{w}}1}} = {\text{good}}\,{\text{wafer}} \) and \( \theta_{{{\text{w}}2}} = {\text{bad}}\,{\text{wafer}} \), since the true quality of a wafer either meets or fails the quality requirement. A sample of \( n_{d} \) dies is drawn from the wafer for inspection, and \( x \) denotes the number of sampled dies that do not meet the quality requirement. According to the decision rule \( \delta_{1} (x) \), \( x \le c_{1} \) means that we do not have sufficient evidence that the wafer fails our requirement, so we should not reject this wafer. Otherwise, \( x > c_{1} \) means that the wafer does not meet our requirement and thus should be rejected. The states of nature of a wafer are \( \Omega _{w} = \{ \theta_{{{\text{w}}1}} = {\text{good}}\,{\text{wafer,}}\,\theta_{{{\text{w}}2}} = {\text{bad}}\,{\text{wafer}}\} \), with \( \pi (\theta_{w1} ) = p(good\,wafer) \) and \( \pi (\theta_{w2} ) = p(bad\,wafer) \). The action space for a wafer is \( A_{w} = \{ a_{1} = accept,a_{2} = reject\} \), and the sample space \( x \) is the possible number of unqualified dies among the \( n_{d} \) sampled dies.

Similar to a wafer, the state of nature of a lot is either a good lot or a bad lot, i.e., \( \Omega _{L} = \{ \theta_{{{\text{L}}1}} = {\text{good}}\,{\text{lot,}}\,\theta_{{{\text{L}}2}} = {\text{bad}}\,{\text{lot}}\} \), with \( \pi (\theta_{L1} ) = p(good\;lot) \) and \( \pi (\theta_{L2} ) = p(bad\;lot) \). The action space for a lot is \( A_{L} = \{ a_{1} = accept,a_{2} = reject\} \), and the sample space \( y \) is the possible number of unqualified wafers among the \( n_{w} \) wafers that are drawn from a lot of \( N_{w} \) wafers.

However, in wafer fabrication, in-line metrology inspection is executed by sampling some wafers in an inspection lot and some dies in every sampled wafer, and then integrating the decision results from the individual sampled wafers to decide whether to accept or reject the lot. Thus, in addition to the basic Bayesian decision elements, there is an extra random variable \( z \) that counts the number of rejected wafers among the \( n_{w} \) sampled wafers. The value of \( z \) is affected not only by the decision made for each wafer but also by the number of bad wafers among the \( n_{w} \) sampled wafers. Combining the random variable \( z \) with the decision rule \( \delta_{2} (z) \) for a lot, if \( z \) does not exceed the acceptance number \( c_{2} \), we consider that every wafer in the lot meets the quality requirement and accept the lot. Otherwise, if \( z \) exceeds \( c_{2} \), we reject the lot.
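Under additional assumptions that are not stated in the text, the dependence of \( z \) on both the wafer-level decisions and the number of bad wafers \( y \) can be written out explicitly. Assume that the decisions on the sampled wafers are independent given their true states, and let \( \alpha_{w} = P(x > c_{1} \mid \theta_{{\text{w}}1}) \) be the probability of rejecting a good wafer and \( \beta_{w} = P(x \le c_{1} \mid \theta_{{\text{w}}2}) \) the probability of accepting a bad wafer. The symbols \( \alpha_{w} \), \( \beta_{w} \) and the index \( j \) are introduced here only for illustration. If \( y \) of the \( n_{w} \) sampled wafers are bad, then \( j \) of the rejections come from bad wafers and \( z - j \) from good wafers, so that

$$ P(z \mid y) = \sum_{j = \max (0,\,z - (n_{w} - y))}^{\min (y,\,z)} \binom{y}{j} (1 - \beta_{w})^{j} \beta_{w}^{\,y - j} \binom{n_{w} - y}{z - j} \alpha_{w}^{\,z - j} (1 - \alpha_{w})^{\,n_{w} - y - (z - j)} $$

Weighting by the distribution of \( y \) under each lot state then gives

$$ P(z \mid \theta_{\text{L}}) = \sum_{y = 0}^{n_{w}} P(y \mid \theta_{\text{L}})\,P(z \mid y) $$

This is a sketch of one possible model, not necessarily the formulation used in the framework of Fig. 2.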

Because the state of nature of a wafer is uncertain and a lot comprises \( N_{w} \) wafers, the state of a lot is also uncertain. Whatever the true state of a wafer or a lot, a decision may be wrong and thus incur the producer risk or the consumer risk. The producer risk is the probability that a wafer or lot is rejected when it is actually good. The consumer risk is the probability that a wafer or lot is accepted when it is actually bad (Fig. 3).

Fig. 3. Decision tree for an inspected lot

$$ \begin{aligned} & {\text{Producer}}\,{\text{risk}} = {\text{p}}\left( {{\text{reject}}\,{\text{a}}\,{\text{product}}\,|\,{\text{product}}\,{\text{is}}\,{\text{good}}} \right) = \alpha \\ & {\text{Consumer}}\,{\text{risk}} = {\text{p}}\left( {{\text{accept}}\,{\text{a}}\,{\text{product}}\,|\,{\text{product}}\,{\text{is}}\,{\text{bad}}} \right) = \beta \\ \end{aligned} $$
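The two risks can be computed for a wafer-level plan once a sampling model is fixed. The sketch below assumes that dies are sampled without replacement, so that \( x \) follows a hypergeometric distribution, and that a "good" and a "bad" wafer are each characterized by a fixed number of out-of-spec dies. Both assumptions, the parameter names `D_good` and `D_bad`, and all numerical values are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_pmf(x, N, D, n):
    """P(x out-of-spec dies in a sample of n drawn without replacement
    from N dies, D of which are out of spec)."""
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)


def wafer_risks(N_d, n_d, c1, D_good, D_bad):
    """Return (alpha, beta) for the rule 'accept the wafer iff x <= c1'.

    Requires 0 <= c1 <= n_d.
    """
    p_accept_good = sum(hypergeom_pmf(x, N_d, D_good, n_d) for x in range(c1 + 1))
    p_accept_bad = sum(hypergeom_pmf(x, N_d, D_bad, n_d) for x in range(c1 + 1))
    alpha = 1.0 - p_accept_good  # producer risk: reject a good wafer
    beta = p_accept_bad          # consumer risk: accept a bad wafer
    return alpha, beta


# Hypothetical values (assumptions for illustration only)
N_d, n_d = 200, 9       # dies per wafer, dies sampled per wafer
D_good, D_bad = 4, 40   # out-of-spec dies on a good / bad wafer

for c1 in range(0, 4):
    alpha, beta = wafer_risks(N_d, n_d, c1, D_good, D_bad)
    print(f"c1={c1}: alpha={alpha:.4f}, beta={beta:.4f}")
```

Raising \( c_{1} \) lowers the producer risk and raises the consumer risk, which is the trade-off the decision rule has to balance.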

Under the combination (accept, bad) we incur a loss \( L\;[accept,\;bad] \) because we accept a bad lot. Since the action of accepting a lot is decided by the decision rule \( \delta_{2} (z) \), \( L\;[accept,\;bad] \) can be rewritten as a function of \( \delta_{2} (z) \) as \( L\;[\delta_{2} (z),\;bad] \). Similarly, the loss function of the combination (reject, good), \( L\;[reject,\;good] \), can be rewritten as a function of \( \delta_{2} (z) \) as \( L\;[\delta_{2} (z),\;good] \). The remaining combinations, (accept, good) and (reject, bad), imply that the decision maker takes the right action and no loss occurs.

According to Eq. (1), we derive a pair of long-term expected losses \( R(\delta_{2} (z), good) \) and \( R(\delta_{2} (z), bad) \) for an inspected lot. Moreover, the Bayes risk \( r(\delta_{2} (z),\;\pi_{L} ) \) can be calculated by weighting the risk functions \( R(\delta_{2} (z), good) \) and \( R(\delta_{2} (z), bad) \) with \( \pi_{\text{L}} \)(good lot) and \( \pi_{\text{L}} \)(bad lot) for an inspected lot. Finally, the optimal decision rule \( \delta_{2}^{*} (z) \), the one with the minimum Bayes risk among all feasible decision rules under the given conditions, can be determined.
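A minimal sketch of this search over lot-level rules follows. It assumes, beyond what the text states, that each sampled wafer is rejected independently, so that \( z \) is binomial with a per-wafer rejection probability that depends on the lot state. The names `alpha_w`, `beta_w`, `q_bad_wafer` and every number below are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def lot_bayes_risk(c2, n_w, p_reject, prior, loss):
    """Bayes risk of the lot rule 'accept iff z <= c2' (c2 = -1: always reject)."""
    total = 0.0
    for state in prior:
        # P(accept lot | state) = P(z <= c2 | state)
        p_accept = sum(binom_pmf(z, n_w, p_reject[state]) for z in range(c2 + 1))
        risk = (loss[("accept", state)] * p_accept
                + loss[("reject", state)] * (1 - p_accept))
        total += prior[state] * risk
    return total


# Hypothetical inputs (assumptions for illustration only)
n_w = 5                                     # wafers sampled per lot
alpha_w, beta_w = 0.05, 0.10                # wafer-level producer / consumer risk
q_bad_wafer = {"good": 0.05, "bad": 0.50}   # P(a wafer is bad | lot state)
prior = {"good": 0.9, "bad": 0.1}           # pi_L
loss = {("accept", "good"): 0.0, ("reject", "good"): 1.0,
        ("accept", "bad"): 10.0, ("reject", "bad"): 0.0}

# A sampled wafer is rejected if it is bad and caught, or good and falsely rejected.
p_reject = {s: q * (1 - beta_w) + (1 - q) * alpha_w for s, q in q_bad_wafer.items()}

candidates = range(-1, n_w + 1)
best_c2 = min(candidates, key=lambda c: lot_bayes_risk(c, n_w, p_reject, prior, loss))
print(best_c2, lot_bayes_risk(best_c2, n_w, p_reject, prior, loss))
```

The two extreme rules (always reject, always accept) are included as candidates so that the search also covers the case in which inspection evidence is not worth acting on.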

In addition, not all lots can be inspected. We assume that a lot that is not inspected is good, but this may not always be true in a real setting. A lot may be unqualified and still pass because we do not inspect it. This brings a yield loss \( {\text{C}}_{\text{q}} \) from the gap between a good and a bad lot. In the long run, the expected yield loss of a non-inspected lot is \( \pi \left( {{\text{bad}}\,{\text{lot}}} \right) \times {\text{C}}_{\text{q}} = r(B) \).

With sampling frequency v, we inspect a lot again after (v − 1) lots. Between two sampled lots, the quality loss is \( r(\delta_{2}^{*} (z),\pi_{L} ) + \left( {v - 1} \right)r(B) \). On the other hand, we need to consider the sampling cost incurred every time a lot is sampled for inspection. The sampling cost consists of a fixed sampling cost F and a variable sampling cost S per sampled wafer. If we sample \( n_{w} \) wafers from a lot for inspection, then the sampling cost for that lot is

$$ {\text{C}}_{\text{s}} = {\text{F}} + n_{w} \times {\text{S}} $$
(5)

In order to determine the best sampling frequency v, we trade off the sampling cost against the quality cost with a function \( {\text{E}}\left( {\text{cost}} \right) = f(\delta_{2} ,\pi_{\text{L}} ,v) \).

The best sampling frequency is the one with the minimum E(cost).
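The form of \( f \) is not given above, so the sketch below assumes one simple possibility: the cost of one sampling cycle (sampling cost plus the quality loss between two sampled lots) averaged over its v lots. This averaging, the function names, and all numbers are assumptions for illustration only.

```python
def sampling_cost(F, S, n_w):
    """C_s = F + n_w * S, as in Eq. (5)."""
    return F + n_w * S


def expected_cost_per_lot(v, C_s, r_star, r_B):
    """ASSUMED form of E(cost): one cycle's sampling cost and quality loss,
    r_star + (v - 1) * r_B, averaged over the v lots in the cycle."""
    return (C_s + r_star + (v - 1) * r_B) / v


# Hypothetical inputs (assumptions for illustration only)
F, S, n_w = 2.0, 0.5, 5        # fixed cost, cost per sampled wafer, wafers sampled
r_star = 0.3                   # Bayes risk of the optimal lot rule
pi_bad, C_q = 0.1, 100.0       # prior of a bad lot, quality loss per bad lot
r_B = pi_bad * C_q             # expected loss of a non-inspected lot

C_s = sampling_cost(F, S, n_w)
candidates = [1, 2, 3, 5, 10]
best_v = min(candidates, key=lambda v: expected_cost_per_lot(v, C_s, r_star, r_B))
for v in candidates:
    print(v, round(expected_cost_per_lot(v, C_s, r_star, r_B), 3))
print("best v:", best_v)
```

Under this simple averaging the cost equals \( r(B) + ({\text{C}}_{\text{s}} + r^{*} - r(B))/v \), which is monotone in v, so the minimum always falls at one end of the candidate range. A cost function that produces an interior optimum would need additional terms that are not specified here.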

4 An Empirical Study

4.1 To Change the Sampling Inspection Plan and Frequency

From Fig. 4, we can see that, given the same number of inspected wafers \( n_{w} \), the greater the sampling frequency \( v \) is, the less influence the number of inspected dies \( n_{d} \) has on the quality loss cost. On the other hand, with the same \( n_{d} \), the greater \( v \) is, the less influence \( n_{w} \) has on the quality loss cost.

Fig. 4. The influence of the sampling strategy and frequency on E(Loss)

5 Conclusion and Further Study

This study proposed a general in-line metrology sampling framework for semiconductor manufacturing. The proposed framework can assist the decision maker in determining all parameters of the in-line sampling strategy for different lot sizes and process capabilities. Moreover, the sampling acceptance levels for a wafer and a lot can also be decided.

However, not all in-line inspection stations have the same process capability. Further study should be done to allocate the inspection resources among inspection stations with different capabilities. In particular, further studies should decrease the sampling rate at stations that are non-critical or that have stable or high process capability and, on the other hand, increase the sampling frequency at stations that are critical or have low process capability. In this way, we can reduce the cost of sampling and quality loss and decrease cycle time to increase throughput.