Stochastics and Statistics
Degradation-based burn-in with preventive maintenance

https://doi.org/10.1016/j.ejor.2012.03.028

Abstract

As many products become increasingly reliable, traditional lifetime-based burn-in approaches that try to fail defective units during the test require a long burn-in duration and are thus ineffective. We therefore promote the degradation-based burn-in approach, which bases the screening decision on the degradation level of a burnt-in unit. Motivated by the infant mortality faced by many Micro-Electro-Mechanical Systems (MEMSs), this study develops two degradation-based joint burn-in and maintenance models under age-based and block-based maintenance, respectively. We assume that the product population comprises a weak and a normal subpopulation. Degradation of the product follows a Wiener process with linear drift, and the weak and normal subpopulations have distinct drift parameters. The objective of the joint burn-in and maintenance decisions is to minimize the long-run average cost per unit time during field use by properly choosing the burn-in settings and the preventive replacement intervals. An example using MEMS devices demonstrates the effectiveness of the two models.

Highlights

  • Motivated by MEMS, we develop two degradation-based burn-in maintenance models.

  • The average cost per unit time for each model is derived.

  • A procedure is developed to numerically evaluate the cost function.

  • We show that degradation-based burn-in requires much less burn-in time.

  • These models are potentially very important for other reliable products.

Introduction

It is well accepted that one major characteristic of modern semiconductor devices is infant mortality. The infant mortality period, characterized by a high initial failure rate, results from a small subpopulation of freak units whose lifetimes are much shorter than those of normal units. These defective units may arise from flaws in the material and defects introduced during the assembly process. Their existence leads to a substantial number of early failures, thus degrading product performance in field use.

Before the product is put into field operation, burn-in is a common practice to identify and eliminate these weak units. Generally speaking, burn-in is a testing procedure towards the end of the production process. It is conducted by subjecting all units to accelerated stresses, e.g., elevated voltages, temperatures and power cycling, for an extended period of time, with the purpose of bringing out latent defects that might otherwise surface as early failures in the field (Ye et al., 2011). An excellent recapitulation of this topic can be found in Liu and Mazzuchi (2008). Burnt-in items are then put into field use, under which a rational preventive maintenance (PM) strategy is often adopted to further improve the system availability and cut down the field failure costs. For literature reviews on preventive maintenance, see Wang (2002) and van Noortwijk (2009).

Compared with making two isolated decisions, joint modeling of the burn-in procedure and the PM decision would be more cost-effective, and has thus attracted much attention. For example, Mi (1994) assumed a bathtub failure rate and a minimal repair upon failure, and proposed joint burn-in and maintenance models under age-based preventive maintenance policies. Jiang and Jardine (2007) assumed two subpopulations and perfect repair, i.e., replacement upon failure, and derived the total costs of joint burn-in and maintenance under the age replacement policy. Excellent reviews were provided by Liu and Mazzuchi (2008) and Cha (2011).

A common feature of these models is that they all deal with a binary system state, i.e., either working or failed. We shall call these models lifetime-based. On the other hand, the rapid development of modern manufacturing technology and increasing efforts in process quality management have led to what are commonly referred to as highly reliable products. In fact, many devices are so well designed that it may take a long burn-in duration to fail a freak unit even under highly accelerated environments; see Tseng and Peng (2004) for an example of a light emitting diode (LED) product and Ye et al. (2012) for an electronic device example. The traditional lifetime-based burn-in approach is thus not effective. Compounded by the need to shorten the time-to-market, engineers face the difficult task of making the screening decision for a reliable product within an acceptable time frame. For these reliable products, there is often some quality characteristic, e.g., tear, wear, or crack length, that degrades over time and causes a product failure when the degradation level exceeds some threshold. Cumulative degradation can often be measured through modern real-time diagnostic techniques. The quality characteristic of a defective unit often degrades much faster than that of a normal one. Therefore, a degradation-based burn-in test can be adopted, where a unit is scrapped if its degradation level exceeds some degradation cut-off level during or right after burn-in (Ye et al., 2012). The cut-off level is often much lower than the failure threshold, making the degradation-based approach much more effective.

However, compared with traditional lifetime-based burn-in, degradation-based burn-in models are rare. The first model is found in Tseng and Tang (2001), who used a diffusion process to describe the degradation paths of LED lamps. This model was further extended by Tseng et al. (2003) and Tseng and Peng (2004) based on variants of the Wiener process. Burn-in models based on the gamma process were developed by Tsai et al. (2011) and Ye et al. (2012). Degradation-based joint burn-in and maintenance models, however, have not been found, despite their potential importance.

This study is motivated by the infant mortality in modern Micro-Electro-Mechanical Systems (MEMSs). MEMSs, which emerged in the late 1980s, can sense, actuate and control on the micro scale, and generate effects on the macro scale. Generally speaking, a MEMS device consists of (a) mechanical micro-structures, (b) micro-sensors, (c) micro-electronics and (d) micro-actuators, all integrated onto the same silicon chip. They have shown great potential in many applications, including the medical, aeronautical, space and military industries. But the greatest challenge to the successful commercialization of this new technology is how to improve the product reliability in a cost-effective way. As pointed out by Arney (2001), infant mortality in MEMS devices is not uncommon, owing to the short history of the technology and the extremely small size of the devices. This leads to a significant number of early failures. To eliminate these early failures and to improve MEMS reliability, packaging engineers usually rely on burn-in (Lee et al., 2003). The success of a burn-in test depends on a correct analysis of the failure mechanism. According to Tanner (2009), MEMSs are classified into four groups as follows.

  • Class I – No moving parts.

  • Class II – Moving parts with no rubbing or impacting surfaces.

  • Class III – Moving parts with impacting surfaces.

  • Class IV – Moving parts with impacting and rubbing surfaces.

The first two classes are susceptible to traumatic failures, while devices in Classes III and IV are more reliable and are prone to measurable wear-related failures. As such, degradation-based burn-in is more cost-effective for devices in Classes III and IV. Indeed, Hogan et al. (2003) addressed the importance of degradation-based burn-in for the production of Digital Micro-mirror Devices.

After a burnt-in MEMS device is put into field use, it is often preventively maintained. In the literature, Peng et al. (2009) developed a rudimentary maintenance model for MEMS devices, which is potentially very useful for improving the reliability of MEMS devices and reducing the field operational costs. They also mentioned the importance of burn-in for MEMS devices, but did not consider the effects of burn-in or the determination of optimal burn-in settings. Simultaneous determination of both the optimal burn-in settings and the PM interval would lead to lower costs. To address this deficiency, this study develops two general degradation-based joint burn-in and maintenance models under the age-based and the block-based maintenance policies, respectively.

The rest of this paper is organized as follows. In Section 2, we briefly introduce the Wiener process with linear drift and use it to model the degradation of a product. We then state the problem and the assumptions. Section 3 builds two burn-in maintenance models and derives the corresponding average cost functions. An example is provided to elaborate on the benefits of our models in Section 4. Section 5 concludes the paper and points out several topics for future research.

Section snippets

The Wiener process for degradation modeling

The Wiener process has been widely applied to describe product degradation in reliability engineering and survival analysis (Nikulin et al., 2009). In this study, we confine attention to the Wiener process with linear drift, as it is sufficient to describe the quality characteristics of many products, possibly after a proper time-scale transformation (Tseng et al., 2003). A typical Wiener process with linear drift {Y(t); t ≥ 0} can be expressed as Y(t) = βt + σB(t), where β is the
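As a minimal sketch of the degradation model Y(t) = βt + σB(t), the following Python code simulates one path on an evenly spaced time grid and reports the first grid point at which the path reaches a given level. It relies only on the standard fact that increments of a Wiener process with linear drift over an interval of length dt are independent normal variables with mean β·dt and standard deviation σ·√dt. The function names, the grid discretization, and all parameter values are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def simulate_wiener_path(beta, sigma, horizon, n_steps, seed=None):
    """Simulate Y(t) = beta*t + sigma*B(t) on an even grid over [0, horizon].

    Hypothetical helper for illustration; returns (times, levels) with Y(0) = 0.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    times = [0.0]
    levels = [0.0]
    for i in range(1, n_steps + 1):
        # Independent Gaussian increment: mean beta*dt, std sigma*sqrt(dt).
        increment = rng.gauss(beta * dt, sigma * math.sqrt(dt))
        times.append(i * dt)
        levels.append(levels[-1] + increment)
    return times, levels


def first_crossing_time(times, levels, threshold):
    """Return the first grid time at which the path is at or above threshold.

    Returns None if the simulated path never reaches the threshold.
    """
    for t, y in zip(times, levels):
        if y >= threshold:
            return t
    return None


if __name__ == "__main__":
    # Illustrative (assumed) values: drift 1.0, diffusion 0.5, horizon 10.
    times, levels = simulate_wiener_path(1.0, 0.5, 10.0, 1000, seed=1)
    print("level at end of horizon:", levels[-1])
    print("first grid time at or above 5.0:", first_crossing_time(times, levels, 5.0))
```

Because the path is only observed on the grid, the reported crossing time is an approximation that can overshoot the true first passage time by up to one grid step.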

Burn-in cost

The expected burn-in cost is computed based on the “per-item-output” point of view, meaning that it calculates the amount of money a manufacturer needs to pay in order to obtain an accepted burnt-in unit (Liu and Mazzuchi, 2008). To obtain this cost, imagine a burn-in lot where units are sequentially subject to burn-in. The burn-in cost for each unit is c0b + cs + cm. Let M – 1 be the number of units scrapped until the first unit with L(b) < ξb is obtained. Then the expected cost until obtaining this
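The sequential argument above can be sketched numerically. If units are screened independently and each is accepted with the same probability, then M, the number of units tested until the first acceptance, is geometrically distributed with mean equal to the reciprocal of the acceptance probability. The sketch below multiplies that mean by a per-unit burn-in cost. It is an illustration under assumptions: independence between units is assumed, the function names and numerical values are hypothetical, and any further cost terms in the paper's full expression (which is cut off in this excerpt) are not represented.

```python
def expected_units_tested(accept_prob):
    """Mean of a geometric M: units tested until the first accepted one."""
    if not 0.0 < accept_prob <= 1.0:
        raise ValueError("accept_prob must lie in (0, 1]")
    return 1.0 / accept_prob


def expected_burn_in_cost(per_unit_cost, accept_prob):
    """Expected burn-in spending per accepted unit.

    Assumes every tested unit, scrapped or accepted, incurs per_unit_cost,
    and that no other cost terms apply (an assumption of this sketch).
    """
    return per_unit_cost * expected_units_tested(accept_prob)


if __name__ == "__main__":
    # Hypothetical numbers: per-unit burn-in cost 2.0, acceptance probability 0.8.
    print(expected_units_tested(0.8))
    print(expected_burn_in_cost(2.0, 0.8))
```

The sketch makes the trade-off visible: a stricter cut-off lowers the acceptance probability and therefore raises the burn-in spending needed per accepted unit.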

Illustrative example

For illustrative purposes, the example provided by Peng et al. (2009) is revisited with some modifications. Consider a MEMS device equipped with a micro-engine, whose major failure mechanism is degradation-threshold failure. Here, we assume that the population of MEMS devices consists of a majority of normal units as well as a small proportion of weak units. The proportions of normal and weak items are p1 = 0.95 and p2 = 0.05, respectively. Degradation of a normal MEMS
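To give a feel for how screening acts on such a mixed population, the sketch below uses the proportions p1 = 0.95 and p2 = 0.05 from the example and computes the probability that a unit passes burn-in, together with the share of weak units among those that pass. It assumes a simplified rule in which a unit is accepted if its degradation level at the end of a burn-in of duration b is below a cut-off, so that the level is normally distributed with mean βb and variance σ²b. The drift values, the common σ, the duration and the cut-off used here are hypothetical and are not the values of the paper.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pass_probability(beta, sigma, b, cutoff):
    """P(Y(b) < cutoff) when Y(b) ~ Normal(beta*b, sigma^2 * b)."""
    return normal_cdf((cutoff - beta * b) / (sigma * math.sqrt(b)))


def screen_mixture(p_normal, p_weak, beta_normal, beta_weak, sigma, b, cutoff):
    """Return (acceptance probability, weak share among accepted units)."""
    pass_normal = pass_probability(beta_normal, sigma, b, cutoff)
    pass_weak = pass_probability(beta_weak, sigma, b, cutoff)
    accept = p_normal * pass_normal + p_weak * pass_weak
    # Bayes' rule: fraction of accepted units that are weak.
    weak_after = p_weak * pass_weak / accept
    return accept, weak_after


if __name__ == "__main__":
    # p1 and p2 follow the example; all other values are assumed.
    accept, weak_after = screen_mixture(0.95, 0.05, 1.0, 3.0, 1.0, 4.0, 8.0)
    print("acceptance probability:", accept)
    print("weak share after burn-in:", weak_after)
```

Under this end-of-test rule, the weak share among accepted units falls below its initial value of 0.05 whenever weak units are less likely to pass than normal ones.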

Conclusions

In this study, two degradation-based burn-in maintenance models have been proposed. Optimal burn-in and maintenance settings were determined based on minimizing the average cost per unit time. The illustrative example showed the effectiveness of the degradation-based method compared with the traditional lifetime-based burn-in approach. These two joint burn-in and maintenance models are motivated by the infant mortality in some MEMS devices, but they have potential applications in many other

Acknowledgements

The authors thank the editor and two reviewers for their critical and constructive comments that have considerably helped in the revision of an earlier version of the paper. This work is partially supported by a grant from City University of Hong Kong (Project No. 9380058).

References (27)

  • T.J. Hogan et al. Burn-in test reduction for the digital micromirror device. Proceedings of SPIE (2003).

  • R. Jiang et al. An optimal burn-in preventive-replacement model associated with a mixture distribution. Quality and Reliability Engineering International (2007).

  • J.C. Lagarias et al. Convergence properties of the Nelder–Mead simplex method in low dimensions. SIAM Journal on Optimization (1999).