A new extended Birnbaum–Saunders regression model for lifetime modeling
Introduction
The two-parameter Birnbaum–Saunders (BS) distribution, also known as the fatigue life distribution, was introduced by Birnbaum and Saunders (1969) and has received considerable attention in recent years. It was originally derived from a model for a physical fatigue process where dominant crack growth causes failure. It was later derived by Desmond (1985) using a biological model which followed from relaxing some of the assumptions originally made by Birnbaum and Saunders (1969). The relationship between the BS distribution and the inverse Gaussian distribution was investigated by Desmond (1986) who demonstrated that the BS distribution is an equal-weight mixture of an inverse Gaussian distribution and its complementary reciprocal. For book treatments of inverse Gaussian and BS distributions and their relationships, see Marshall and Olkin (2007, Chapter 13) and especially Saunders (2007, Chapter 10). More recently, Jones (2012) also discussed the relationship between the BS and the inverse Gaussian distributions.
The cumulative distribution function of a random variable with BS distribution, say $T \sim \mathrm{BS}(\alpha, \beta)$, is $F(t) = \Phi(v)$, with $t > 0$, where $\Phi(\cdot)$ is the standard normal cumulative function, $v = \rho(t/\beta)/\alpha$, $\rho(z) = z^{1/2} - z^{-1/2}$, and $\alpha > 0$ and $\beta > 0$ are the shape and scale parameters, respectively. The shape of the hazard function of the BS distribution is discussed in Kundu et al. (2008). The authors showed that the hazard function is not monotone and is unimodal for all ranges of the parameter values. Some interesting results on improved statistical inference as well as interval estimation for the BS distribution can be found in Wu and Wong (2004), Lemonte et al. (2007, 2008) and Wang (2012). The BS distribution has been applied in a wide variety of fields. For applications of the BS distribution, see, for example, Balakrishnan et al. (2007) in reliability and Leiva et al. (2008, 2009) in other fields. It is worthwhile to mention that there has been a great deal of progress recently in developing statistical methodology for the BS model and its generalizations. Notable contributions include those of Professor Narayanaswamy Balakrishnan (http://www.math.mcmaster.ca/bala/bala.html) and co-workers, and Professor Victor Leiva (http://staff.deuv.cl/leiva/) and co-workers.
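The BS cumulative function and its hazard, as described in the paragraph above, can be evaluated numerically with a few lines of code. The following is a minimal sketch using only the Python standard library; the function names (`bs_cdf`, `bs_pdf`, `bs_hazard`) and the argument names `alpha` (shape) and `beta` (scale) are labels chosen for this illustration, not taken from the paper.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal cumulative function, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def std_normal_pdf(x: float) -> float:
    """Standard normal density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def bs_v(t: float, alpha: float, beta: float) -> float:
    """v = rho(t / beta) / alpha with rho(z) = sqrt(z) - 1 / sqrt(z)."""
    z = t / beta
    return (math.sqrt(z) - 1.0 / math.sqrt(z)) / alpha


def bs_cdf(t: float, alpha: float, beta: float) -> float:
    """BS cumulative function F(t) = Phi(v), for t > 0."""
    return std_normal_cdf(bs_v(t, alpha, beta))


def bs_pdf(t: float, alpha: float, beta: float) -> float:
    """BS density, obtained as phi(v) * dv/dt."""
    dv_dt = (t + beta) / (2.0 * alpha * math.sqrt(beta) * t ** 1.5)
    return std_normal_pdf(bs_v(t, alpha, beta)) * dv_dt


def bs_hazard(t: float, alpha: float, beta: float) -> float:
    """Hazard f(t) / (1 - F(t)); Phi(-v) is used for the survival function."""
    return bs_pdf(t, alpha, beta) / std_normal_cdf(-bs_v(t, alpha, beta))


if __name__ == "__main__":
    # Tabulate the CDF and hazard on a small grid for alpha = beta = 1.
    for t in (0.1, 0.5, 1.0, 5.0, 20.0):
        print(t, bs_cdf(t, 1.0, 1.0), bs_hazard(t, 1.0, 1.0))
```

Since $v = 0$ at $t = \beta$, the scale parameter is also the median of the distribution, which gives a quick sanity check on any implementation.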
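On the basis of the scheme proposed by Marshall and Olkin (1997), Lemonte (2013) introduced a quite flexible distribution which can be used to model failure times for materials subject to fatigue and lifetime data. The author called the new distribution the Marshall–Olkin extended Birnbaum–Saunders (MOEBS) distribution. Hereafter, the random variable $T$ is said to have a MOEBS distribution with shape parameters $\alpha > 0$ and $\eta > 0$, and scale parameter $\beta > 0$, say $T \sim \mathrm{MOEBS}(\alpha, \beta, \eta)$, if its cumulative function is given by
$$G(t) = \frac{\Phi(v)}{1 - \bar{\eta}\,\Phi(-v)}, \qquad t > 0, \tag{1}$$
where $\bar{\eta} = 1 - \eta$. The survival function is $\bar{G}(t) = \eta\,\Phi(-v)/[1 - \bar{\eta}\,\Phi(-v)]$, whereas the probability density function corresponding to (1) takes the form $g(t) = \eta\,\kappa(\alpha, \beta)\, t^{-3/2}(t + \beta) \exp\{-\tau(t/\beta)/(2\alpha^2)\}/[1 - \bar{\eta}\,\Phi(-v)]^2$, where $\kappa(\alpha, \beta) = \exp(\alpha^{-2})/(2\alpha\sqrt{2\pi\beta})$ and $\tau(z) = z + z^{-1}$. It can be shown that if $T \sim \mathrm{MOEBS}(\alpha, \beta, \eta)$, then $kT \sim \mathrm{MOEBS}(\alpha, k\beta, \eta)$, for $k > 0$, i.e. the class of MOEBS distributions is closed under scale transformations. The two-parameter BS distribution arises from (1) when $\eta = 1$, that is, $\mathrm{MOEBS}(\alpha, \beta, 1) = \mathrm{BS}(\alpha, \beta)$.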
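The Marshall and Olkin (1997) construction behind the MOEBS distribution can be illustrated numerically. The sketch below assumes the usual form of that scheme, in which a baseline survival function $\bar{F}$ is mapped to $\eta\bar{F}/[1 - (1 - \eta)\bar{F}]$, applied to the BS baseline; the names `moebs_cdf`, `moebs_sf`, `moebs_pdf` and the argument name `eta` for the extra shape parameter are labels chosen for this illustration.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal cumulative function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_v(t: float, alpha: float, beta: float) -> float:
    """v = (sqrt(t / beta) - sqrt(beta / t)) / alpha, the BS baseline argument."""
    z = t / beta
    return (math.sqrt(z) - 1.0 / math.sqrt(z)) / alpha


def _denominator(v: float, eta: float) -> float:
    """1 - (1 - eta) * Phi(-v), shared by the CDF, survival and density."""
    return 1.0 - (1.0 - eta) * std_normal_cdf(-v)


def moebs_cdf(t: float, alpha: float, beta: float, eta: float) -> float:
    """Marshall-Olkin extended BS cumulative function."""
    v = bs_v(t, alpha, beta)
    return std_normal_cdf(v) / _denominator(v, eta)


def moebs_sf(t: float, alpha: float, beta: float, eta: float) -> float:
    """Marshall-Olkin extended BS survival function."""
    v = bs_v(t, alpha, beta)
    return eta * std_normal_cdf(-v) / _denominator(v, eta)


def moebs_pdf(t: float, alpha: float, beta: float, eta: float) -> float:
    """Density: eta * (BS density) / denominator^2."""
    v = bs_v(t, alpha, beta)
    tau = t / beta + beta / t
    # exp(alpha^-2) * exp(-tau / (2 alpha^2)) is merged into one exponent
    # so that small alpha does not overflow.
    bs_density = (
        (t + beta)
        * math.exp(-(tau - 2.0) / (2.0 * alpha ** 2))
        / (2.0 * alpha * math.sqrt(2.0 * math.pi * beta) * t ** 1.5)
    )
    return eta * bs_density / _denominator(v, eta) ** 2


if __name__ == "__main__":
    # Scale closure: k * T has the same shape parameters and scale k * beta.
    print(moebs_cdf(1.3, 0.8, 2.0, 0.5), moebs_cdf(3 * 1.3, 0.8, 3 * 2.0, 0.5))
```

Setting `eta = 1` makes the denominator equal to one, so all three functions fall back to their BS counterparts.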
Rieck and Nedelman (1991) proposed a log-linear regression model based on the BS distribution. They showed that if $T \sim \mathrm{BS}(\alpha, \beta)$, then $Y = \log(T)$ is sinh-normal (SN) distributed with shape, location and scale parameters given by $\alpha$, $\mu = \log(\beta)$ and $\sigma = 2$, respectively; that is, the log-BS (LBS) distribution is a special case of the SN distribution introduced by them and, in this case, the notation $Y \sim \mathrm{SN}(\alpha, \mu, 2)$ is considered. The SN distribution is symmetrical, presents greater and smaller degrees of kurtosis than the normal model and can also be bimodal. Their regression model has received significant attention from many researchers over the last few years. For some recent references about the BS regression model, the reader is referred to Desmond et al. (2008), Xiao et al. (2010), Lemonte et al. (2010), Lemonte (2011), Lemonte and Ferrari (2011a, 2011b, 2011c), Qu and Xie (2011) and Li et al. (2012), among others.
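The link between the BS and sinh-normal distributions described above is easy to verify numerically. The sketch below assumes the standard sinh-normal form with cumulative function $\Phi\{(2/\alpha)\sinh((y - \mu)/\sigma)\}$; the function names are labels chosen for this illustration.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal cumulative function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def std_normal_pdf(x: float) -> float:
    """Standard normal density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def bs_cdf(t: float, alpha: float, beta: float) -> float:
    """BS cumulative function Phi((sqrt(t/beta) - sqrt(beta/t)) / alpha)."""
    z = t / beta
    return std_normal_cdf((math.sqrt(z) - 1.0 / math.sqrt(z)) / alpha)


def sn_cdf(y: float, alpha: float, mu: float, sigma: float = 2.0) -> float:
    """Sinh-normal cumulative function (assumed standard form)."""
    return std_normal_cdf((2.0 / alpha) * math.sinh((y - mu) / sigma))


def sn_pdf(y: float, alpha: float, mu: float, sigma: float = 2.0) -> float:
    """Sinh-normal density, the derivative of sn_cdf with respect to y."""
    w = (y - mu) / sigma
    xi2 = (2.0 / alpha) * math.sinh(w)
    return (2.0 / (alpha * sigma)) * math.cosh(w) * std_normal_pdf(xi2)


def sn_from_normal(z: float, alpha: float, mu: float, sigma: float = 2.0) -> float:
    """Map a standard normal value z to a sinh-normal value by inverting sn_cdf."""
    return mu + sigma * math.asinh(alpha * z / 2.0)


if __name__ == "__main__":
    alpha, beta, t = 0.8, 2.0, 1.3
    # P(T <= t) for T ~ BS equals P(Y <= log t) for Y ~ SN(alpha, log(beta), 2).
    print(bs_cdf(t, alpha, beta), sn_cdf(math.log(t), alpha, math.log(beta)))
```

With this form, the density has a single mode at $\mu$ for small $\alpha$ and a dip at $\mu$ once $\alpha$ exceeds 2, which is the bimodality mentioned in the text.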
Some generalizations of the log-linear BS regression model have been proposed in the statistical literature. For example, some efforts can be found in the works by Barros et al. (2008), Lemonte and Cordeiro (2009), Santana et al. (2011), Lemonte (2012), Desmond et al. (2012) and Villegas et al. (2011). Barros et al. (2008) introduced the generalized BS regression model based on the BS-$t$ distribution (that is, based on the BS Student-$t$ model with $\nu$ degrees of freedom), Lemonte and Cordeiro (2009) proposed a non-linear BS regression model, Santana et al. (2011) and Lemonte (2012) introduced the skewed BS regression model, whereas Villegas et al. (2011) and Desmond et al. (2012) studied a mixed log-linear model based on the BS distribution.
In this paper, in addition to the existing generalizations of the BS regression model, we propose the extended BS regression model based on the MOEBS distribution; that is, we introduce a new class of lifetime regression models in which the errors follow the log-MOEBS distribution. The main motivation for introducing this new class is that practitioners will have a new BS regression model to use in practical applications. Moreover, the formulas related to the new regression model are manageable and, with the use of modern computer resources and their numerical capabilities, the proposed model may prove to be a useful addition to the arsenal of applied statisticians. Additionally, the new model is quite flexible and can be widely applied in analyzing lifetime data. Further, we provide two applications to real data sets which show that the new regression model yields a better fit than the usual BS regression model. Furthermore, the new extended BS regression model can be used for modeling censored data as well as data without censoring. It should be mentioned that censoring is very common in lifetime data because of time limits and other restrictions on data collection. In an engineering life test experiment, for example, it is usually not feasible to continue experimentation until all items under study have failed. In a survival study, patients may be lost to follow-up, and data analysis is usually done before all patients have reached the event of interest. The partial information contained in a censored observation is just a lower bound on the lifetime. Reliability studies usually finish before all units have failed, even when accelerated tests are used. This is a special source of difficulty in the analysis of reliability data. Such data are said to be right-censored, and they arise when some units are still running at the time of the data analysis, were removed from the test before they failed, or failed from an extraneous cause. We refer the reader to Gijbels (2010) for a recent overview of censored data.
It is nowadays a widespread practice, after modeling, to check the model assumptions and conduct diagnostic studies in order to detect possible influential observations that may distort the results of the analysis. Diagnostic analysis is an efficient way to detect influential observations. The first technique developed to assess the individual impact of cases on the estimation process is, perhaps, case deletion, which became a very popular tool. Cook (1977) presents a major development of case deletion diagnostics for a general statistical model. Case deletion is an example of a global influence analysis, that is, the effect of an observation is assessed by completely removing it. However, case deletion excludes all information from an observation and we can hardly say whether this observation has some influence on a specific aspect of the model. To overcome this problem, one can resort to the local influence approach, in which one investigates the model sensitivity under small perturbations. In this context, Cook (1986) proposed a general framework to detect influential observations which gives a measure of this sensitivity under small perturbations of the data or of the model. Many applications of the local influence method may be found in the statistical literature for various models and under different perturbation schemes; see, for instance, Espinheira et al. (2008), Vasconcellos and Fernandez (2009), Patriota et al. (2010), Lemonte and Patriota (2011), Zevallos et al. (2012) and Matos et al. (2013), among others. In this paper, we also propose a similar methodology to detect influential subjects in the new extended BS regression model. In particular, we obtain explicit formulas for Cook’s (1986) normal curvature measure under three perturbation schemes.
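For orientation, the normal curvature of Cook (1986) mentioned above is usually written as follows. This is the generic form of the measure, not the paper's explicit expressions for the extended BS regression model; the symbols $\boldsymbol{\theta}$ (parameter vector), $\boldsymbol{\omega}$ (perturbation vector), $\boldsymbol{\omega}_0$ (no-perturbation point) and $\ell(\boldsymbol{\theta} \mid \boldsymbol{\omega})$ (perturbed log-likelihood) are notation assumed for this sketch.

```latex
% Normal curvature of the likelihood displacement in the unit direction d.
C_{\mathbf{d}}(\boldsymbol{\theta})
  = 2\,\bigl|\mathbf{d}^{\top}\boldsymbol{\Delta}^{\top}
      \ddot{\mathbf{L}}_{\theta\theta}^{-1}\boldsymbol{\Delta}\,\mathbf{d}\bigr|,
  \qquad \|\mathbf{d}\| = 1,
% where both matrices are evaluated at the maximum likelihood estimate
% and at the no-perturbation point:
\qquad
\ddot{\mathbf{L}}_{\theta\theta}
  = \frac{\partial^{2}\ell(\boldsymbol{\theta})}
         {\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^{\top}}
    \bigg|_{\boldsymbol{\theta}=\hat{\boldsymbol{\theta}}},
\qquad
\boldsymbol{\Delta}
  = \frac{\partial^{2}\ell(\boldsymbol{\theta}\mid\boldsymbol{\omega})}
         {\partial\boldsymbol{\theta}\,\partial\boldsymbol{\omega}^{\top}}
    \bigg|_{\boldsymbol{\theta}=\hat{\boldsymbol{\theta}},\;
            \boldsymbol{\omega}=\boldsymbol{\omega}_{0}}.
```

The direction of largest curvature is the eigenvector associated with the largest absolute eigenvalue of $-\boldsymbol{\Delta}^{\top}\ddot{\mathbf{L}}_{\theta\theta}^{-1}\boldsymbol{\Delta}$, and an index plot of its components is the usual way to flag observations that are influential under a given perturbation scheme.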
The paper unfolds as follows. The log-MOEBS distribution is proposed in Section 2. In Section 3, we introduce the extended BS regression model and discuss estimation of the model parameters. Specifically, we compute the maximum likelihood estimating equations by assuming random censoring. In Section 4, the normal curvatures of local influence are derived under various perturbation schemes and a kind of deviance residual is proposed to assess departures from the underlying log-MOEBS distribution as well as to detect outlying observations. In Section 5, we propose a likelihood ratio statistic for testing the homogeneity of the shape parameters. Two real data illustrations are considered in Section 6. The paper ends with some concluding remarks in Section 7.
Section snippets
The log-MOEBS distribution
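Let $T$ be a random variable having the MOEBS cumulative function (1). The random variable $Y = \log(T)$ has a log-MOEBS (LMOEBS) distribution. After some algebra, the survival function, the cumulative function and the density function of $Y$, parameterized in terms of $\mu = \log(\beta)$, can be expressed, respectively, as
$$S(y) = \frac{\eta\,\Phi(-\xi_{y2})}{1 - \bar{\eta}\,\Phi(-\xi_{y2})}, \qquad F(y) = \frac{\Phi(\xi_{y2})}{1 - \bar{\eta}\,\Phi(-\xi_{y2})}, \qquad f(y) = \frac{\eta\,\xi_{y1}\,\phi(\xi_{y2})}{2\,[1 - \bar{\eta}\,\Phi(-\xi_{y2})]^{2}}, \qquad y \in \mathbb{R},$$
where $\phi(\cdot)$ is the standard normal density function, $\bar{\eta} = 1 - \eta$ as in (1), $\xi_{y1} = (2/\alpha)\cosh((y - \mu)/2)$ and $\xi_{y2} = (2/\alpha)\sinh((y - \mu)/2)$. Evidently, the density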
The model and estimation
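The extended BS regression model (that is, the LMOEBS regression model) is defined in the form
$$y_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + \varepsilon_i, \qquad i = 1, \ldots, n, \tag{4}$$
where $y_i$ is the observed log-lifetime or log-censoring time for the $i$th individual, $\mathbf{x}_i = (x_{i1}, \ldots, x_{ip})^{\top}$ is a vector of known explanatory variables associated with $y_i$, $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_p)^{\top}$ is a $p$-vector (where $p < n$ and it is fixed) of unknown regression parameters to be estimated and $\varepsilon_i \sim \mathrm{LMOEBS}(\alpha, 0, \eta)$. It is also assumed that the random variables $\varepsilon_i$ are independent and identically distributed.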
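To make the estimation problem concrete, the following is a minimal sketch of a right-censored log-likelihood for a log-linear model of this kind. It rests on several assumptions that are not spelled out in the text above: the errors follow the log-transformed Marshall-Olkin extended BS law with location zero, censoring is non-informative, and each observation contributes $\log f$ if it is a failure and $\log S$ if it is censored. The names `lmoebs_loglik`, `coef`, `alpha`, `eta` and `delta` are hypothetical labels chosen for this illustration.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal cumulative function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def log_std_normal_pdf(x: float) -> float:
    """Logarithm of the standard normal density."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def lmoebs_loglik(y, X, delta, coef, alpha, eta):
    """Right-censored log-likelihood (assumed form).

    y     : log-lifetimes or log-censoring times
    X     : rows of explanatory variables, one per observation
    delta : 1 for an observed failure, 0 for a right-censored value
    coef  : regression coefficients; alpha, eta : positive shape parameters
    """
    total = 0.0
    for yi, row, di in zip(y, X, delta):
        mu_i = sum(b * x for b, x in zip(coef, row))  # linear predictor
        w = (yi - mu_i) / 2.0
        xi1 = (2.0 / alpha) * math.cosh(w)
        xi2 = (2.0 / alpha) * math.sinh(w)
        log_denom = math.log(1.0 - (1.0 - eta) * std_normal_cdf(-xi2))
        if di == 1:
            # log density: log(eta * xi1 * phi(xi2) / (2 * denom^2))
            total += (math.log(eta) + math.log(xi1) + log_std_normal_pdf(xi2)
                      - math.log(2.0) - 2.0 * log_denom)
        else:
            # log survival: log(eta * Phi(-xi2) / denom)
            total += math.log(eta) + math.log(std_normal_cdf(-xi2)) - log_denom
    return total


if __name__ == "__main__":
    y = [1.2, 0.4, 2.1]
    X = [[1.0, 0.5], [1.0, -0.3], [1.0, 1.1]]
    delta = [1, 0, 1]
    print(lmoebs_loglik(y, X, delta, coef=[0.8, 0.6], alpha=0.9, eta=1.5))
```

In practice this function would be passed, with its sign reversed, to a quasi-Newton optimizer, with `alpha` and `eta` handled on the log scale to keep them positive.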
Diagnostic analysis
Since regression models are sensitive to the underlying model assumptions, performing a sensitivity analysis is strongly advisable. To assess the sensitivity of the maximum likelihood estimates of the parameters of the regression model (4), the local influence method is carried out under three perturbation schemes. To assess departures from the underlying LMOEBS distribution as well as to detect outlying observations, a kind of deviance residual is considered.
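The text does not give the form of the residual here. One common choice in censored-data regression, assumed in the sketch below, is the martingale-type deviance residual: the martingale residual is $r_M = \delta + \log \hat{S}(y)$, with $\delta$ the failure indicator and $\hat{S}$ the fitted survival function, and the deviance residual is $r_D = \mathrm{sign}(r_M)\{-2[r_M + \delta\log(\delta - r_M)]\}^{1/2}$. The function names are hypothetical, and the fitted survival value is passed in directly so that the sketch does not depend on a particular model.

```python
import math


def martingale_residual(delta: int, surv: float) -> float:
    """r_M = delta + log S(y), with delta = 1 for a failure, 0 if censored."""
    if not 0.0 < surv < 1.0:
        raise ValueError("fitted survival value must lie strictly in (0, 1)")
    return delta + math.log(surv)


def deviance_residual(delta: int, surv: float) -> float:
    """Martingale-type deviance residual (assumed form, see lead-in)."""
    r_m = martingale_residual(delta, surv)
    inner = r_m
    if delta == 1:
        # delta - r_M equals -log S(y), which is positive for S(y) in (0, 1).
        inner += math.log(delta - r_m)
    # max(...) guards against tiny negative values caused by rounding.
    return math.copysign(math.sqrt(max(-2.0 * inner, 0.0)), r_m)


if __name__ == "__main__":
    for delta, surv in ((1, 0.9), (1, 0.2), (0, 0.5)):
        print(delta, surv, deviance_residual(delta, surv))
```

Censored observations always receive a non-positive residual under this definition, while failures that occur much earlier than the fitted model expects receive large positive ones.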
Testing the homogeneity of the shape parameters
In the extended BS regression model introduced in Section 3, the homogeneity of the shape parameters $\alpha$ and $\eta$ is a standard assumption. This assumption, however, is not necessarily appropriate, because the actual shape parameters of the response variable may be related to the $i$th observation. In this case, the inference would be much more difficult to deal with. Hence, this assumption usually needs to be checked. In this section, we consider an LR test statistic to verify the homogeneity of the
Real data illustrations
In this section, we use two real data sets to show the flexibility and applicability of the extended BS regression model in practice. We will consider real data with and without censoring. All the computations presented in this section were done using the Ox matrix programming language (Doornik, 2009), which is freely distributed for academic purposes and available at http://www.doornik.com. The Broyden–Fletcher–Goldfarb–Shanno (BFGS) method with analytical derivatives through the subroutine
Concluding remarks
The BS distribution has many attractive properties and has found several applications in the literature including lifetime, survival and environmental data analysis. It has received significant attention over the last few years and some generalizations and extensions of this distribution have been proposed by many researchers. Based on the BS distribution, Rieck and Nedelman (1991) introduced the BS regression model, which has been studied by several authors. Their regression model is becoming
Acknowledgments
The author thanks the associate editor and three anonymous referees for useful suggestions and comments that aided in improving the first version of the manuscript. The author gratefully acknowledges grants from FAPESP (Brazil).
References (56)
- et al. Point and interval estimation for extreme-value regression model under Type-II censoring. Computational Statistics and Data Analysis (2008)
- et al. A mixed effects log-linear model based on the Birnbaum–Saunders distribution. Computational Statistics and Data Analysis (2012)
- et al. Influence diagnostics in beta regression. Computational Statistics and Data Analysis (2008)
- Relationships between distributions with certain symmetries. Statistics and Probability Letters (2012)
- et al. On the hazard function of Birnbaum–Saunders distribution and associated inference. Computational Statistics and Data Analysis (2008)
- et al. Influence diagnostics in log-Birnbaum–Saunders regression models with censored data. Computational Statistics and Data Analysis (2007)
- et al. Birnbaum–Saunders nonlinear regression models. Computational Statistics and Data Analysis (2009)
- et al. Improved statistical inference for the two-parameter Birnbaum–Saunders distribution. Computational Statistics and Data Analysis (2007)
- et al. Size and power properties of some tests in the Birnbaum–Saunders regression model. Computational Statistics and Data Analysis (2011)
- et al. Signed likelihood ratio tests in the Birnbaum–Saunders regression model. Journal of Statistical Planning and Inference (2011)
- Improved likelihood inference in Birnbaum–Saunders regressions. Computational Statistics and Data Analysis
- Diagnostic analysis for heterogeneous log-Birnbaum–Saunders regression models. Statistics and Probability Letters
- Diagnostics for a class of survival regression models with heavy-tailed errors. Computational Statistics and Data Analysis
- Influence diagnostics in linear and nonlinear mixed-effects models with censored data. Computational Statistics and Data Analysis
- Influence diagnostics in a multivariate normal regression model with general parameterization. Statistical Methodology
- Influence analysis with homogeneous linear restrictions. Computational Statistics and Data Analysis
- Generalized interval estimation for the Birnbaum–Saunders distribution. Computational Statistics and Data Analysis
- Improved interval estimation for the two-parameter Birnbaum–Saunders distribution. Computational Statistics and Data Analysis
- Estimation of the Birnbaum–Saunders regression model with current status data. Computational Statistics and Data Analysis
- Diagnostics analysis for log-Birnbaum–Saunders regression models. Computational Statistics and Data Analysis
- A note on influence diagnostics in AR(1) time series models. Journal of Statistical Planning and Inference
- Two graphical displays for outlying and influential observations in regression. Biometrika
- Acceptance sampling plans from truncated life tests from generalized Birnbaum–Saunders distribution. Communications in Statistics - Simulation and Computation
- Residuals for relative risk regression. Biometrika
- A new class of survival regression models with heavy-tailed errors: robustness and diagnostics. Lifetime Data Analysis
- A new family of life distributions. Journal of Applied Probability
- Detection of influential observation in linear regression. Technometrics
- Assessment of local influence. Journal of the Royal Statistical Society B