On estimation and influence diagnostics for the Grubbs’ model under heavy-tailed distributions

https://doi.org/10.1016/j.csda.2008.10.034

Abstract

The Grubbs’ measurement model is frequently used to compare several measuring devices. It is common to assume that the random terms have a normal distribution. However, such an assumption makes the inference vulnerable to outlying observations, whereas scale mixtures of normal distributions are an attractive alternative that produces robust estimates while keeping the elegance and simplicity of maximum likelihood theory. The aim of this paper is to develop an EM-type algorithm for the parameter estimation, and to use the local influence method to assess the robustness of these parameter estimates under some usual perturbation schemes. In order to identify outliers and to criticize the model building, we use the local influence procedure in a study comparing the precision of several thermocouples.

Introduction

The problem of comparing the precision and accuracy of different measuring instruments arises in various scientific applications, such as engineering (Grubbs, 1948, 1973) and medicine (Barnett, 1969). The usual way to compare the instruments is to take measurements of the same unknown characteristic x on different individuals or experimental units. The instruments may differ in aspects such as cost, speed and convenience. The relative quality of the measurements is evaluated by considering the precision and bias of the different instruments.

The assessment of the robustness of parameter estimates in statistical models has been an important concern of various researchers in the last decades. The deletion methodology, which consists of studying the impact on the parameter estimates of dropping individual observations, is probably the most widely employed technique for detecting influential observations (see, for example, Cook and Weisberg (1982) and Chatterjee and Hadi (1988)). Nevertheless, the local influence procedure (Cook, 1986), which investigates the influence of small perturbations of the model/data on the parameter estimates, has received increasing attention in the last 20 years, mainly due to its flexibility in constructing different kinds of graphics and its applicability to various statistical models (see the discussion in Cook (1997)). In particular, Galea et al. (2002) and Lachos et al. (2007) applied the methodology to normal comparative calibration and Grubbs’ models, noting under some usual perturbation schemes the well-known lack of robustness of the least-squares estimates against outlying observations.

Several methodologies have been proposed to attenuate the influence of outlying observations on the parameter estimates under normality, such as modifications of the least-squares methodology (see, for instance, Huber (1981)). Other approaches assume heavy-tailed error distributions, for which the maximum likelihood estimates appear to be robust against extreme observations (see, for example, Galea et al. (2005)). In this paper, we assume scale mixtures of normal distributions (Andrews and Mallows, 1974) to accommodate extreme and outlying observations in the Grubbs’ model, and we consider the hierarchical representation proposed by Pinheiro et al. (2001). Properties of distributions in this class, such as the Student-t, power exponential and contaminated normal, may be found in Andrews and Mallows (1974) and Lange and Sinsheimer (1993). Our aim is to apply the local influence method to the Grubbs’ model under heavy-tailed distributions in order to assess the influence of minor perturbations of the model/data; our results generalize those obtained by Lachos et al. (2007). The rest of the paper is organized as follows. In Section 2 some inferential aspects are discussed and an EM-type algorithm is developed for the parameter estimation. Section 3 introduces the local influence methodology (Cook, 1986; Zhu and Lee, 2001). The normal curvature for some usual perturbation schemes is derived in Section 4. The methodology is illustrated in Section 5, in which Grubbs’ models under normal and scale mixtures of normal distributions are compared according to the robustness of the maximum likelihood estimates. Finally, some concluding remarks are given in Section 6.
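To illustrate why heavy-tailed models tend to resist outliers, the following minimal sketch computes the standard conditional weights of a multivariate Student-t model, E[V_i | Y_i] = (ν + p)/(ν + d_i²), where d_i² is the squared Mahalanobis distance. The function name `student_t_weights` and the toy data are assumptions made for illustration; this excerpt does not show the exact form of the paper's EM-type algorithm, so the sketch should not be read as the authors' implementation.

```python
import numpy as np

def student_t_weights(Y, mu, Sigma, nu):
    """Standard Student-t weights (nu + p) / (nu + d_i^2), one per row of Y."""
    Y = np.asarray(Y, dtype=float)
    p = Y.shape[1]
    resid = Y - mu
    # Squared Mahalanobis distances d_i^2 = (Y_i - mu)^T Sigma^{-1} (Y_i - mu)
    d2 = np.einsum("ij,ij->i", resid @ np.linalg.inv(Sigma), resid)
    return (nu + p) / (nu + d2)

# Hypothetical data: three typical rows and one gross outlier (last row)
Y = np.array([[0.1, -0.2], [0.0, 0.3], [-0.1, 0.1], [8.0, -7.0]])
w = student_t_weights(Y, mu=np.zeros(2), Sigma=np.eye(2), nu=4.0)
print(w)  # the outlying row receives by far the smallest weight
```

In a weighted update of the location and scale parameters, rows with small weights contribute less, which is the mechanism behind the robustness discussed above. Under normality every weight equals one.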

Section snippets

Model description

Grubbs (1948, 1973, 1983) proposes a linear model for comparing p different instruments, in which the characteristic $x_i$ of the $i$th experimental unit is measured once by each of the p instruments. The model assumes the form

$Y_{ij} = \alpha_j + x_i + \epsilon_{ij}$, $i = 1, \ldots, M$ and $j = 1, \ldots, p$, where $Y_{ij}$ denotes the measurement of the $j$th instrument for the $i$th experimental unit and $\alpha_j$ is called additive bias. The measurement errors $\epsilon_{ij}$ are assumed to be independent of the random variables $x_1, \ldots, x_M$. In addition, one has that

Local influence

The aim of local influence (Cook, 1986) is to investigate the behavior of some influence measure $T(\omega)$ when small perturbations are made into the model/data, where $\omega$ is a q-dimensional vector of perturbations restricted to some open subset $\Omega \subset \mathbb{R}^q$. In this work we assess local influence by using an appropriate measure based on the complete log-likelihood function and particularly recommended for incomplete data.

Let L(θ,ω|Y) and L(θ,ω|Yc) be the perturbed log-likelihood functions for observed and

Curvature derivation

In this section we will derive the normal curvature for the Grubbs’ model by considering the hierarchical formulation given in (5). We will compute $\ddot{Q}(\theta) = \partial^2 Q(\theta \mid \hat{\theta}) / \partial\theta\,\partial\theta^T$ and $\Delta = \partial^2 Q(\theta, \omega \mid \hat{\theta}) / \partial\theta\,\partial\omega^T$ by using results of matrix differentiation described in Magnus and Neudecker (1988). Details on the differential calculations for the matrices $\ddot{Q}$ and $\Delta$ under different perturbation schemes are given in Appendix B.
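Once the two matrices are available, the curvature computation is short. The sketch below assumes the form used in the Zhu and Lee (2001) approach, in which the normal curvature in direction d is 2 dᵀ Δᵀ (−Q̈)⁻¹ Δ d, so that the most influential direction is the leading eigenvector of Δᵀ (−Q̈)⁻¹ Δ. The function name `local_influence` and the toy matrices are hypothetical; the expressions for Q̈ and Δ specific to the Grubbs’ model are those of Appendix B and are not reproduced here.

```python
import numpy as np

def local_influence(Q_ddot, Delta):
    """Return (maximum normal curvature, unit direction attaining it).

    Q_ddot : (k, k) Hessian of Q with respect to theta (negative definite).
    Delta  : (k, q) mixed derivative of Q with respect to theta and omega.
    """
    Q_ddot = np.asarray(Q_ddot, dtype=float)
    Delta = np.asarray(Delta, dtype=float)
    # B = Delta^T (-Q_ddot)^{-1} Delta is a symmetric q x q matrix
    B = Delta.T @ np.linalg.solve(-Q_ddot, Delta)
    B = (B + B.T) / 2.0  # remove round-off asymmetry before eigh
    eigvals, eigvecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    return 2.0 * eigvals[-1], eigvecs[:, -1]

# Hypothetical toy input: k = 2 parameters, q = 3 perturbations
Q_ddot = -np.eye(2)
Delta = np.array([[1.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0]])
c_max, d_max = local_influence(Q_ddot, Delta)
```

Plotting the absolute components of `d_max` against the observation index is the usual way to flag cases whose perturbation changes the estimates the most. The sign of `d_max` is arbitrary.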

Application

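We consider the Grubbs’ model given in (2) with the following hierarchical formulation: $Y_i \mid z_i \overset{\text{ind}}{\sim} SMN_5(\mu + \mathbf{1} z_i, D(\phi); H)$ and $z_i \overset{\text{ind}}{\sim} SMN(0, \phi_x; H)$, $i = 1, \ldots, 64$, where $Y_i = (Y_{i1}, \ldots, Y_{i5})^T$, $D(\phi) = \mathrm{diag}(\phi_1, \ldots, \phi_5)$ and $H$ denotes the distribution function of the mixture variable $V_i$, $i = 1, \ldots, 64$.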
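A data set of this shape can be simulated as a rough check on one's reading of the hierarchy. The sketch below is one possible reading for the Student-t case and rests on assumptions not stated in this excerpt: a single mixing variable V_i ~ Gamma(ν/2, ν/2) is shared by z_i and Y_i, and it scales both variances by 1/V_i. The function name `simulate_grubbs_t` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_grubbs_t(M, mu, phi, phi_x, nu, seed=0):
    """Simulate an (M, p) array from a Student-t Grubbs-type hierarchy.

    Assumed hierarchy (an illustration, not the paper's exact specification):
      V_i ~ Gamma(nu/2, rate=nu/2)
      z_i | V_i ~ N(0, phi_x / V_i)
      Y_i | z_i, V_i ~ N_p(mu + 1 z_i, D(phi) / V_i)
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    phi = np.asarray(phi, dtype=float)
    p = mu.size
    # NumPy parameterizes the Gamma by scale = 1 / rate
    v = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=M)
    z = rng.normal(0.0, np.sqrt(phi_x / v))                     # shape (M,)
    eps = rng.normal(size=(M, p)) * np.sqrt(phi / v[:, None])   # shape (M, p)
    return mu + z[:, None] + eps

# Hypothetical parameter values for 64 units and 5 instruments
Y = simulate_grubbs_t(M=64, mu=[0.0, 0.1, -0.1, 0.2, 0.0],
                      phi=[1.0, 0.5, 2.0, 1.5, 1.0], phi_x=4.0, nu=4.0, seed=1)
```

Letting ν grow large makes V_i concentrate at one, which recovers the normal Grubbs’ model as a limiting case.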

In our analysis we suppose that the mixture variable $V_i$ follows a Gamma, Beta, discrete or point-mass distribution, so that the marginal response $Y_i$ follows a Student-t, slash, contaminated normal or normal distribution, respectively,

Concluding remarks

In this work, we have discussed parameter estimation in the Grubbs’ model under a class of distributions with heavier tails than the normal distribution. Through a local influence study, some aspects of the robustness of the maximum likelihood estimators under scale mixtures of normal distributions were noted. Explicit expressions are obtained for the matrix $\Delta$ under the perturbation schemes considered. Other perturbation schemes can, however, be treated in an analogous way.

Acknowledgements

This work was partially supported by the FONDECYT grants 11075071 and 1070919, Chile and CNPq and FAPESP, Brazil. The authors thank the Associate Editor and a referee for valuable suggestions.

References (40)

  • S.Y. Lee et al. Influence analysis of nonlinear mixed-effects models. Computational Statistics & Data Analysis (2004)
  • F. Osorio et al. Assessment of local influence in elliptical linear models with longitudinal structure. Computational Statistics & Data Analysis (2007)
  • J.Y. Shyr et al. Inference about comparative precision in linear structural relationships. Journal of Statistical Planning and Inference (1986)
  • D.F. Andrews et al. Scale mixtures of normal distributions. Journal of the Royal Statistical Society, Series B (1974)
  • V.D. Barnett. Simultaneous pairwise linear structural relationships. Biometrics (1969)
  • E.J. Bedrick. An efficient scores test for comparing several measuring devices. Journal of Quality Technology (2001)
  • H. Bolfarine et al. Structural comparative calibration using the EM algorithm. Journal of Applied Statistics (1995)
  • H. Bolfarine et al. One structural comparative calibration under a t-model. Computational Statistics (1996)
  • S. Chatterjee et al. Sensitivity Analysis in Linear Regression (1988)
  • R. Christensen et al. Test for precision and accuracy of multiple measuring devices. Technometrics (1993)
  • R.D. Cook. Assessment of local influence (with discussion). Journal of the Royal Statistical Society, Series B (1986)
  • R.D. Cook
  • R.D. Cook et al. Residuals and Influence in Regression (1982)
  • A.P. Dempster et al. Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B (1977)
  • K.T. Fang et al. Symmetric Multivariate and Related Distributions (1990)
  • C. Fernández et al. Multivariate Student-t regression models: Pitfalls and inference. Biometrika (1999)
  • Galea, M., 1995. Calibração Comparativa Estrutural e Funcional. Unpublished Ph.D. Dissertation. (Dept. of Statistics,...
  • M. Galea et al. Local influence in comparative calibration models. Biometrical Journal (2002)
  • M. Galea et al. Local influence in comparative calibration models under elliptical t-distributions. Biometrical Journal (2005)
  • F.E. Grubbs. On estimating precision of measuring instruments and product variability. Journal of the American Statistical Association (1948)