Parametric versus nonparametric tolerance regions in detection problems

  • Original Paper
  • Journal: Computational Statistics

Abstract

A major problem in statistical quality control is to detect a change in the distribution of independent sequentially observed random vectors. The case of a Gaussian pre-change distribution has been extensively analyzed. Here we are concerned with the non-normal multivariate case. In this setup it is natural to use tolerance regions as detection tools. These regions are defined in terms of density level sets, which can be estimated in a plug-in fashion. Under a normal mixture model we compare, through a simulation study, the performance of such a detection scheme for two density estimators: a (parametric) normal mixture and a (nonparametric) kernel estimator. The problem of the bandwidth choice for the latter is addressed. We also obtain a result concerning the convergence rates of the error probabilities under a general parametric model. Finally, a real data example is discussed.
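The plug-in detection scheme described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the two-component mixture used to generate data, the Scott-type rule-of-thumb bandwidth (a simple stand-in for the plug-in bandwidth selectors the paper discusses), and the choice of the empirical α-quantile of the estimated density over the training sample as the level of the tolerance region.

```python
import math
import random


def sample_mixture(n, rng):
    """Draw n points from an equal-weight mixture of N((0,0), I) and N((4,4), I)."""
    centres = [(0.0, 0.0), (4.0, 4.0)]
    points = []
    for _ in range(n):
        cx, cy = rng.choice(centres)
        points.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
    return points


def rule_of_thumb_bandwidth(sample):
    """Scott-type bandwidth s * n^(-1/(d+4)); a stand-in, not a plug-in selector."""
    n, d = len(sample), len(sample[0])
    spreads = []
    for j in range(d):
        column = [x[j] for x in sample]
        mean = sum(column) / n
        spreads.append(math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1)))
    return (sum(spreads) / d) * n ** (-1.0 / (d + 4))


def kernel_density(sample, h):
    """Return a Gaussian kernel density estimate with a single scalar bandwidth h."""
    n, d = len(sample), len(sample[0])
    norm = n * (h * math.sqrt(2.0 * math.pi)) ** d

    def f_hat(x):
        total = 0.0
        for xi in sample:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
            total += math.exp(-0.5 * sq_dist / h ** 2)
        return total / norm

    return f_hat


def plug_in_threshold(f_hat, sample, alpha):
    """Empirical alpha-quantile of f_hat over the sample (assumed level choice)."""
    values = sorted(f_hat(x) for x in sample)
    return values[int(math.floor(alpha * len(values)))]


def out_of_control(f_hat, threshold, x):
    """Signal an alarm when x falls outside the estimated level set."""
    return f_hat(x) < threshold


rng = random.Random(0)
training = sample_mixture(200, rng)
f_hat = kernel_density(training, rule_of_thumb_bandwidth(training))
c_n = plug_in_threshold(f_hat, training, alpha=0.05)

print(out_of_control(f_hat, c_n, (0.0, 0.0)))    # point at a mixture centre
print(out_of_control(f_hat, c_n, (50.0, 50.0)))  # point far from both components
```

A parametric variant would replace `kernel_density` with a fitted normal mixture density and keep the thresholding and alarm steps unchanged, which is the comparison the simulation study is concerned with.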

Author information

Corresponding author

Correspondence to Amparo Baíllo.

Additional information

Research partially supported by Spanish grant MTM2004-00098.

Appendix: Proof of the theorem

  1. (a)

    Using condition (A2), by the mean value theorem there exists a constant C > 0 such that

    $$\left|f_{\hat{\theta}_{n}}(x)-f_{\theta}(x)\right| \leq C\left\|\hat{\theta}_{n}-\theta\right\|_{1}$$
    (8)

    uniformly in x. If there exists a set A with positive probability on which convergence of \(c_{n}\) to \(c\) does not hold, then there exist a δ > 0 and a subsequence \(c_{n_{j}}\) such that \(\left|c_{n_{j}}-c\right|>\delta\) for all j.

    If \(c_{n_{j}}>c+\delta\) then, by (8),

    $$\begin{aligned} 1-\alpha &\leq \int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c+\delta\right\}} f_{\hat{\theta}_{n}} \leq \int\limits_{\left\{f_{\theta} \geq c+\delta-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\hat{\theta}_{n}} \\ &\leq \int\limits_{\left\{f_{\theta} \geq c+\delta-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\theta}+C\left\|\hat{\theta}_{n}-\theta\right\|_{1} \operatorname{Leb}\left\{f_{\theta} \geq c+\delta-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}. \end{aligned}$$

    We can take \(n_{j}\) sufficiently large so that \(C\left\|\hat{\theta}_{n}-\theta\right\|_{1}<\delta / 2\). Then on A

    $$1-\alpha \leq \int\limits_{\left\{f_{\theta} \geq c+\delta / 2\right\}} f_{\theta}+C\left\|\hat{\theta}_{n}-\theta\right\|_{1} \operatorname{Leb}\left\{f_{\theta} \geq c+\delta / 2\right\},$$
    (9)

    which contradicts the definition of c. If we had assumed that \(c_{n_{j}}<c-\delta\) we would have reached a similar contradiction.

  2. (b)

    Observe that by (8)

    $$\left|P_{n}-\alpha\right| \leq \int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\}}\left|f_{\hat{\theta}_{n}}-f_{\theta}\right| \leq C\left\|\hat{\theta}_{n}-\theta\right\|_{1} \operatorname{Leb}\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\}.$$

    Then the conclusion in (b) follows since, for n large, \(\operatorname{Leb}\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\}\) is bounded almost surely. To see this notice that, with probability one, eventually

    $$\operatorname{Leb}\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\} \leq \frac{2}{c} c_{n} \operatorname{Leb}\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\} \leq \frac{2}{c} \int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\}} f_{\hat{\theta}_{n}}=\frac{2}{c}(1-\alpha).$$
  3. (c)

    Let M > 0 be a sufficiently large constant. By (8) and (A1) if γ > 1 then, with probability one and for n large enough,

    $$\begin{aligned} \int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1 / \gamma}\right\}} f_{\hat{\theta}_{n}} &\leq \int\limits_{\left\{f_{\theta} \geq c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1 / \gamma}-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\theta} \\ &\quad+C \operatorname{Leb}\left\{f_{\theta} \geq c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1 / \gamma}-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}\left\|\hat{\theta}_{n}-\theta\right\|_{1} \\ &\leq 1-\alpha-\int\limits_{\left\{c<f_{\theta}<c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1 / \gamma}-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\theta}+C_{1}\left\|\hat{\theta}_{n}-\theta\right\|_{1} \\ &\leq 1-\alpha-\left\|\hat{\theta}_{n}-\theta\right\|_{1}\left[C_{2}(M-1)^{\gamma}-C_{1}\right]<1-\alpha, \end{aligned}$$

    where \(C_{1}, C_{2}>0\) are constants. This implies that \(c_{n}<c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1 / \gamma}\). In a similar way it is easy to check that, for large \(n\), \(c_{n}>c-M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1 / \gamma}\). Analogously, if γ ≤ 1, with probability one we have that

    $$\int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\hat{\theta}_{n}} \leq 1-\alpha-C_{3}\left[(M-C)\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right]^{\gamma}+C_{4}\left\|\hat{\theta}_{n}-\theta\right\|_{1}<1-\alpha$$

    and

    $$\int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c-M\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}}f_{\hat{\theta}_{n}} \geq 1-\alpha+C_{5}\left[(M-C)\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right]^{\gamma}-C_{6}\left\|\hat{\theta}_{n}-\theta\right\|_{1}>1-\alpha,$$

    where the \(C_{i}\)'s are positive constants. This proves (5). Regarding (6), using (A1) we have that with probability one, for n sufficiently large,

    $$\begin{aligned} \int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\} \triangle\left\{f_{\theta} \geq c\right\}} f_{\theta} &\leq \int\limits_{\left\{c_{n}+C\left\|\hat{\theta}_{n}-\theta\right\|_{1} \geq f_{\theta} \geq c_{n}-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\theta} \\ &=O\left(\max \left(\left|c_{n}-c\right|,\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right)\right)^{\gamma} \end{aligned}$$

    which completes the proof of the statement in (c).
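The bound in (b) can be checked numerically in the simplest parametric setting. The sketch below assumes a univariate normal model with θ = (μ, σ), so that the plug-in region \(\{f_{\hat{\theta}_{n}} \geq c_{n}\}\) is the interval \(\hat{\mu} \pm z_{1-\alpha/2}\hat{\sigma}\), and it reads \(P_{n}\) as the probability, under the true \(f_{\theta}\), of falling outside that region (an interpretation inferred from the inequality in (b)). The function names and the perturbation sizes are hypothetical.

```python
from statistics import NormalDist


def false_alarm_probability(mu_hat, sigma_hat, mu, sigma, alpha):
    """P_n: true probability of leaving the plug-in normal tolerance interval."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    lower = mu_hat - z * sigma_hat
    upper = mu_hat + z * sigma_hat
    true_dist = NormalDist(mu, sigma)
    return 1.0 - (true_dist.cdf(upper) - true_dist.cdf(lower))


def l1_error(mu_hat, sigma_hat, mu, sigma):
    """L1 distance between the estimated and the true parameter vectors."""
    return abs(mu_hat - mu) + abs(sigma_hat - sigma)


mu, sigma, alpha = 0.0, 1.0, 0.05
for eps in (0.1, 0.01, 0.001):
    # Perturb both parameters by eps to mimic an estimation error of size 2 * eps.
    p_n = false_alarm_probability(mu + eps, sigma + eps, mu, sigma, alpha)
    gap = abs(p_n - alpha)
    ratio = gap / l1_error(mu + eps, sigma + eps, mu, sigma)
    print(f"eps={eps}: |P_n - alpha|={gap:.6f}, ratio to L1 error={ratio:.4f}")
```

In this toy case the ratio of \(\left|P_{n}-\alpha\right|\) to \(\left\|\hat{\theta}_{n}-\theta\right\|_{1}\) stays bounded as the perturbation shrinks, which is the behaviour the inequality in (b) describes.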

Cite this article

Baíllo, A., Cuevas, A. Parametric versus nonparametric tolerance regions in detection problems. Computational Statistics 21, 523–536 (2006). https://doi.org/10.1007/s00180-006-0010-3

  • DOI: https://doi.org/10.1007/s00180-006-0010-3
