Parametric versus nonparametric tolerance regions in detection problems

Baíllo, Amparo; Cuevas, Antonio

doi:10.1007/s00180-006-0010-3

Original Paper
Published: 17 August 2006

Computational Statistics, volume 21, pages 523–536 (2006)

Abstract

A major problem in statistical quality control is to detect a change in the distribution of independent sequentially observed random vectors. The case of a Gaussian pre-change distribution has been extensively analyzed. Here we are concerned with the non-normal multivariate case. In this setup it is natural to use tolerance regions as detection tools. These regions are defined in terms of density level sets, which can be estimated in a plug-in fashion. Under a normal mixture model we compare, through a simulation study, the performance of such a detection scheme for two density estimators: a (parametric) normal mixture and a (nonparametric) kernel estimator. The problem of the bandwidth choice for the latter is addressed. We also obtain a result concerning the convergence rates of the error probabilities under a general parametric model. Finally, a real data example is discussed.
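
The nonparametric side of the detection scheme described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the mixture parameters, the hand-picked bandwidth `h`, the leave-one-out correction, and the use of an empirical α-quantile of the estimated density values as the level are all choices made here for the example. A new observation is flagged as out of control when the estimated density at that point falls below the estimated level.

```python
import math
import random


def sample_mixture(n, rng):
    """Draw n points from a two-component bivariate normal mixture.

    The component centres and unit variances are illustrative assumptions.
    """
    points = []
    for _ in range(n):
        cx, cy = (0.0, 0.0) if rng.random() < 0.5 else (3.0, 3.0)
        points.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
    return points


def kde(x, data, h):
    """Bivariate Gaussian kernel density estimate at x with scalar bandwidth h."""
    total = 0.0
    for a, b in data:
        total += math.exp(-((x[0] - a) ** 2 + (x[1] - b) ** 2) / (2.0 * h * h))
    return total / (len(data) * 2.0 * math.pi * h * h)


def plug_in_level(data, h, alpha):
    """Empirical alpha-quantile of leave-one-out density values over the sample.

    This quantile rule stands in for the level c_n of the plug-in region;
    it is an assumption of this sketch.
    """
    n = len(data)
    k0 = 1.0 / (2.0 * math.pi * h * h)  # kernel value at zero distance
    values = sorted((n * kde(p, data, h) - k0) / (n - 1) for p in data)
    return values[int(alpha * n)]


rng = random.Random(0)
alpha = 0.05
h = 0.5  # hand-picked bandwidth, not a data-driven selector
train = sample_mixture(300, rng)
c_n = plug_in_level(train, h, alpha)


def out_of_control(x):
    """Flag x when the estimated density falls below the estimated level."""
    return kde(x, train, h) < c_n


# Fresh in-control data: the alarm rate should be in the vicinity of alpha.
in_control = sample_mixture(300, rng)
false_alarm_rate = sum(out_of_control(p) for p in in_control) / len(in_control)

# Post-change data: the same mixture translated by a hypothetical shift.
shifted = [(x + 4.0, y - 4.0) for x, y in sample_mixture(300, rng)]
detection_rate = sum(out_of_control(p) for p in shifted) / len(shifted)

print(f"level c_n        = {c_n:.5f}")
print(f"false alarm rate = {false_alarm_rate:.3f}")
print(f"detection rate   = {detection_rate:.3f}")
```

The parametric alternative would replace `kde` with a fitted normal mixture density and keep the same decision rule, so the two schemes differ only in the density estimator plugged into the level set.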

Author information

Authors and Affiliations

Departamento de Estadística, Universidad Carlos III de Madrid, Getafe (Madrid), 28903, Spain
Amparo Baíllo
Departamento de Matemáticas, Universidad Autónoma de Madrid, 28049, Madrid, Spain
Antonio Cuevas

Corresponding author

Correspondence to Amparo Baíllo.

Additional information

Research partially supported by Spanish grant MTM2004-00098.

Appendix: Proof of the theorem

(a)
Using condition (A2), by the mean value theorem there exists a constant C > 0 such that
$$\left|f_{\hat{\theta}_{n}}(x)-f_{\theta}(x)\right| \leq C\left\|\hat{\theta}_{n}-\theta\right\|_{1} \tag{8}$$

uniformly in x. If there exists a set A with positive probability on which convergence of c_n to c does not hold, then there exist a δ > 0 and a subsequence $c_{n_{j}}$ such that $\left|c_{n_{j}}-c\right|>\delta$ for all j.

If $c_{n_{j}}>c+\delta$ then, by (8),
$$\begin{aligned} 1-\alpha &\leq \int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c+\delta\right\}} f_{\hat{\theta}_{n}} \leq \int\limits_{\left\{f_{\theta} \geq c+\delta-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\hat{\theta}_{n}} \\ &\leq \int\limits_{\left\{f_{\theta} \geq c+\delta-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\theta}+C\left\|\hat{\theta}_{n}-\theta\right\|_{1} \operatorname{Leb}\left\{f_{\theta} \geq c+\delta-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}. \end{aligned}$$

We can take n_j sufficiently large so that $C\left\|\hat{\theta}_{n}-\theta\right\|_{1}<\delta / 2$. Then on A
$$1-\alpha \leq \int\limits_{\left\{f_{\theta} \geq c+\delta / 2\right\}} f_{\theta}+C\left\|\hat{\theta}_{n}-\theta\right\|_{1} \operatorname{Leb}\left\{f_{\theta} \geq c+\delta / 2\right\}, \tag{9}$$

which contradicts the definition of c. If we had assumed that $c_{n_{j}}<c-\delta$ we would have reached a similar contradiction.
(b)
Observe that by (8)
$$\left|P_{n}-\alpha\right| \leq \int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\}}\left|f_{\hat{\theta}_{n}}-f_{\theta}\right| \leq C\left\|\hat{\theta}_{n}-\theta\right\|_{1} \operatorname{Leb}\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\}.$$

Then the conclusion in (b) follows since, for n large, $\operatorname{Leb}\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\}$ is bounded almost surely. To see this notice that, with probability one, eventually
$$\operatorname{Leb}\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\} \leq \frac{2}{c} c_{n} \operatorname{Leb}\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\} \leq \frac{2}{c} \int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\}} f_{\hat{\theta}_{n}}=\frac{2}{c}(1-\alpha).$$
(c)
Let M > 0 be a sufficiently large constant. By (8) and (A1) if γ > 1 then, with probability one and for n large enough,
$$\begin{aligned} \int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1/\gamma}\right\}} f_{\hat{\theta}_{n}} &\leq \int\limits_{\left\{f_{\theta} \geq c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1/\gamma}-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\theta} \\ &\quad +C \operatorname{Leb}\left\{f_{\theta} \geq c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1/\gamma}-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}\left\|\hat{\theta}_{n}-\theta\right\|_{1} \\ &\leq 1-\alpha-\int\limits_{\left\{c<f_{\theta}<c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1/\gamma}-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\theta}+C_{1}\left\|\hat{\theta}_{n}-\theta\right\|_{1} \\ &\leq 1-\alpha-\left\|\hat{\theta}_{n}-\theta\right\|_{1}\left[C_{2}(M-1)^{\gamma}-C_{1}\right]<1-\alpha, \end{aligned}$$

where C₁, C₂ > 0 are constants. This implies that $c_{n}<c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1 / \gamma}$. In a similar way it is easy to check that, for large n, $c_{n}>c-M\left\|\hat{\theta}_{n}-\theta\right\|_{1}^{1 / \gamma}$. Analogously, if γ ≤ 1, with probability one we have that
$$\int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c+M\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\hat{\theta}_{n}} \leq 1-\alpha-C_{3}\left[(M-C)\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right]^{\gamma}+C_{4}\left\|\hat{\theta}_{n}-\theta\right\|_{1}<1-\alpha$$

and
$$\int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c-M\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}}f_{\hat{\theta}_{n}} \geq 1-\alpha+C_{5}\left[(M-C)\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right]^{\gamma}-C_{6}\left\|\hat{\theta}_{n}-\theta\right\|_{1}>1-\alpha,$$

where the C_i’s are positive constants. This proves (5). Regarding (6), using (A1) we have that with probability one, for n sufficiently large,
$$\begin{aligned} \int\limits_{\left\{f_{\hat{\theta}_{n}} \geq c_{n}\right\} \vartriangle\left\{f_{\theta} \geq c\right\}} f_{\theta} &\leq \int\limits_{\left\{c_{n}+C\left\|\hat{\theta}_{n}-\theta\right\|_{1} \geq f_{\theta} \geq c_{n}-C\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right\}} f_{\theta} \\ &=O\left(\max \left(\left|c_{n}-c\right|,\left\|\hat{\theta}_{n}-\theta\right\|_{1}\right)\right)^{\gamma} \end{aligned}$$

which completes the proof of the statement in (c).
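
Parts (a) and (b) can be checked numerically in a toy case. The sketch below assumes a univariate normal model with θ = (μ, σ), which is a simplification chosen here and not the mixture model of the paper. In that case the region {f_θ ≥ c} is the interval μ ± σz with z the 1 − α/2 standard normal quantile, so the level has the closed form c = φ(z)/σ. The quantity P_n is read here as the probability, under the true density, of falling outside the estimated region, which is how the bound in (b) appears to use it. All function names are hypothetical.

```python
import random
from statistics import NormalDist, fmean, pstdev

ALPHA = 0.05
Z = NormalDist().inv_cdf(1 - ALPHA / 2)  # half-width of the region in sigma units


def level(sigma):
    """Level c such that {f >= c} has mass 1 - ALPHA under N(mu, sigma^2).

    The level does not depend on mu, so a zero-mean density is used.
    """
    return NormalDist(0.0, sigma).pdf(sigma * Z)


def false_alarm(mu_hat, sigma_hat, mu, sigma):
    """True probability of falling outside the estimated tolerance interval."""
    true_dist = NormalDist(mu, sigma)
    lower = mu_hat - sigma_hat * Z
    upper = mu_hat + sigma_hat * Z
    return 1.0 - (true_dist.cdf(upper) - true_dist.cdf(lower))


def average_errors(n, reps, rng, mu=1.0, sigma=2.0):
    """Average |c_n - c| and |P_n - ALPHA| over reps samples of size n."""
    c = level(sigma)
    err_c = 0.0
    err_p = 0.0
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        mu_hat = fmean(xs)
        sigma_hat = pstdev(xs, mu_hat)  # maximum likelihood estimate of sigma
        err_c += abs(level(sigma_hat) - c)
        err_p += abs(false_alarm(mu_hat, sigma_hat, mu, sigma) - ALPHA)
    return err_c / reps, err_p / reps


rng = random.Random(1)
results = {n: average_errors(n, 10, rng) for n in (50, 500, 5000)}
for n, (err_c, err_p) in results.items():
    print(f"n={n:5d}  mean|c_n - c|={err_c:.6f}  mean|P_n - alpha|={err_p:.6f}")
```

Both averaged errors should shrink as n grows, in line with the parameter estimation error that drives the bounds in (8) and in part (b).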

About this article

Cite this article

Baíllo, A., Cuevas, A. Parametric versus nonparametric tolerance regions in detection problems. Computational Statistics 21, 523–536 (2006). https://doi.org/10.1007/s00180-006-0010-3

Issue Date: December 2006