A two-sample test when data are contaminated

Statistical Methods & Applications

Abstract

In this paper we consider the problem of testing whether two samples of contaminated data arise from the same distribution. It is assumed that the contaminations are additive noises with known or estimated moments. This situation can also be viewed as two signals observed before and after perturbations; the problem is then to test the equality of both perturbations. The test statistic is based on the polynomial moments of the difference between observations and noises. The test is very simple and allows one to compare two independent as well as two paired contaminated samples. A data-driven selection rule is proposed to choose the number of polynomials automatically. We present a simulation study to investigate the power of the proposed test in discrete and continuous cases. Real-data examples illustrate the method.
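To make the role of the noise moments concrete, the following is a minimal sketch, not the statistic defined in the paper. It assumes an observation X = U + ε with U and ε independent, and uses the binomial identity E[X^k] = Σ_j C(k, j) E[U^j] E[ε^(k−j)] to recover the raw moments of the uncontaminated variable U from the empirical moments of X and the known moments of ε. The function names (`raw_moments`, `deconvolve_moments`, `moment_differences`) are hypothetical, and the use of raw moments, rather than the polynomial system and data-driven selection rule the abstract refers to, is an assumption made for illustration only.

```python
from math import comb


def raw_moments(sample, order):
    """Empirical raw moments E[X^k] for k = 0..order."""
    n = len(sample)
    return [sum(x ** k for x in sample) / n for k in range(order + 1)]


def deconvolve_moments(obs_moments, noise_moments):
    """Recover the raw moments of U from those of X = U + eps.

    Assumes U and eps are independent, so that
    E[X^k] = sum_{j=0..k} C(k, j) E[U^j] E[eps^(k-j)].
    Both inputs are lists indexed by moment order, starting at order 0.
    """
    signal = [1.0]  # E[U^0] = 1
    for k in range(1, len(obs_moments)):
        # Contribution of the lower-order moments of U (j < k).
        lower = sum(
            comb(k, j) * signal[j] * noise_moments[k - j] for j in range(k)
        )
        signal.append(obs_moments[k] - lower)
    return signal


def moment_differences(x, y, noise_x, noise_y, order):
    """Differences of the recovered moments of two contaminated samples.

    Illustrative only: values near zero for orders 1..order are what one
    would expect when both uncontaminated distributions coincide.
    """
    mx = deconvolve_moments(raw_moments(x, order), noise_x)
    my = deconvolve_moments(raw_moments(y, order), noise_y)
    return [a - b for a, b in zip(mx[1:], my[1:])]
```

As a usage example, take U uniform on {0, 2} and ε uniform on {-1, 1}. The sample `[-1, 1, 1, 3]` lists every sum U + ε once, so `deconvolve_moments(raw_moments([-1, 1, 1, 3], 3), [1, 0, 1, 0])` recovers the moments of U exactly. A formal test would additionally need a normalisation of these differences and a rule for choosing the order, which this sketch does not attempt.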

Author information

Correspondence to Denys Pommeret.

About this article

Cite this article

Pommeret, D. A two-sample test when data are contaminated. Stat Methods Appl 22, 501–516 (2013). https://doi.org/10.1007/s10260-013-0235-6

  • DOI: https://doi.org/10.1007/s10260-013-0235-6
