
Novel Nonparametric Test for Homogeneity and Change-Point Detection in Data Stream

  • Conference paper
Data Stream Mining & Processing (DSMP 2020)

Abstract

This paper proposes a new nonparametric algorithm for homogeneity testing and change-point detection in random sequences. The algorithm is based on the Klyushin–Petunin test for sample heterogeneity, which handles both absolutely continuous distributions and distributions with ties. It can be implemented both online and offline, and it can compare small chunks of a data stream at a significance level below 0.05. Comparisons show that the proposed algorithm is more sensitive and robust than its counterparts. In contrast to the counterpart tests (Kolmogorov–Smirnov and Wilcoxon), the proposed algorithm performs well at assessing the homogeneity of samples both from distributions that differ in mean but have the same variance and from distributions with the same mean but different variances. The algorithm also has a wide field of applications, from the detection of concept drift in texts to the tracking of health parameters and coordinates of patients obtained from wearable gadgets.
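The abstract does not spell out the test statistic, so the following is only a minimal sketch of how a Klyushin–Petunin style proximity measure between two samples might be computed. It assumes the usual construction: under Hill's assumption, a new observation falls between the order statistics x_(i) and x_(j) of a sample of size n with probability (j - i)/(n + 1); the observed frequency of the second sample in that interval is wrapped in a Wilson confidence interval, and the statistic is the share of intervals that cover the theoretical probability. The function name `p_statistic`, the multiplier `g = 3` and the use of open intervals (which ignores ties) are assumptions made for illustration, not details taken from the paper.

```python
import math


def p_statistic(x, y, g=3.0):
    """Share of Wilson intervals covering the Hill probability (j - i) / (n + 1).

    Hypothetical helper: x and y are sequences of numbers, g is the
    interval multiplier (assumed here to be 3). Ties are not treated specially.
    """
    xs = sorted(x)
    n, m = len(xs), len(y)
    covered = 0
    total = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Observed frequency of y-values strictly inside (x_(i), x_(j)).
            h = sum(xs[i] < v < xs[j] for v in y) / m
            # Wilson score interval for a binomial proportion.
            centre = (h * m + g * g / 2) / (m + g * g)
            half = g * math.sqrt(h * (1 - h) * m + g * g / 4) / (m + g * g)
            # Probability of the interval under Hill's assumption.
            p = (j - i) / (n + 1)
            if centre - half <= p <= centre + half:
                covered += 1
            total += 1
    return covered / total


# Two interleaved samples versus two completely separated samples.
base = list(range(40))
interleaved = [v + 0.5 for v in base]
separated = [v + 1000 for v in base]
print(p_statistic(base, interleaved), p_statistic(base, separated))
```

Values near 1 suggest that the two chunks are homogeneous, while lower values suggest heterogeneity. In a streaming setting, one could apply such a statistic to consecutive chunks and flag a change-point when the value drops; the threshold would have to be chosen according to the paper's own procedure, which this preview does not show.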



Author information

Corresponding author

Correspondence to Dmitriy Klyushin.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Klyushin, D., Martynenko, I. (2020). Novel Nonparametric Test for Homogeneity and Change-Point Detection in Data Stream. In: Babichev, S., Peleshko, D., Vynokurova, O. (eds) Data Stream Mining & Processing. DSMP 2020. Communications in Computer and Information Science, vol 1158. Springer, Cham. https://doi.org/10.1007/978-3-030-61656-4_23

  • DOI: https://doi.org/10.1007/978-3-030-61656-4_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61655-7

  • Online ISBN: 978-3-030-61656-4

  • eBook Packages: Computer Science, Computer Science (R0)
