A heuristic, iterative algorithm for change-point detection in abrupt change models

  • Original Paper
  • Computational Statistics

Abstract

Change-point detection in abrupt change models is a challenging research topic in many fields of both methodological and applied statistics. Because of strong irregularities, discontinuity and non-smoothness, likelihood-based procedures are awkward: standard optimization methods do not work, and grid search algorithms are the most widely used approach for estimation. This paper introduces a heuristic, iterative algorithm for approximate maximum likelihood estimation of change-points in piecewise constant regression models. The algorithm is based on iterative fitting of simple linear models and appears to extend easily to more general frameworks, such as models including continuous covariates with possible ties, distinct change-points referring to different covariates, and further covariates without change-points. In these scenarios grid search algorithms do not straightforwardly apply. The proposed algorithm is validated through simulation studies and applied to two real datasets.
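For orientation, the grid search baseline mentioned in the abstract can be sketched for the simplest case: one change-point in a piecewise constant mean. This is only an illustrative sketch, not the algorithm proposed in the paper. It assumes equally spaced observations and Gaussian errors with common variance, so that maximising the likelihood amounts to minimising the residual sum of squares. The function name `grid_search_change_point` and the toy data are hypothetical.

```python
def grid_search_change_point(y):
    """Return (k, rss) for the best two-segment piecewise constant fit.

    k is the split index (1 <= k < n): segment means are taken over
    y[:k] and y[k:]. Assumption: least squares stands in for maximum
    likelihood, which holds under Gaussian errors with equal variance.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(y)
    total_sq = sum(v * v for v in y)
    best_k, best_rss = None, float("inf")
    left_sum = 0.0
    # Try every admissible split and keep the one with the smallest RSS.
    for k in range(1, n):
        left_sum += y[k - 1]
        right_sum = total - left_sum
        # RSS = sum(y^2) - n_left * mean_left^2 - n_right * mean_right^2
        rss = total_sq - left_sum ** 2 / k - right_sum ** 2 / (n - k)
        if rss < best_rss:
            best_k, best_rss = k, rss
    return best_k, best_rss


# Hypothetical toy series with a level shift after the fourth observation.
y = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
k, rss = grid_search_change_point(y)
print(k, round(rss, 6))
```

The search evaluates every candidate split, which is straightforward here. It is less obvious how to enumerate candidates once there are several change-points tied to different covariates, which is the setting the abstract says grid search does not handle directly.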

Acknowledgements

The authors would like to thank the reviewers for their insightful comments and suggestions which greatly improved the paper.

Author information

Corresponding author

Correspondence to Salvatore Fasola.

Cite this article

Fasola, S., Muggeo, V.M.R. & Küchenhoff, H. A heuristic, iterative algorithm for change-point detection in abrupt change models. Comput Stat 33, 997–1015 (2018). https://doi.org/10.1007/s00180-017-0740-4

  • DOI: https://doi.org/10.1007/s00180-017-0740-4
