A sequential multiple change-point detection procedure via VIF regression

  • Original Paper
  • Published in Computational Statistics

Abstract

In this paper, we propose a procedure for detecting multiple change-points in a mean-shift model, where the number of change-points is allowed to increase with the sample size. A theoretic justification for our new method is also given. We first convert the change-point problem into a variable selection problem by partitioning the data sequence into several segments. Then, we apply a modified variance inflation factor regression algorithm to each segment in sequential order. When a segment that is suspected of containing a change-point is found, we use a weighted cumulative sum to test if there is indeed a change-point in this segment. The proposed procedure is implemented in an algorithm which, compared to two popular methods via simulation studies, demonstrates satisfactory performance in terms of accuracy, stability and computation time. Finally, we apply our new algorithm to analyze two real data examples.
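The reduction described in the abstract can be illustrated with a minimal sketch. It assumes a piecewise-constant mean with Gaussian noise, treats step indicators at segment boundaries as candidate variables, and uses a standard weighted CUSUM statistic inside a suspicious window. The names `step_design` and `weighted_cusum`, the segment length, the simulated data, and the use of ordinary least squares in place of the modified VIF regression are all illustrative assumptions, not the paper's own procedure.

```python
import numpy as np

def step_design(n, boundaries):
    """Candidate variables: one step indicator 1{t > b} per boundary b (t = 1..n)."""
    t = np.arange(1, n + 1)
    return np.column_stack([(t > b).astype(float) for b in boundaries])

def weighted_cusum(y):
    """Maximize sqrt(k(n-k)/n) * |mean(y[:k]) - mean(y[k:])| over k = 1..n-1."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    k = np.arange(1, n)
    head = np.cumsum(y)[:-1]                      # partial sums of the first k values
    diff = head / k - (y.sum() - head) / (n - k)  # left mean minus right mean
    stat = np.sqrt(k * (n - k) / n) * np.abs(diff)
    j = int(np.argmax(stat))
    return int(k[j]), float(stat[j])

# Simulated mean-shift data (assumption): a single shift of size 2 after t = 90.
rng = np.random.default_rng(0)
n, seg_len = 200, 20
t = np.arange(1, n + 1)
y = np.where(t > 90, 2.0, 0.0) + rng.normal(scale=0.5, size=n)

# Variable-selection view: a large coefficient on a step column flags a boundary
# next to a mean shift. Plain least squares stands in for VIF regression here.
boundaries = np.arange(0, n, seg_len)             # 0, 20, ..., 180
X = step_design(n, boundaries)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
j = 1 + int(np.argmax(np.abs(coef[1:])))          # skip the intercept-like column
b = int(boundaries[j])

# Test the window around the flagged boundary with the weighted CUSUM.
lo, hi = max(b - seg_len, 0), min(b + seg_len, n)
k_local, stat = weighted_cusum(y[lo:hi])
k_hat = lo + k_local                              # estimated last index before the shift
print(k_hat, round(stat, 2))
```

In this sketch the regression step only narrows the search to a short window; the CUSUM step does the actual localization, which mirrors the two-stage structure described in the abstract.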

Acknowledgments

The authors would like to thank the associate editor and two anonymous reviewers for the critical comments and constructive suggestions which have led to the improvement of this paper. The authors would also like to thank Professor Trueman MacHenry for polishing the paper.

Author information

Corresponding author

Correspondence to Yuehua Wu.

Additional information

The research was partially supported by Natural Sciences and Engineering Research Council of Canada.

Appendix: Proof of Theorem 1

Since \(\varepsilon _i\), \(i=1,2,\ldots \), are iid zero-mean variables with variance \(\sigma ^2\), it follows from the definition of \(\rho _{i+1}\) in (8) and the idempotence of \( I- X^{(i+1)}[( X^{(i+1)})^T X^{(i+1)}]^{-1}( X^{(i+1)})^T\) that the variance of \(\rho _{i+1}^{-1}(\varvec{x}_{\mathrm{new}}^{(i+1)})^T\{I- X^{(i+1)}[( X^{(i+1)})^T X^{(i+1)}]^{-1}( X^{(i+1)})^T\}{\varvec{\varepsilon }}^{(i+1)}\) is still \(\sigma ^2\). By the central limit theorem, we obtain that

$$\begin{aligned}&\rho _{i+1}^{-1}\left( \varvec{x}_{\mathrm{new}}^{(i+1)}\right) ^T\left\{ I- X^{(i+1)}\left[ \left( X^{(i+1)}\right) ^T X^{(i+1)}\right] ^{-1}\left( X^{(i+1)}\right) ^T\right\} {\varvec{\varepsilon }}^{(i+1)} \xrightarrow {d} N\left( 0,\sigma ^2\right) . \end{aligned}$$

Note that \( ( X^{(i+1)})^T X^{(i+1)}\) can be expressed as \(( U^{(i+1)})^T \varLambda ^{(i+1)} U^{(i+1)}\), where \(U^{(i+1)}\) is the lower triangular matrix of order \(m+1\) whose nonzero entries are all 1’s, and \(\varLambda ^{(i+1)}\) is a diagonal matrix with diagonal entries \(k_1-k_0,\ k_2-k_1,\ \ldots ,\ k_m-k_{m-1},\ 1+(i+1)l-k_m\). Since the change-points are well separated, i.e., \(k_r-k_{r-1}=O(n)\), \((\varLambda ^{(i+1)})^{-1}\) is of order \(O(1/n)\), and hence \([( X^{(i+1)})^T X^{(i+1)}]^{-1}\) is also of order \(O(1/n)\).
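The factorization of the Gram matrix can be checked numerically. The sketch below assumes that the columns of the design matrix are step indicators \(1\{t>k_j\}\) with \(k_0=0\); the sample size and change-point locations are hypothetical, and the indexing convention may differ from the paper's by one position in the last diagonal entry.

```python
import numpy as np

def step_design(n, change_points):
    """Columns 1{t > k_j} for k_0 = 0 and each change-point k_j (t = 1..n)."""
    t = np.arange(1, n + 1)
    return np.column_stack([(t > k).astype(float) for k in [0] + list(change_points)])

n = 60
cps = [15, 30, 48]                       # hypothetical change-points
X = step_design(n, cps)
order = X.shape[1]                       # one column per change-point plus k_0

U = np.tril(np.ones((order, order)))     # lower triangular, nonzero entries all 1
seg_lengths = np.diff([0] + cps + [n])   # lengths of the constant-mean segments
Lam = np.diag(seg_lengths.astype(float))

gram = X.T @ X
gram_inv = np.linalg.inv(gram)

# The Gram matrix factors as U^T Lam U.
print(np.allclose(gram, U.T @ Lam @ U))
# Entries of the inverse are bounded by 2 / (shortest segment length).
print(np.abs(gram_inv).max(), 2.0 / seg_lengths.min())
```

Because the inverse of `U` is bidiagonal with entries of magnitude 1, each entry of the inverse Gram matrix is a sum of at most two reciprocal segment lengths, which is where the \(O(1/n)\) order comes from when every segment length grows proportionally to \(n\).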

Next, we prove that \(\rho _{i+1}\) defined in (8) is asymptotically equal to \(\sqrt{l}\). Note that \(\varvec{x}_{\mathrm{new}}^{(i+1)}={{\varvec{\ell }}}_{il,l}\) is the vector whose last l elements are ones and whose other elements are all zeros. It can be seen that \((\varvec{x}_{\mathrm{new}}^{(i+1)})^T\varvec{x}_{\mathrm{new}}^{(i+1)}=l\) and \((\varvec{x}_{\mathrm{new}}^{(i+1)})^T X^{(i+1)}=O(l)\). Therefore, as \(n\rightarrow \infty \), it is readily seen from \([( X^{(i+1)})^T X^{(i+1)}]^{-1}=O(1/n)\) that

$$\begin{aligned} \rho _{i+1}^2&=\left( \varvec{x}_{\mathrm{new}}^{(i+1)}\right) ^T\varvec{x}_{\mathrm{new}}^{(i+1)}-\left( \varvec{x}_{\mathrm{new}}^{(i+1)}\right) ^T X^{(i+1)}\left[ \left( X^{(i+1)}\right) ^T X^{(i+1)}\right] ^{-1}\left( X^{(i+1)}\right) ^T\varvec{x}_{\mathrm{new}}^{(i+1)}\\&=l-O\left( l^2/n\right) \sim l. \end{aligned}$$
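The claim \(\rho _{i+1}^2\sim l\) can be illustrated numerically under the assumption that \(\rho _{i+1}^2\) is the squared norm of the residual of \(\varvec{x}_{\mathrm{new}}^{(i+1)}\) after projection onto the columns of \(X^{(i+1)}\), with step-indicator columns. In that setting, and when all earlier change-points lie before the last l observations, the residual norm has the closed form \(l-l^2/L\), where L is the length of the last constant-mean segment. The helper `rho_squared` and all numerical values are hypothetical.

```python
import numpy as np

def step_design(n, change_points):
    """Columns 1{t > k_j} for k_0 = 0 and each change-point k_j (t = 1..n)."""
    t = np.arange(1, n + 1)
    return np.column_stack([(t > k).astype(float) for k in [0] + list(change_points)])

def rho_squared(n_obs, change_points, l):
    """Squared norm of x_new after removing its projection onto the columns of X."""
    X = step_design(n_obs, change_points)
    x_new = np.zeros(n_obs)
    x_new[-l:] = 1.0                               # only the last l elements are ones
    coef, *_ = np.linalg.lstsq(X, x_new, rcond=None)
    resid = x_new - X @ coef                       # (I - P) x_new
    return float(resid @ resid)

l = 20
for scale in (1, 10, 100):
    n_obs = 400 * scale
    cps = [100 * scale, 220 * scale]               # change-points spaced proportionally to n
    last_seg = n_obs - cps[-1]
    print(n_obs, rho_squared(n_obs, cps, l), l - l**2 / last_seg)
```

As the sample size grows with l fixed, the correction term \(l^2/L\) shrinks and the ratio \(\rho _{i+1}^2/l\) approaches 1.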

Under the null hypothesis, there exists no change-point in the interval \([1+il,(i+1)l]\). It can be shown that the last l elements of the correction vector \({\varvec{\eta }}^{(i+1)}\) are zeros, which implies that \((\varvec{x}_{\mathrm{new}}^{(i+1)})^T{\varvec{\eta }}^{(i+1)}=0\). Since \((\varvec{x}_{\mathrm{new}}^{(i+1)})^T X^{(i+1)}=O(l)\), \((X^{(i+1)})^T{\varvec{\eta }}^{(i+1)}=o_p(bl)\), \([ (X^{(i+1)})^T X^{(i+1)}]^{-1}=O(1/n)\) and \(\rho _{i+1}/\sqrt{l}\rightarrow 1\), by Assumption A1, it follows that

$$\begin{aligned} \rho _{i+1}^{-1}\left( \varvec{x}_{\mathrm{new}}^{(i+1)}\right) ^T\left\{ I- X^{(i+1)}\left[ \left( X^{(i+1)}\right) ^T X^{(i+1)}\right] ^{-1}\left( X^{(i+1)}\right) ^T\right\} {\varvec{\eta }}^{(i+1)}=o_p(1). \end{aligned}$$

In view of the fact that \(\beta _{\mathrm{new}}^{(i+1)}=0\), i.e., there is no change-point in \([1+il,(i+1)l]\), and \(\rho _{i+1}\rightarrow \infty \), by (7) and (9), we obtain that

$$\begin{aligned} \rho _{i+1}\hat{\beta }_{\mathrm{new}}^{(i+1)}\xrightarrow {d} N(0,\sigma ^2). \end{aligned}$$

This proves Theorem 1(a).
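A small Monte Carlo experiment can make the null distribution in Theorem 1(a) concrete. The sketch assumes that \(\hat{\beta }_{\mathrm{new}}^{(i+1)}\) has the partial-regression form \(\rho _{i+1}^{-2}(\varvec{x}_{\mathrm{new}}^{(i+1)})^T(I-P)\varvec{y}\), where P is the projection onto the columns of \(X^{(i+1)}\), and that the mean lies exactly in that column space (so the correction vector is zero). Equation (7) is not reproduced in this section, so this form, the Gaussian errors, and all numerical values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, l, sigma = 300, 15, 1.5
cps = [80, 170]                                   # hypothetical earlier change-points
t = np.arange(1, n_obs + 1)
X = np.column_stack([(t > k).astype(float) for k in [0] + cps])
beta = np.array([1.0, -2.0, 0.5])                 # baseline level and two mean shifts

x_new = np.zeros(n_obs)
x_new[-l:] = 1.0                                  # candidate: a shift in the last l points

coef, *_ = np.linalg.lstsq(X, x_new, rcond=None)
resid = x_new - X @ coef                          # (I - P) x_new
rho = np.sqrt(resid @ resid)

# Null hypothesis: no change in the last l observations, so y = X beta + eps.
n_rep = 4000
eps = rng.normal(scale=sigma, size=(n_rep, n_obs))
Y = X @ beta + eps                                # one simulated series per row
beta_hat = Y @ resid / rho**2
stat = rho * beta_hat

print(stat.mean(), stat.std(), sigma)
```

Under these assumptions the statistic equals \(\rho _{i+1}^{-1}(\varvec{x}_{\mathrm{new}}^{(i+1)})^T(I-P)\varvec{\varepsilon }\), so its sample mean should be near 0 and its sample standard deviation near \(\sigma \).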

Under the alternative hypothesis, there exists a change-point, say \(k_m\), in the segment \([1+il,(i+1)l]\). In this case, exactly \(k_m-il\) of the last l elements of the correction vector \({\varvec{\eta }}^{(i+1)}\) are equal to \(\beta _{\mathrm{new}}^{(i+1)}\), where \(\beta _{\mathrm{new}}^{(i+1)} \ne 0\), which implies \((\varvec{x}_{\mathrm{new}}^{(i+1)})^T{\varvec{\eta }}^{(i+1)}=\beta _{\mathrm{new}}^{(i+1)}\left( k_m-il\right) \).

Moreover, we have

$$\begin{aligned} \rho _{i+1}^{-2}\left( \varvec{x}_{\mathrm{new}}^{(i+1)}\right) ^TX^{(i+1)}\left[ \left( X^{(i+1)}\right) ^T X^{(i+1)}\right] ^{-1}\left( X^{(i+1)}\right) ^T{\varvec{\eta }}^{(i+1)}=o_p(1) \end{aligned}$$

by the proof of Theorem 1(a). In view of (11), we obtain that

$$\begin{aligned} \rho _{i+1}^{-2}\left( \varvec{x}_{\mathrm{new}}^{(i+1)}\right) ^T\left\{ I- X^{(i+1)}\left[ \left( X^{(i+1)}\right) ^T X^{(i+1)}\right] ^{-1}\left( X^{(i+1)}\right) ^T\right\} {\varvec{\varepsilon }}^{(i+1)}=o_p(1). \end{aligned}$$

Applying these results to (7) yields

$$\begin{aligned} \hat{\beta }_{\mathrm{new}}^{(i+1)}=\beta _{\mathrm{new}}^{(i+1)}\left[ 1-\rho _{i+1}^{-2}(k_m-il)\right] +o_p(1). \end{aligned}$$

Furthermore, if the change-point \(k_m\) is located in the artificial interval \([1+(i-1)l,il]\) (i.e., the change-point was previously undetected), then the last l components of the correction vector \({\varvec{\eta }}^{(i+1)}\) are zero, which implies that \((\varvec{x}_{\mathrm{new}}^{(i+1)})^T{\varvec{\eta }}^{(i+1)}=0\). An argument similar to the one above yields \(\hat{\beta }_{\mathrm{new}}^{(i+1)}=\beta _{\mathrm{new}}^{(i+1)}+o_p(1).\) This completes the proof of Theorem 1(b).

About this article

Cite this article

Shi, X., Wang, XS., Wei, D. et al. A sequential multiple change-point detection procedure via VIF regression. Comput Stat 31, 671–691 (2016). https://doi.org/10.1007/s00180-015-0587-5
