
Testing for multiple change points

  • Original Paper
  • Computational Statistics

Abstract

In this paper we concentrate on testing for multiple changes in the mean of a series of independent random variables. The suggested method uses a maximum-type test statistic. Our primary focus is on the efficient calculation of critical values for very large sample sizes, comprising thousands to tens of thousands of observations, and a moderate to large number of segments. To that end, Monte Carlo simulations and a modified Bellman's principle of optimality are used. It is shown that computer memory becomes the critical bottleneck in solving a problem of this size. Minimizing the memory requirements and choosing an appropriate order of calculations are therefore the keys to success. In addition, we present a formula for approximate asymptotic critical values, based on the theory of the exceedance probability of Gaussian fields over a high level.


Notes

  1. Recall that the optimal partitioning is given by those \(\left\{ n_1,\ldots ,n_d\right\} \) for which the test statistic \(\varUpsilon _n^{2}\) is maximized or, alternatively, for which the value of the test statistic \(Q^2(n_1,\ldots ,n_d;\epsilon )\) is minimized.

  2. Note that a standard PC with 4 GB of RAM can hold only one dense matrix of size approximately \(15{,}000\times 15{,}000\) when its entries are stored in double precision.

  3. Recall that by candidates, or candidate splits, we mean the terms \(q_{1,k}^{j-1}+q_{k+1,i}^{1}\) appearing in (7) and (8).
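The candidate terms \(q_{1,k}^{j-1}+q_{k+1,i}^{1}\) of Note 3 have the shape of a Bellman-type recursion, and Note 2 explains why storing a full \(n\times n\) table of segment costs is impractical. The following Python sketch illustrates both points under stated assumptions. Equations (7) and (8) are not reproduced on this page, so the recursion is an assumed standard form rather than the authors' exact formula. The single-segment cost \(q^{1}\) is taken to be the residual sum of squares around the segment mean, and the names `segment_cost_factory` and `min_cost_partition` are hypothetical. Segment costs are obtained from prefix sums, and only two rows of the dynamic-programming table are kept, so memory grows linearly in \(n\).

```python
from itertools import accumulate


def segment_cost_factory(x):
    """Return cost(a, b): residual sum of squares of x[a:b] around its mean.

    Assumption: this plays the role of the single-segment term q^1.
    Prefix sums make each call O(1) without an n-by-n cost matrix.
    """
    s1 = [0.0] + list(accumulate(x))                 # prefix sums of x
    s2 = [0.0] + list(accumulate(v * v for v in x))  # prefix sums of x^2

    def cost(a, b):
        length = b - a
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / length

    return cost


def min_cost_partition(x, segments):
    """Minimum total cost over all splits of x into `segments`
    contiguous, non-empty pieces (value only, no change points)."""
    n = len(x)
    if not 1 <= segments <= n:
        raise ValueError("need 1 <= segments <= len(x)")
    cost = segment_cost_factory(x)
    inf = float("inf")
    # prev[i]: best cost of splitting x[:i] into j - 1 segments (here j - 1 = 1)
    prev = [inf] + [cost(0, i) for i in range(1, n + 1)]
    for j in range(2, segments + 1):
        curr = [inf] * (n + 1)
        for i in range(j, n + 1):
            # candidate splits: j - 1 segments on x[:k], one segment on x[k:i]
            curr[i] = min(prev[k] + cost(k, i) for k in range(j - 1, i))
        prev = curr  # keep only the latest row
    return prev[n]


print(min_cost_partition([1, 2, 10, 11], 2))
```

For scale, a dense \(15{,}000\times 15{,}000\) matrix in double precision occupies \(15{,}000^2\times 8 = 1.8\times 10^{9}\) bytes, whereas the two rows kept above need on the order of \(2n\) numbers. The sketch returns only the optimal value; recovering the change points themselves would require storing back-pointers, which costs additional memory proportional to \(n\) times the number of segments. The running time remains quadratic in \(n\) for each added segment.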

References

  • Antoch J, Hušková M (2001) Permutation tests for change point analysis. Stat Probab Lett 53:37–46

  • Antoch J, Hušková M, Jarušková D (2002) Off-line statistical process control. In: Multivariate total quality control, chap 1. Springer/Physica, Heidelberg, pp 1–86, ISBN 3-7908-1383-4

  • Bai J, Perron P (1998) Estimating and testing linear models with multiple structural changes. Econometrica 66:47–78

  • Bai J, Perron P (2003a) Computation and analysis of multiple structural change models. J Appl Econom 18:1–22

  • Bai J, Perron P (2003b) Critical values for multiple structural change tests. Econom J 6:72–78

  • Bellman R (1957) Dynamic programming. Princeton University Press, Princeton

  • Bellman R, Dreyfus S (1962) Applied dynamic programming. Princeton University Press, Princeton

  • Bellman R, Roth R (1969) Curve fitting by segmented straight lines. J Am Stat Assoc 64:1079–1084

  • Billingsley P (1968) Convergence of probability measures. Wiley, New York

  • Braun JV, Braun RK, Müller HG (2000) A multiple change point fitting via quasi-likelihood with application to DNA sequence segmentation. Biometrika 87:301–314

  • Csörgő M, Horváth L (1997) Limit theorems in change point analysis. Wiley, New York

  • Hawkins DM (2001) Fitting multiple change-point models to data. Comput Stat Data Anal 37:323–341

  • Jarušková D (1996) Change-point detection in meteorological measurement. Mon Weather Rev 124:1535–1543

  • Jarušková D, Piterbarg VI (2011) Log-likelihood ratio test for detecting transient change. Stat Probab Lett 81:552–559

  • Kim HJ, Yu B, Feuer EJ (2009) Selecting the number of change-points in segmented line regression. Stat Sin 19:597–609

  • Kirch C, Steinebach J (2006) Permutation principles for the change analysis of stochastic processes under strong invariance. J Comput Appl Math 186:64–88

  • Lavielle M, Teyssière G (2006) Detection of multiple change-points in multivariate time series. Lith Math J 46:287–306

  • Lavielle M, Teyssière G (2007) Adaptive detection of multiple changepoints in asset price volatility. In: Teyssière G, Kirman AP (eds) Long memory in economics. Springer, Heidelberg, pp 129–156

  • Lu Q, Lund R, Lee TCM (2010) An MDL approach to the climate segmentation problem. Ann Appl Stat 4:299–319

  • Novotný P (2004) Optimal approach to data segmentation. In: Symposium COMPSTAT 2004 poster, Prague

  • Piterbarg VI (1996) Asymptotic methods in the theory of Gaussian processes and fields. American Mathematical Society, Providence

  • Yao Y-C (1988) Estimating the number of change-points via Schwarz criterion. Stat Probab Lett 6:181–189

  • Zhang NR, Siegmund D (2007) A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data. Biometrics 63:22–32

Author information


Corresponding author

Correspondence to Jaromír Antoch.

Additional information

This work was supported by grant GAČR 201/09/0755. Help from the IAP Research Network P7/06 of the Belgian Science Policy is also gratefully acknowledged. The authors thank Petr Lachout for fruitful discussions and two anonymous referees for stimulating comments and suggestions that helped to improve the paper.

Appendix: High level exceedance probability for Gaussian fields


Theorem 2

Let \(\{X(\varvec{x}),\varvec{x} \in R^m\}\) be a zero-mean, unit-variance Gaussian field defined on a compact set \(A \subset R^m\) with covariance function \(r(\varvec{x};\varvec{y})=EX(\varvec{x})X(\varvec{y}).\) Suppose that for \(\varvec{x} \in A,\varvec{y} \in A\) the covariance function \(r(\varvec{x};\varvec{y})\) has the following expansion:

$$\begin{aligned}&r(x_1, \dots ,x_m;x_1+h_1,\dots ,x_m+h_m)\\&\qquad =1 -c_1(x_1,\dots ,x_m)|h_1|-\dots -c_p(x_1,\dots ,x_m)|h_p|\\&- c_{p+1}(x_1,\dots ,x_m)h_{p+1}^2-\dots - c_m(x_1,\dots ,x_m)h_m^2\\&+\,o(|h_1|+\dots +|h_p|+h_{p+1}^2+\dots +h_m^2) \quad \text{ as} \quad h_1\rightarrow 0,\dots , h_m \rightarrow 0, \end{aligned}$$

where \(c_1(x_1,\dots ,x_m)\), ..., \(c_m(x_1,\dots ,x_m)\) are continuous functions on \(A\). If the Lebesgue measure satisfies \(mes\big \{\varvec{x}: c_1(\varvec{x})=0 \text{ or } \dots \text{ or } c_m(\varvec{x})=0\big \}=0\), then

$$\begin{aligned} P\big (\max _{\varvec{x} \in A} X(\varvec{x}) >u\big ) =\frac{1}{\pi ^{(m-p)/2}} I_A u^{m+p} \big (1-\Phi (u)\big ) \big (1+o(1)\big ) \quad \text{as} \quad u \rightarrow \infty , \end{aligned} \tag{11}$$

where \(I_A=\int \underset{A}{\dots }\int c_1(\varvec{x})\dots c_p(\varvec{x})\sqrt{c_{p+1}(\varvec{x})}\dots \ \sqrt{c_m(\varvec{x})}dx_1\dots dx_m\).

Proof

The theorem is a slight modification of Theorem 7.1 of Piterbarg (1996); see also Jarušková and Piterbarg (2011). \(\square \)
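To illustrate how formula (11) can yield approximate critical values, the following Python sketch evaluates its leading term and inverts it by bisection. It is a sketch under stated assumptions, not the authors' procedure. The function names are hypothetical, the value of \(I_A\) must be supplied by the user (it depends on the functions \(c_1,\dots ,c_m\) and the set \(A\)), and the \(1+o(1)\) factor is dropped, so the result is meaningful only for large \(u\), that is, for small significance levels.

```python
import math


def exceedance_probability(u, m, p, integral_IA):
    """Leading term of formula (11), with the (1 + o(1)) factor dropped.

    m: dimension of the field, p: number of coordinates with |h| terms,
    integral_IA: the constant I_A, assumed to be computed elsewhere.
    """
    upper_tail = 0.5 * math.erfc(u / math.sqrt(2.0))  # 1 - Phi(u)
    return integral_IA * u ** (m + p) * upper_tail / math.pi ** ((m - p) / 2.0)


def approximate_critical_value(alpha, m, p, integral_IA, upper=40.0, steps=200):
    """Solve exceedance_probability(u) = alpha for u by bisection.

    The leading term decreases in u once u >= sqrt(m + p), so the search
    is restricted to that range.
    """
    lo, hi = math.sqrt(m + p), upper
    if exceedance_probability(lo, m, p, integral_IA) < alpha:
        raise ValueError("alpha is too large for the asymptotic formula")
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if exceedance_probability(mid, m, p, integral_IA) > alpha:
            lo = mid  # probability still above alpha: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative call with a made-up value I_A = 1 (not taken from the paper).
u_crit = approximate_critical_value(0.05, m=2, p=2, integral_IA=1.0)
print(u_crit, exceedance_probability(u_crit, 2, 2, 1.0))
```

The bisection starts at \(u=\sqrt{m+p}\) because the ratio of the standard normal density to \(1-\Phi (u)\) exceeds \(u\), which makes \(u^{m+p}\big (1-\Phi (u)\big )\) decreasing from that point on. If the leading term at the left endpoint is already below the requested level, the asymptotic formula cannot be inverted on this range and the sketch raises an error instead of returning a value.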


Cite this article

Antoch, J., Jarušková, D. Testing for multiple change points. Comput Stat 28, 2161–2183 (2013). https://doi.org/10.1007/s00180-013-0401-1
