Clustering of interval time series

Statistics and Computing

Abstract

Interval time series occur when real intervals of some variable of interest are registered as an ordered sequence along time. We address the problem of clustering interval time series (ITS), for which different approaches are proposed. First, clustering is performed based on point-to-point comparisons. Time-domain and wavelet features also serve as clustering variables in alternative approaches. Furthermore, autocorrelation matrix functions, gathering the autocorrelation and cross-correlation functions of the ITS upper and lower bounds, may be compared using adequate distances (e.g. the Frobenius distance) and used for clustering ITS. An improved procedure to determine the autocorrelation function of ITS is proposed, which also serves as a basis for clustering. The different alternative approaches are explored and their performances compared for ITS simulated under different setups. An application to sea level daily ranges, observed at different locations in Australia, illustrates the proposed methods.
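As a rough illustration of the autocorrelation-matrix idea mentioned in the abstract, the following Python sketch builds, for each lag, a 2×2 matrix holding the sample autocorrelations and cross-correlations of the lower and upper bounds, and compares two ITS through a Frobenius-type distance. The function names (`lagged_corr_matrix`, `acf_matrix_distance`, `simulate_its`), the choice to sum squared Frobenius norms over lags 1 to `max_lag`, and the simulated AR(1) centres are assumptions made for this example; they are not taken from the article and need not match the authors' estimator or their improved procedure.

```python
import numpy as np


def lagged_corr_matrix(lower, upper, lag):
    """2x2 sample correlation matrix at `lag` for the (lower, upper) bounds.

    Entry [i, j] correlates series i at time t with series j at time t + lag,
    so the diagonal holds autocorrelations and the off-diagonal holds
    cross-correlations between the two bounds.
    """
    x = np.column_stack([lower, upper]).astype(float)
    n = x.shape[0]
    xc = x - x.mean(axis=0)                   # centre each bound series
    sd = np.sqrt((xc ** 2).sum(axis=0) / n)   # standard deviations (divisor n)
    gamma = xc[: n - lag].T @ xc[lag:] / n    # lagged covariance matrix
    return gamma / np.outer(sd, sd)


def acf_matrix_distance(its_a, its_b, max_lag):
    """Frobenius-type distance between two ITS (assumed form, see lead-in).

    Each ITS is a (lower, upper) pair of equal-length arrays. Squared
    Frobenius norms of the lag-wise matrix differences are summed over
    lags 1..max_lag and the square root is returned.
    """
    total = 0.0
    for lag in range(1, max_lag + 1):
        diff = lagged_corr_matrix(*its_a, lag) - lagged_corr_matrix(*its_b, lag)
        total += np.linalg.norm(diff, "fro") ** 2
    return float(np.sqrt(total))


def simulate_its(phi, n, rng):
    """Hypothetical toy generator: AR(1) centre plus a random half-range."""
    centre = np.zeros(n)
    for t in range(1, n):
        centre[t] = phi * centre[t - 1] + rng.normal()
    half_range = np.abs(rng.normal(size=n)) + 0.1
    return centre - half_range, centre + half_range


rng = np.random.default_rng(0)
its_a = simulate_its(0.8, 300, rng)    # two series with similar dynamics
its_b = simulate_its(0.8, 300, rng)
its_c = simulate_its(-0.5, 300, rng)   # one series with different dynamics

d_ab = acf_matrix_distance(its_a, its_b, max_lag=5)
d_ac = acf_matrix_distance(its_a, its_c, max_lag=5)
print(f"d(a, b) = {d_ab:.3f}, d(a, c) = {d_ac:.3f}")
```

A matrix of such pairwise distances could then be passed to any standard distance-based clustering routine; which algorithm the article uses is not stated in the abstract.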

Notes

  1. In what follows we omit the index t, for simplifying the notation.

Acknowledgements

The work of P. Teles and P. Brito is financed by the ERDF (European Regional Development Fund) through the Operational Programme for Competitiveness and Internationalisation (COMPETE 2020 Programme) within project “POCI-01-0145-FEDER-006961”, and by National Funds through the FCT, Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology), as part of project UID/EEA/50014/2013. We thank the associate editor and reviewers for their helpful comments and suggestions.

Author information

Corresponding author

Correspondence to Elizabeth Ann Maharaj.

About this article

Cite this article

Maharaj, E.A., Teles, P. & Brito, P. Clustering of interval time series. Stat Comput 29, 1011–1034 (2019). https://doi.org/10.1007/s11222-018-09851-z

  • DOI: https://doi.org/10.1007/s11222-018-09851-z
