
Canonical correlation for principal components of time series

  • Original Paper
  • Published in: Computational Statistics

Abstract

With contemporary data collection capacity, data sets containing large numbers of different multivariate time series relating to a common entity (e.g., fMRI, financial stocks) are becoming more prevalent. A recurring question is whether there are patterns or groups of series within the larger data set (e.g., disease patterns in brain scans; mining stocks that are internally similar yet distinct from banking stocks). There is a relatively large body of literature on clustering methods for univariate and multivariate time series, though most of these methods do not use the time dependencies inherent to time series. This paper develops an exploratory data methodology which, in addition to the time dependencies, simultaneously uses the dependency information between the S series themselves and the dependency information between the p variables within each series, while still retaining the distinctiveness of the two types of variables. This is achieved by combining the principles of canonical correlation analysis and principal component analysis for time series to obtain a new type of covariance/correlation matrix; a principal component analysis of this matrix produces a so-called “principal component time series”. The results are illustrated on two data sets.
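To make the idea of feeding a canonical-correlation-based matrix into a principal component analysis more concrete, the following is a minimal sketch under strong simplifying assumptions; it is not the authors' estimator. It assumes the data are stored as an array of shape (S, T, p), uses only the lag-0 first canonical correlation between each pair of series (so it ignores the time dependencies that the paper exploits), and the function names `first_canonical_correlation`, `canonical_similarity_matrix`, and `principal_components` are hypothetical. The simulated data are for illustration only.

```python
import numpy as np


def first_canonical_correlation(x, y):
    """Largest lag-0 canonical correlation between two (T, p) data blocks.

    Uses the QR/SVD approach: the singular values of Qx^T Qy are the
    canonical correlations of the centered blocks (full column rank assumed).
    """
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    qx, _ = np.linalg.qr(xc)
    qy, _ = np.linalg.qr(yc)
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return float(min(s[0], 1.0))  # guard against rounding slightly above 1


def canonical_similarity_matrix(series):
    """Build an S x S symmetric matrix of pairwise first canonical correlations.

    `series` has shape (S, T, p). Assumption of this sketch: the diagonal is 1.
    """
    n = series.shape[0]
    r = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            r[i, j] = r[j, i] = first_canonical_correlation(series[i], series[j])
    return r


def principal_components(r):
    """Eigen-decomposition of a symmetric matrix, eigenvalues in descending order."""
    vals, vecs = np.linalg.eigh(r)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]


# Simulated example: two groups of three series, each series with p = 3 variables.
rng = np.random.default_rng(0)
S, T, p = 6, 200, 3
group_a = rng.standard_normal((T, p))
group_b = rng.standard_normal((T, p))
series = np.stack([
    (group_a if s < 3 else group_b) + 0.5 * rng.standard_normal((T, p))
    for s in range(S)
])

R = canonical_similarity_matrix(series)
eigvals, eigvecs = principal_components(R)
```

In this toy setting, series from the same group have much larger entries in `R` than series from different groups, and the leading eigenvectors reflect that grouping. A matrix assembled pairwise in this way is not guaranteed to be positive semidefinite, so in practice its eigenvalues should be checked before it is treated as a correlation matrix.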



Acknowledgments

The authors are grateful to anonymous referees and the editor for helpful suggestions which considerably improved the text.


Corresponding author

Correspondence to L. Billard.


Cite this article

Samadi, S.Y., Billard, L., Meshkani, M.R. et al. Canonical correlation for principal components of time series. Comput Stat 32, 1191–1212 (2017). https://doi.org/10.1007/s00180-016-0667-1


  • DOI: https://doi.org/10.1007/s00180-016-0667-1
