Set-Oriented Dimension Reduction: Localizing Principal Component Analysis Via Hidden Markov Models

  • Conference paper
Computational Life Sciences II (CompLife 2006)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4216)

Abstract

We present a method for simultaneous dimension reduction and metastability analysis of high-dimensional time series. The approach combines hidden Markov models (HMMs) with principal component analysis (PCA). We derive optimal estimators for the log-likelihood functional and employ the Expectation-Maximization (EM) algorithm for its numerical optimization. We demonstrate the performance of the method on a generic 102-dimensional example, apply the new HMM-PCA algorithm to a molecular dynamics simulation of 12-alanine in water, and interpret the results.
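The idea of localizing PCA to hidden states can be illustrated with a minimal sketch. It is not the estimator derived in the paper: it only shows a weighted PCA computed separately for each state, assuming state weights (for example HMM posterior probabilities from an EM iteration) are already available. The function names `local_pca` and `reconstruction_error`, and the parameters `gamma` and `m`, are hypothetical and chosen for illustration.

```python
import numpy as np


def local_pca(X, gamma, m):
    """Weighted PCA per hidden state (illustrative sketch, not the paper's estimator).

    X     : (T, n) array, the time series
    gamma : (T, K) array of state weights, assumed given (e.g. HMM posteriors)
    m     : number of principal directions kept per state
    Returns a list of K (mean, basis) pairs; each basis has shape (n, m).
    """
    models = []
    for k in range(gamma.shape[1]):
        w = gamma[:, k]
        total = w.sum()
        mu = w @ X / total                        # weighted mean of state k
        Xc = X - mu
        cov = (Xc * w[:, None]).T @ Xc / total    # weighted covariance of state k
        _, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
        basis = eigvecs[:, ::-1][:, :m]           # m dominant eigenvectors
        models.append((mu, basis))
    return models


def reconstruction_error(X, mu, basis):
    """Squared distance of each row of X to the affine subspace mu + span(basis)."""
    Xc = X - mu
    resid = Xc - Xc @ basis @ basis.T
    return np.sum(resid**2, axis=1)
```

With hard assignments (each row of `gamma` holding a single 1), this reduces to an ordinary PCA on the frames of each state; the per-state reconstruction error then indicates how well each local subspace describes a given frame.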

Supported in part by the DFG Research Center MATHEON, Berlin, and Microsoft Research Ltd., Cambridge, UK (Contract No. 2005-042).

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Horenko, I., Schmidt-Ehrenberg, J., Schütte, C. (2006). Set-Oriented Dimension Reduction: Localizing Principal Component Analysis Via Hidden Markov Models. In: Berthold, M.R., Glen, R.C., Fischer, I. (eds) Computational Life Sciences II. CompLife 2006. Lecture Notes in Computer Science, vol 4216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875741_8

  • DOI: https://doi.org/10.1007/11875741_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45767-1

  • Online ISBN: 978-3-540-45768-8

  • eBook Packages: Computer Science, Computer Science (R0)
