
Tensor Decompositions for Learning Latent Variable Models (A Survey for ALT)

  • Conference paper
  • Conference: Algorithmic Learning Theory (ALT 2015)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9355)

Abstract

This note is a short version of [1], intended as a survey for the 2015 Algorithmic Learning Theory (ALT) conference.

This work considers a computationally and statistically efficient parameter estimation method for a wide class of latent variable models—including Gaussian mixture models, hidden Markov models, and latent Dirichlet allocation—which exploits a certain tensor structure in their low-order observable moments (typically, of second- and third-order). Specifically, parameter estimation is reduced to the problem of extracting a certain (orthogonal) decomposition of a symmetric tensor derived from the moments; this decomposition can be viewed as a natural generalization of the singular value decomposition for matrices. Although tensor decompositions are generally intractable to compute, the decomposition of these specially structured tensors can be efficiently obtained by a variety of approaches, including power iterations and maximization approaches (similar to the case of matrices). A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin’s perturbation theorem for the singular vectors of matrices. This implies a robust and computationally tractable estimation approach for several popular latent variable models.
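The power iteration described above can be made concrete. The following is a minimal sketch (not the paper's implementation, and without the robustness analysis): given a symmetric third-order tensor with an orthogonal decomposition \(T = \sum_i \lambda_i\, v_i^{\otimes 3}\), repeated application of the map \(\theta \mapsto T(I, \theta, \theta)/\|T(I, \theta, \theta)\|\) converges to an eigenvector \(v_i\), and deflation (subtracting the recovered rank-one term) exposes the remaining components. All function and parameter names here are illustrative, not from the paper.

```python
import numpy as np

def tensor_power_method(T, n_components, n_restarts=10, n_iters=100, rng=None):
    """Sketch of the orthogonal tensor power method with deflation.

    Assumes T is a symmetric d x d x d tensor admitting an orthogonal
    decomposition T = sum_i lambda_i v_i (x) v_i (x) v_i with lambda_i > 0.
    This omits the robustness machinery analyzed in the paper.
    """
    rng = np.random.default_rng(rng)
    d = T.shape[0]
    eigvals, eigvecs = [], []
    for _ in range(n_components):
        best_val, best_vec = -np.inf, None
        for _ in range(n_restarts):
            # random restart on the unit sphere
            theta = rng.standard_normal(d)
            theta /= np.linalg.norm(theta)
            for _ in range(n_iters):
                # power update: theta <- T(I, theta, theta), then normalize
                theta = np.einsum('ijk,j,k->i', T, theta, theta)
                theta /= np.linalg.norm(theta)
            # eigenvalue estimate: T(theta, theta, theta)
            lam = np.einsum('ijk,i,j,k->', T, theta, theta, theta)
            if lam > best_val:
                best_val, best_vec = lam, theta
        eigvals.append(best_val)
        eigvecs.append(best_vec)
        # deflate: subtract the recovered rank-one component
        T = T - best_val * np.einsum('i,j,k->ijk', best_vec, best_vec, best_vec)
    return np.array(eigvals), np.array(eigvecs)
```

On an exactly orthogonally decomposable tensor this recovers each \((\lambda_i, v_i)\) pair; the survey's contribution is showing the same scheme remains accurate when \(T\) is only an empirical-moment estimate, i.e. a perturbation of such a tensor.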


References

  1. Anandkumar, A., Ge, R., Hsu, D., Kakade, S.M., Telgarsky, M.: Tensor decompositions for learning latent variable models. Journal of Machine Learning Research 15, 2773–2832 (2014)

  2. Anandkumar, A., Foster, D.P., Hsu, D., Kakade, S.M., Liu, Y.-K.: A spectral algorithm for latent Dirichlet allocation. In: Advances in Neural Information Processing Systems 25 (2012)

  3. Anandkumar, A., Hsu, D., Huang, F., Kakade, S.M.: Learning mixtures of tree graphical models. In: Advances in Neural Information Processing Systems 25 (2012)

  4. Anandkumar, A., Hsu, D., Kakade, S.M.: A method of moments for mixture models and hidden Markov models. In: Twenty-Fifth Annual Conference on Learning Theory, vol. 23, pp. 33.1–33.34 (2012)

  5. Austin, T.: On exchangeable random variables and the statistics of large graphs and hypergraphs. Probability Surveys 5, 80–145 (2008)

  6. Cardoso, J.-F.: Super-symmetric decomposition of the fourth-order cumulant tensor. Blind identification of more sources than sensors. In: ICASSP-91, 1991 International Conference on Acoustics, Speech, and Signal Processing, pp. 3109–3112. IEEE (1991)

  7. Cardoso, J.-F., Comon, P.: Independent component analysis, a survey of some algebraic methods. In: IEEE International Symposium on Circuits and Systems, pp. 93–96 (1996)

  8. Cattell, R.B.: Parallel proportional profiles and other principles for determining the choice of factors by rotation. Psychometrika 9(4), 267–283 (1944)

  9. Chang, J.T.: Full reconstruction of Markov models on evolutionary trees: identifiability and consistency. Mathematical Biosciences 137, 51–73 (1996)

  10. Comon, P.: Independent component analysis, a new concept? Signal Processing 36(3), 287–314 (1994)

  11. Comon, P., Golub, G., Lim, L.-H., Mourrain, B.: Symmetric tensors and symmetric tensor rank. SIAM Journal on Matrix Analysis and Applications 30(3), 1254–1279 (2008)

  12. Comon, P., Jutten, C.: Handbook of Blind Source Separation: Independent Component Analysis and Applications. Academic Press, Elsevier (2010)

  13. Delfosse, N., Loubaton, P.: Adaptive blind separation of independent sources: a deflation approach. Signal Processing 45(1), 59–83 (1995)

  14. Frieze, A., Jerrum, M., Kannan, R.: Learning linear transformations. In: Thirty-Seventh Annual Symposium on Foundations of Computer Science, pp. 359–368 (1996)

  15. Golub, G.H., Van Loan, C.F.: Matrix Computations. Johns Hopkins University Press (1996)

  16. Hsu, D., Kakade, S.M.: Learning mixtures of spherical Gaussians: moment methods and spectral decompositions. In: Fourth Innovations in Theoretical Computer Science (2013)

  17. Hsu, D., Kakade, S.M., Liang, P.: Identifiability and unmixing of latent parse trees. In: Advances in Neural Information Processing Systems 25 (2012)

  18. Hsu, D., Kakade, S.M., Zhang, T.: A spectral algorithm for learning hidden Markov models. Journal of Computer and System Sciences 78(5), 1460–1480 (2012)

  19. Hyvärinen, A., Oja, E.: Independent component analysis: algorithms and applications. Neural Networks 13(4–5), 411–430 (2000)

  20. Kolda, T.G., Mayo, J.R.: Shifted power method for computing tensor eigenpairs. SIAM Journal on Matrix Analysis and Applications 32(4), 1095–1124 (2011)

  21. De Lathauwer, L., De Moor, B., Vandewalle, J.: On the best rank-1 and rank-\(({R}_1, {R}_2, \ldots, {R}_n)\) approximation of higher-order tensors. SIAM Journal on Matrix Analysis and Applications 21(4), 1324–1342 (2000)

  22. Le Cam, L.: Asymptotic Methods in Statistical Decision Theory. Springer (1986)

  23. Lim, L.-H.: Singular values and eigenvalues of tensors: a variational approach. In: Proceedings of the IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, vol. 1, pp. 129–132 (2005)

  24. MacQueen, J.B.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press (1967)

  25. McCullagh, P.: Tensor Methods in Statistics. Chapman and Hall (1987)

  26. Moitra, A., Valiant, G.: Settling the polynomial learnability of mixtures of Gaussians. In: Fifty-First Annual IEEE Symposium on Foundations of Computer Science, pp. 93–102 (2010)

  27. Mossel, E., Roch, S.: Learning nonsingular phylogenies and hidden Markov models. Annals of Applied Probability 16(2), 583–614 (2006)

  28. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer (1999)

  29. Pearson, K.: Contributions to the mathematical theory of evolution. Philosophical Transactions of the Royal Society of London A 185, 71–110 (1894)

  30. Qi, L.: Eigenvalues of a real supersymmetric tensor. Journal of Symbolic Computation 40(6), 1302–1324 (2005)

  31. Stegeman, A., Comon, P.: Subtracting a best rank-1 approximation may increase tensor rank. Linear Algebra and its Applications 433, 1276–1300 (2010)

  32. Wedin, P.: Perturbation bounds in connection with singular value decomposition. BIT Numerical Mathematics 12(1), 99–111 (1972)

  33. Zhang, T., Golub, G.: Rank-one approximation to high order tensors. SIAM Journal on Matrix Analysis and Applications 23, 534–550 (2001)


Author information

Correspondence to Sham M. Kakade.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Anandkumar, A., Ge, R., Hsu, D., Kakade, S.M., Telgarsky, M. (2015). Tensor Decompositions for Learning Latent Variable Models (A Survey for ALT). In: Chaudhuri, K., Gentile, C., Zilles, S. (eds) Algorithmic Learning Theory. ALT 2015. Lecture Notes in Computer Science, vol 9355. Springer, Cham. https://doi.org/10.1007/978-3-319-24486-0_2


  • DOI: https://doi.org/10.1007/978-3-319-24486-0_2


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24485-3

  • Online ISBN: 978-3-319-24486-0

  • eBook Packages: Computer Science (R0)
