Abstract
Hidden Markov Models (HMMs) are widely used to model discrete time series data, but the EM and Gibbs sampling methods used to estimate them are often slow or prone to getting stuck in local optima. A more recent class of reduced-dimension spectral methods for estimating HMMs has attractive theoretical properties, but its finite-sample behavior has not been well characterized. We introduce a new spectral model for HMM estimation and a corresponding spectral bilinear regression model, and we systematically compare them with a variety of competing simplified models, explaining when and why each method gives superior performance. Using regression to estimate HMMs has a number of advantages, allowing more powerful and flexible modeling.
© 2013 Springer-Verlag Berlin Heidelberg
Rodu, J., Foster, D.P., Wu, W., Ungar, L.H. (2013). Using Regression for Spectral Estimation of HMMs. In: Dediu, AH., Martín-Vide, C., Mitkov, R., Truthe, B. (eds) Statistical Language and Speech Processing. SLSP 2013. Lecture Notes in Computer Science, vol 7978. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39593-2_19
DOI: https://doi.org/10.1007/978-3-642-39593-2_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39592-5
Online ISBN: 978-3-642-39593-2
eBook Packages: Computer Science (R0)