Comparing a Hidden Markov Model and a Stochastic Context-Free Grammar

Jagota, Arun; Lyngsø, Rune B.; Pedersen, Christian N. S.

doi:10.1007/3-540-44696-6_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2149)

Included in the following conference series:

International Workshop on Algorithms in Bioinformatics

Abstract

Stochastic models are commonly used in bioinformatics, e.g., hidden Markov models for modeling sequence families or stochastic context-free grammars for modeling RNA secondary structure formation. Comparing data is a common task in bioinformatics, and it is thus natural to consider how to compare stochastic models. In this paper we present the first study of the problem of comparing a hidden Markov model and a stochastic context-free grammar. We describe how to compute their co-emission (or collision) probability, i.e., the probability that they independently generate the same sequence. We also consider the related problem of finding a run through a hidden Markov model and a derivation in a grammar that generate the same sequence and have maximal joint probability, using a generalization of the CYK algorithm for parsing a sequence by a stochastic context-free grammar. We illustrate the methods by an experiment on RNA secondary structures.
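
To make the definition of co-emission probability concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm of the paper. It enumerates every sequence up to a length bound and sums the product of the two models' probabilities for that sequence, which gives a lower bound on the co-emission probability. The toy model parameters, the Chomsky-normal-form restriction on the grammar, and all names (`hmm_prob`, `scfg_prob`, `truncated_coemission`) are assumptions made for illustration only.

```python
from itertools import product

ALPHABET = "ab"

# Hypothetical toy HMM: one emitting state and an implicit end state.
# For each state, outgoing transitions plus the end probability sum to 1.
HMM_START = [1.0]
HMM_TRANS = [[0.5]]
HMM_END = [0.5]
HMM_EMIT = [{"a": 0.5, "b": 0.5}]

# Hypothetical toy SCFG in Chomsky normal form: S -> S S | a | b
SCFG_BINARY = {"S": [("S", "S", 0.3)]}
SCFG_UNARY = {"S": {"a": 0.4, "b": 0.3}}
SCFG_START = "S"


def hmm_prob(seq):
    """Forward algorithm: probability that the HMM emits exactly seq."""
    if not seq:
        return 0.0
    states = range(len(HMM_START))
    f = [HMM_START[i] * HMM_EMIT[i].get(seq[0], 0.0) for i in states]
    for sym in seq[1:]:
        f = [
            sum(f[i] * HMM_TRANS[i][j] for i in states) * HMM_EMIT[j].get(sym, 0.0)
            for j in states
        ]
    return sum(f[i] * HMM_END[i] for i in states)


def scfg_prob(seq):
    """Inside algorithm: probability that the grammar derives exactly seq."""
    n = len(seq)
    if n == 0:
        return 0.0
    nonterminals = set(SCFG_BINARY) | set(SCFG_UNARY)
    inside = {}  # (i, j, A) -> probability that A derives seq[i:j]
    for i, sym in enumerate(seq):
        for a in nonterminals:
            inside[(i, i + 1, a)] = SCFG_UNARY.get(a, {}).get(sym, 0.0)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for a in nonterminals:
                total = 0.0
                for left, right, p in SCFG_BINARY.get(a, []):
                    for k in range(i + 1, j):
                        total += (
                            p
                            * inside.get((i, k, left), 0.0)
                            * inside.get((k, j, right), 0.0)
                        )
                inside[(i, j, a)] = total
    return inside[(0, n, SCFG_START)]


def truncated_coemission(max_len):
    """Sum of P_HMM(s) * P_SCFG(s) over all non-empty s with len(s) <= max_len."""
    total = 0.0
    for n in range(1, max_len + 1):
        for symbols in product(ALPHABET, repeat=n):
            s = "".join(symbols)
            total += hmm_prob(s) * scfg_prob(s)
    return total


if __name__ == "__main__":
    for bound in range(1, 7):
        print(bound, truncated_coemission(bound))
```

The enumeration grows exponentially with the length bound, so this sketch is only useful for checking the definition on very small models.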

Supported by grants from Carlsbergfondet and the Program in Mathematics and Molecular Biology

Partially supported by the IST Programme of the EU under contract number IST-1999-14186 (ALCOM-FT)

Author information

Authors and Affiliations

Baskin Center for Computer Science and Engineering, University of California at Santa Cruz, Santa Cruz, CA, 95064, USA
Arun Jagota & Rune B. Lyngsø
Basic Research in Computer Science (BRICS), Department of Computer Science, University of Aarhus, Ny Munkegade, DK-8000, Århus C, DK
Christian N. S. Pedersen

Editor information

Editors and Affiliations

LIRMM, 161 rue Ada, 34392, Montpellier, France
Olivier Gascuel
Department of Computer Science, University of New Mexico, Albuquerque, NM, 87131, USA
Bernard M. E. Moret

Cite this paper

Jagota, A., Lyngsø, R.B., Pedersen, C.N.S. (2001). Comparing a Hidden Markov Model and a Stochastic Context-Free Grammar. In: Gascuel, O., Moret, B.M.E. (eds) Algorithms in Bioinformatics. WABI 2001. Lecture Notes in Computer Science, vol 2149. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44696-6_6

DOI: https://doi.org/10.1007/3-540-44696-6_6
Published: 17 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42516-8
Online ISBN: 978-3-540-44696-5
eBook Packages: Springer Book Archive