Comparing a Hidden Markov Model and a Stochastic Context-Free Grammar

  • Conference paper
Algorithms in Bioinformatics (WABI 2001)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2149))

Abstract

Stochastic models are commonly used in bioinformatics, e.g., hidden Markov models for modeling sequence families or stochastic context-free grammars for modeling RNA secondary structure formation. Comparing data is a common task in bioinformatics, and it is thus natural to consider how to compare stochastic models. In this paper we present the first study of the problem of comparing a hidden Markov model and a stochastic context-free grammar. We describe how to compute their co-emission (or collision) probability, i.e., the probability that they independently generate the same sequence. We also consider the related problem of finding a run through a hidden Markov model and a derivation in a grammar that generate the same sequence and have maximal joint probability; we solve it by a generalization of the CYK algorithm for parsing a sequence by a stochastic context-free grammar. We illustrate the methods by an experiment on RNA secondary structures.
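
To make the definition concrete: the co-emission probability of a hidden Markov model M and a grammar G is the sum, over all sequences s, of P_M(s) · P_G(s). The sketch below is not the algorithm of the paper. It simply enumerates every sequence up to a length bound and multiplies the forward probability under a toy HMM by the inside probability under a toy grammar in Chomsky normal form. The function names (`hmm_prob`, `scfg_prob`, `co_emission`), the explicit end probabilities of the HMM, and all model parameters are assumptions made for illustration. In general, truncating at `max_len` yields only a lower bound on the true value.

```python
from itertools import product


def hmm_prob(seq, start, trans, emit, end):
    """Forward algorithm: probability that the HMM emits exactly seq and stops."""
    if not seq:
        return 0.0
    states = range(len(start))
    fwd = [start[k] * emit[k].get(seq[0], 0.0) for k in states]
    for sym in seq[1:]:
        fwd = [sum(fwd[j] * trans[j][k] for j in states) * emit[k].get(sym, 0.0)
               for k in states]
    return sum(fwd[k] * end[k] for k in states)


def scfg_prob(seq, binary, terminal, start_symbol="S"):
    """Inside algorithm for a grammar in Chomsky normal form.

    binary maps (A, B, C) to P(A -> B C); terminal maps (A, a) to P(A -> a).
    """
    n = len(seq)
    if n == 0:
        return 0.0
    inside = {}  # (i, j, A) -> probability that A derives seq[i:j]
    for i, sym in enumerate(seq):
        for (lhs, t), p in terminal.items():
            if t == sym:
                inside[i, i + 1, lhs] = inside.get((i, i + 1, lhs), 0.0) + p
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for (lhs, left, right), p in binary.items():
                split_sum = sum(inside.get((i, k, left), 0.0) *
                                inside.get((k, j, right), 0.0)
                                for k in range(i + 1, j))
                if split_sum:
                    inside[i, j, lhs] = inside.get((i, j, lhs), 0.0) + p * split_sum
    return inside.get((0, n, start_symbol), 0.0)


def co_emission(alphabet, max_len, hmm, grammar):
    """Brute-force sum of P_hmm(s) * P_scfg(s) over all s with 1 <= |s| <= max_len."""
    total = 0.0
    for n in range(1, max_len + 1):
        for seq in product(alphabet, repeat=n):
            total += hmm_prob(seq, *hmm) * scfg_prob(seq, *grammar)
    return total


# Toy HMM (assumed): one state, emits 'a' or 'b' uniformly, stops with prob. 0.5.
hmm = ([1.0], [[0.5]], [{"a": 0.5, "b": 0.5}], [0.5])
# Toy grammar (assumed): S -> A B (0.5) | 'a' (0.5), A -> 'a', B -> 'b'.
grammar = ({("S", "A", "B"): 0.5},
           {("S", "a"): 0.5, ("A", "a"): 1.0, ("B", "b"): 1.0})

print(co_emission("ab", 4, hmm, grammar))
```

In this toy setting the grammar generates only "a" and "ab", each with probability 0.5, so the sum has just two non-zero terms. The enumeration is exponential in `max_len` and is meant only to illustrate the quantity being computed, not how to compute it efficiently.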

Supported by grants from Carlsbergfondet and the Program in Mathematics and Molecular Biology

Partially supported by the IST Programme of the EU under contract number IST-1999-14186 (ALCOM-FT)

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jagota, A., Lyngsø, R.B., Pedersen, C.N.S. (2001). Comparing a Hidden Markov Model and a Stochastic Context-Free Grammar. In: Gascuel, O., Moret, B.M.E. (eds) Algorithms in Bioinformatics. WABI 2001. Lecture Notes in Computer Science, vol 2149. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44696-6_6

  • DOI: https://doi.org/10.1007/3-540-44696-6_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42516-8

  • Online ISBN: 978-3-540-44696-5

  • eBook Packages: Springer Book Archive
