Improved Algorithms for Parsing ESLTAGs: A Grammatical Model Suitable for RNA Pseudoknots

Rajasekaran, Sanguthevar; Al Seesi, Sahar; Ammar, Reda

doi:10.1007/978-3-642-01551-9_14

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5542)

Included in the following conference series:

International Symposium on Bioinformatics Research and Applications

Abstract

Formal grammars have been employed in biology to solve various important problems. In particular, grammars have been used to model and predict RNA structures. Two such grammars are Simple Linear Tree Adjoining Grammars (SLTAGs) and Extended SLTAGs (ESLTAGs). The performance of techniques that employ grammatical formalisms depends critically on the efficiency of the underlying parsing algorithms. In this paper we present efficient algorithms for parsing SLTAGs and ESLTAGs. Our algorithm for SLTAG parsing takes O(min{m, n⁴}) time and O(min{m, n⁴}) space, where m is the number of entries that will ever be made in the matrix M (the matrix normally used by TAG parsing algorithms). Our algorithm for ESLTAG parsing takes O(n min{m, n⁴}) time and O(min{m, n⁴}) space. We show that these algorithms perform better in practice than the algorithms of Uemura et al. [19].
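
The space bound above depends on m, the number of entries actually made in M, rather than on the size of a fully allocated table. The following is a minimal sketch of that general idea only, not the authors' data structure, which the abstract does not describe. It assumes that n is the input length, that M is indexed by four positions i ≤ j ≤ k ≤ l (as in standard TAG parsing), and that an "entry" is one label stored in one cell. The names SparseMatrix and max_index_tuples are hypothetical.

```python
from math import comb


def max_index_tuples(n):
    """Number of index tuples 0 <= i <= j <= k <= l <= n, which grows as n^4."""
    return comb(n + 4, 4)


class SparseMatrix:
    """Four-index table that only stores cells that have received an entry."""

    def __init__(self):
        # Maps (i, j, k, l) to the set of labels recorded for that cell.
        self._cells = {}

    def add(self, i, j, k, l, label):
        """Record label in cell (i, j, k, l); return True if it is new."""
        if not (i <= j <= k <= l):
            raise ValueError("indices must satisfy i <= j <= k <= l")
        cell = self._cells.setdefault((i, j, k, l), set())
        if label in cell:
            return False
        cell.add(label)
        return True

    def get(self, i, j, k, l):
        """Return the labels stored in a cell (empty if never touched)."""
        return self._cells.get((i, j, k, l), frozenset())

    def entry_count(self):
        """Total number of entries made so far (the quantity called m here)."""
        return sum(len(cell) for cell in self._cells.values())


if __name__ == "__main__":
    table = SparseMatrix()
    table.add(0, 1, 2, 3, "A")
    table.add(0, 1, 2, 3, "B")
    print(table.entry_count(), "entries stored;",
          max_index_tuples(10), "index tuples possible for n = 10")
```

With a dictionary keyed by index tuples, memory grows with the number of entries made instead of with the n⁴ index space, which is the kind of saving a bound of the form O(min{m, n⁴}) expresses when m is much smaller than n⁴.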

References

[1] Al Seesi, S., Rajasekaran, S., Ammar, R.: Pseudoknot identification through learning TAG_RNAs. In: Chetty, M., Ngom, A., Ahmad, S. (eds.) PRIB 2008. LNCS (LNBI), vol. 5265, pp. 132–143. Springer, Heidelberg (2008)
[2] Al Seesi, S., Rajasekaran, S., Ammar, R.: RNA pseudoknot folding through inference and identification using TAG_RNAs. In: International Conference on Bioinformatics and Computational Biology, BiCob 2009 (2009) (to appear)
[3] van Batenburg, F.H.D., Gultyaev, A.P., Pleij, C.W.A., Ng, J., Oliehoek, J.: Pseudobase: a database with RNA pseudoknots. Nucl. Acids Res. 28(1), 201–204 (2000)
[4] Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions. Journal of Symbolic Computation 9, 251–280 (1990); Also in: Proc. 19th Annual ACM Symposium on Theory of Computing, pp. 1–6 (1987)
[5] Griffiths-Jones, S., Moxon, S., Marshall, M., Khanna, A., Eddy, S.R., Bateman, A.: Rfam: Annotating non-coding RNAs in complete genomes. Nucl. Acids Res. 33, D121–D124 (2005)
[6] Guan, Y., Hotz, G.: An O(n⁵) recognition algorithm for coupled parenthesis rewriting systems. In: Proc. TAG+ Workshop. University of Pennsylvania, Philadelphia (1992)
[7] Harbusch, K.: An efficient parsing algorithm for tree adjoining grammars. In: Proc. 28th Meeting of the Association for Computational Linguistics, Pittsburgh, pp. 284–291 (1990)
[8] Joshi, A.K., Levy, L.S., Takahashi, M.: Tree adjunct grammars. Journal of Computer and System Sciences 10(1), 136–163 (1975)
[9] Matsui, H., Sato, K., Sakakibara, Y.: Pair stochastic tree adjoining grammars for aligning and predicting pseudoknot RNA structures. Bioinformatics 21(11), 2611–2617 (2005)
[10] Nurkkala, T., Kumar, V.: A parallel parsing algorithm for natural language using tree adjoining grammar. In: Proc. 8th International Parallel Processing Symposium (1994)
[11] Paillart, J.C., Skripkin, E., Ehresmann, B., Ehresmann, C., Marquet, R.: In vitro evidence for a long range pseudoknot in the 5′-untranslated and matrix coding regions of HIV-1 genomic RNA. J. Biol. Chem. 277, 5995–6004 (2002)
[12] Palis, M., Shende, S., Wei, D.S.L.: An optimal linear time parallel parser for tree adjoining languages. SIAM Journal on Computing 19(1), 1–31 (1990)
[13] Partee, B.H., Ter Meulen, A., Wall, R.E.: Studies in Linguistics and Philosophy, vol. 30. Kluwer Academic Publishers, Dordrecht (1990)
[14] Rajasekaran, S.: TAL parsing in o(n⁶) time. SIAM Journal on Computing 25(4), 862–873 (1996)
[15] Rajasekaran, S., Yooseph, S.: TAL parsing in O(M(n²)) time. Journal of Computer and System Sciences 56(1), 83–89 (1998)
[16] Sakakibara, Y., Brown, M., Hughey, R., Mian, I.S., Sjolander, K., Underwood, R.C., Haussler, D.: Stochastic context-free grammars for tRNA modeling. Nucl. Acids Res. 22, 5112–5120 (1994)
[17] Satta, G.: Tree adjoining grammar parsing and Boolean matrix multiplication. In: Proc. 32nd Meeting of the Association for Computational Linguistics (1994)
[18] Schabes, Y., Joshi, A.K.: An Earley-type parsing algorithm for tree adjoining grammars. In: Proc. 26th Meeting of the Association for Computational Linguistics, pp. 258–269 (1988)
[19] Uemura, Y., Hasegawa, A., Kobayashi, S., Yokomori, T.: Tree adjoining grammars for RNA structure prediction. Theoretical Computer Science 210, 277–303 (1999)
[20] Vijayashanker, K., Joshi, A.K.: Some computational properties of tree adjoining grammars. In: Proc. 23rd Meeting of the Association for Computational Linguistics, pp. 82–93 (1985)
[21] Williams, K.P.: The tmRNA website: Invasion by an intron. Nucl. Acids Res. 30(1), 179–182 (2002)

Author information

Authors and Affiliations

Department of Computer Science, University of Connecticut, Storrs, CT 06269-2155, USA
Sanguthevar Rajasekaran, Sahar Al Seesi & Reda Ammar

Editor information

Editors and Affiliations

Computer Science & Engineering Department, University of Connecticut, 371 Fairfield Way, Unit 2155, Storrs, CT 06269, USA
Ion Măndoiu
Bioinformatics Research Group (BioRG), School of Computing and Information Sciences, Florida International University, 11200 SW 8th Street, Room ECS254, University Park, Miami, FL 33199, USA
Giri Narasimhan
Department of Computer Science, Georgia State University, P.O. Box 3994, Atlanta, GA 30302-3994, USA
Yanqing Zhang

About this paper

Cite this paper

Rajasekaran, S., Al Seesi, S., Ammar, R. (2009). Improved Algorithms for Parsing ESLTAGs: A Grammatical Model Suitable for RNA Pseudoknots. In: Măndoiu, I., Narasimhan, G., Zhang, Y. (eds) Bioinformatics Research and Applications. ISBRA 2009. Lecture Notes in Computer Science, vol 5542. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01551-9_14

DOI: https://doi.org/10.1007/978-3-642-01551-9_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01550-2
Online ISBN: 978-3-642-01551-9
eBook Packages: Computer Science, Computer Science (R0)
