Genome-Wide Prokaryotic Promoter Recognition Based on Sequence Alignment Kernel

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2810)

Abstract

This paper presents an application of the Sequence Alignment Kernel to the recognition of prokaryotic promoters with transcription start sites (TSS). An algorithm for computing this kernel in quadratic time is described. Using this algorithm, a “promoter map” of the E. coli genome has been computed: a curve reflecting the likelihood that each base of a given genomic sequence is a TSS. A viewer showing the likelihood curve together with the positions of known and putative genes and known TSS has also been developed and made available online.
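To make the quadratic-time idea concrete, here is a minimal Python sketch of one way a sum-over-alignments score can be computed by dynamic programming in O(nm) time for sequences of lengths n and m. It is not the authors' algorithm: the function name `alignment_kernel`, the recurrence, and the `match`, `mismatch`, and `gap` weights are illustrative assumptions, and the sketch does not verify that the resulting function is positive definite.

```python
def alignment_kernel(x, y, match=2.0, mismatch=0.5, gap=0.25):
    """Sum, over all global alignment paths of x and y, of the product of
    per-step weights (hypothetical weights, chosen only for illustration).

    Runs in O(len(x) * len(y)) time and O(len(y)) memory.
    """
    n, m = len(x), len(y)
    # Row 0: aligning the empty prefix of x with y[:j] needs j gap steps.
    prev = [gap ** j for j in range(m + 1)]
    for i in range(1, n + 1):
        # Column 0: aligning x[:i] with the empty prefix of y needs i gap steps.
        cur = [gap ** i] + [0.0] * m
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            cur[j] = (
                s * prev[j - 1]      # x[i-1] aligned against y[j-1]
                + gap * prev[j]      # x[i-1] aligned against a gap
                + gap * cur[j - 1]   # y[j-1] aligned against a gap
            )
        prev = cur
    return prev[m]
```

Calling `alignment_kernel("ACGT", "ACGT")` gives a larger value than `alignment_kernel("ACGT", "TGCA")`, because the all-match diagonal path dominates the sum. A kernel value of this kind could then be fed to a kernel classifier that scores a window around each genomic position.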

Although visual analysis of the promoter regions is very intuitive, we propose an automatic genome-wide promoter prediction scheme that simplifies the routine of visually checking the whole promoter map. Computational efficiency and speed issues are also discussed.
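The abstract does not say how the automatic scheme works. As a purely hypothetical illustration of turning a per-base likelihood curve into a list of candidate TSS positions, the sketch below picks local maxima above a threshold and keeps only the strongest peak within a minimum spacing. The function name `call_putative_tss` and the parameters `threshold` and `min_gap` are assumptions, not taken from the paper.

```python
def call_putative_tss(scores, threshold=0.5, min_gap=3):
    """Return sorted positions of local maxima in `scores` that reach
    `threshold`, keeping higher peaks first and dropping any peak closer
    than `min_gap` positions to an already accepted one.

    Hypothetical peak-picking rule, not the scheme described in the paper.
    """
    last = len(scores) - 1
    candidates = [
        i
        for i, s in enumerate(scores)
        if s >= threshold
        and (i == 0 or scores[i - 1] <= s)   # not lower than left neighbour
        and (i == last or scores[i + 1] < s)  # strictly higher than right neighbour
    ]
    # Visit the strongest peaks first so that they win spacing conflicts.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - k) >= min_gap for k in kept):
            kept.append(i)
    return sorted(kept)
```

With the curve `[0.1, 0.2, 0.9, 0.3, 0.6, 0.2, 0.1, 0.8, 0.7]` and the default settings, positions 2 and 7 are kept, while the weaker peak at position 4 is dropped because it lies within `min_gap` of position 2.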

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gordon, L., Chervonenkis, A.Y., Gammerman, A.J., Shahmuradov, I.A., Solovyev, V.V. (2003). Genome-Wide Prokaryotic Promoter Recognition Based on Sequence Alignment Kernel. In: Berthold, M.R., Lenz, H.-J., Bradley, E., Kruse, R., Borgelt, C. (eds) Advances in Intelligent Data Analysis V. IDA 2003. Lecture Notes in Computer Science, vol 2810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45231-7_36

  • DOI: https://doi.org/10.1007/978-3-540-45231-7_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40813-0

  • Online ISBN: 978-3-540-45231-7

  • eBook Packages: Springer Book Archive
