Abstract
This paper describes an application of the logic programming paradigm to large-scale comparison of complete microbial genomes, each containing four million amino acid characters and approximately two thousand genes. We present algorithms and a Sicstus Prolog-based implementation that model genome comparison as bipartite graph matching to identify orthologs (genes in different genomes with the same function), groups of orthologous genes (orthologous genes in close proximity), and gene duplications. The application is modular, and integrates logic programming with Unix-based programming and a Prolog-based text-processing library developed during this project. The scheme has been successfully applied to compare eukaryotes such as yeast. The data generated by the software is being used by microbiologists and computational biologists to understand the regulation mechanisms and the metabolic pathways in microbial genomes.
This research was supported in part by a Research Council grant, Kent State University, Kent, Ohio, USA, and by the Deutsche Forschungsgemeinschaft, The German Federal Agency
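The abstract's framing of ortholog detection as bipartite graph matching can be illustrated with a minimal Python sketch. It is not the paper's Sicstus Prolog implementation, and it does not reproduce the paper's algorithm: it uses a mutual-best-hit heuristic as a stand-in for the matching step. The gene names, similarity scores, threshold, and function names are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical similarity scores between genes of two genomes.
# Keys are (gene_in_A, gene_in_B); values are made-up alignment scores.
Scores = Dict[Tuple[str, str], float]


def best_partner(scores: Scores, side: int) -> Dict[str, str]:
    """For each gene on one side of the bipartite graph, return its top-scoring partner."""
    best: Dict[str, Tuple[str, float]] = {}
    for pair, score in scores.items():
        gene, partner = pair[side], pair[1 - side]
        if gene not in best or score > best[gene][1]:
            best[gene] = (partner, score)
    return {gene: partner for gene, (partner, _) in best.items()}


def mutual_best_hits(scores: Scores, threshold: float = 0.0) -> Dict[str, str]:
    """Return pairs that are each other's best hit among edges scoring at least threshold."""
    filtered = {pair: s for pair, s in scores.items() if s >= threshold}
    a_to_b = best_partner(filtered, side=0)
    b_to_a = best_partner(filtered, side=1)
    return {a: b for a, b in a_to_b.items() if b_to_a.get(b) == a}


# Illustrative data only: three genes in genome A, three in genome B.
example_scores: Scores = {
    ("a1", "b1"): 90.0,
    ("a1", "b2"): 40.0,
    ("a2", "b2"): 75.0,
    ("a3", "b2"): 60.0,
    ("a3", "b3"): 20.0,
}

putative_orthologs = mutual_best_hits(example_scores, threshold=30.0)
print(putative_orthologs)
```

In this toy input, `a3` prefers `b2`, but `b2` prefers `a2`, so `a3` is left unmatched. A gene such as `a3`, which resembles an already matched gene, is the kind of case a fuller analysis might examine as a possible duplication.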
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Bansal, A.K., Bork, P. (1998). Applying Logic Programming to Derive Novel Functional Information of Genomes. In: Gupta, G. (eds) Practical Aspects of Declarative Languages. PADL 1999. Lecture Notes in Computer Science, vol 1551. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49201-1_19
DOI: https://doi.org/10.1007/3-540-49201-1_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65527-5
Online ISBN: 978-3-540-49201-6
eBook Packages: Springer Book Archive