Abstract
A genomic island (GI) is a segment of genomic sequence that has been horizontally transferred from another genome. The detection of genomic islands is extremely important to medical research. Most current computational approaches that use sequence composition to predict genomic islands suffer from low prediction accuracy. In this paper, we report, for the first time, that gene information and intergenic distance differ between genomic islands and non-genomic islands. Using these two sources together with sequence information, we trained on genomic island datasets from 113 genomes and developed a decision-tree-based bagging model for genomic island prediction. To test the performance of our approach, we applied it to three genomes: Salmonella typhimurium LT2, Streptococcus pyogenes MGAS315, and Escherichia coli O157:H7 str. Sakai. The performance metrics show that our approach outperforms other sequence-composition-based approaches. We conclude that incorporating gene information and intergenic distance could improve genomic island prediction accuracy. Our prediction software, Genomic Island Hunter (GIHunter), is available at http://www.esu.edu/cpsc/che_lab/software/GIHunter .
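To make the idea of a decision-tree-based bagging model concrete, the following is a minimal sketch in Python and not the GIHunter implementation. It assumes each candidate region is summarized by three hypothetical features (a sequence-composition deviation, a mean intergenic distance, and a count of mobility-related genes), and it substitutes depth-1 decision trees (stumps) for full decision trees to stay short. All function names and feature values are invented for illustration and say nothing about real genomes.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a depth-1 decision tree: the (feature, threshold) split with fewest errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for left_label in (0, 1):
                # Rows with value <= t get left_label, the rest get the other label.
                errors = sum(
                    (left_label if r[f] <= t else 1 - left_label) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left_label)
    _, f, t, left_label = best
    return f, t, left_label


def predict_stump(stump, row):
    f, t, left_label = stump
    return left_label if row[f] <= t else 1 - left_label


def train_bagging(rows, labels, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx])
        )
    return ensemble


def predict_bagging(ensemble, row):
    """Majority vote over all stumps in the ensemble."""
    votes = Counter(predict_stump(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]


# Toy training set (invented values): label 1 = genomic island, 0 = non-island.
# Columns: composition deviation, mean intergenic distance, mobility gene count.
X = [
    [0.010, 120.0, 0.0], [0.020, 150.0, 0.0],
    [0.015, 90.0, 0.0], [0.030, 110.0, 0.0],
    [0.080, 300.0, 1.0], [0.090, 280.0, 2.0],
    [0.070, 320.0, 1.0], [0.085, 310.0, 3.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

model = train_bagging(X, y)
print(predict_bagging(model, [0.100, 350.0, 2.0]))
print(predict_bagging(model, [0.005, 80.0, 0.0]))
```

Bagging reduces the variance of individual trees by averaging over many bootstrap resamples; an odd number of estimators avoids tied votes in the two-class case.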
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, H., Fazekas, J., Booth, M., Liu, Q., Che, D. (2011). An Integrative Approach for Genomic Island Prediction in Prokaryotic Genomes. In: Chen, J., Wang, J., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2011. Lecture Notes in Computer Science, vol 6674. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21260-4_38
DOI: https://doi.org/10.1007/978-3-642-21260-4_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21259-8
Online ISBN: 978-3-642-21260-4
eBook Packages: Computer Science (R0)