Computational Challenges in Characterization of Bacteria and Bacteria-Host Interactions Based on Genomic Data

  • Survey
  • Journal of Computer Science and Technology

Abstract

With the rapid development of next-generation sequencing technologies, bacterial identification has become an essential step in processing genomic data, especially metagenomic data. Many computational methods have been developed to address the problems in bacterial identification, and some of them are widely used. In this article, we review the algorithms behind these methods, discuss their drawbacks, and propose future computational methods that use genomic data to characterize bacteria. In addition, we address two specific computational problems in bacterial identification, namely the detection of host-specific bacteria and the detection of disease-associated bacteria, and offer potential solutions as a starting point for those interested in the area.


Author information

Corresponding author

Correspondence to Dong Xu.

Additional information

This work was partially supported by the National Institutes of Health of the USA under Grant No. R21/R33 GM078601, USDA NIFA’s Evans-Allen Grant (Project No.: MOX-Zheng) under Grant No. 0223248, and the International Exchange and Cooperation Office of Nanjing Medical University of China.

Electronic supplementary material

Below is the link to the electronic supplementary material.

(PDF 86.2 kb)

About this article

Cite this article

Zhang, C., Zheng, G., Xu, SF. et al. Computational Challenges in Characterization of Bacteria and Bacteria-Host Interactions Based on Genomic Data. J. Comput. Sci. Technol. 27, 225–239 (2012). https://doi.org/10.1007/s11390-012-1219-y

  • DOI: https://doi.org/10.1007/s11390-012-1219-y
