DOI: 10.1145/2147805.2147827
research-article

Knowledge-based Bayesian network for the classification of Mycobacterium tuberculosis complex sublineages

Published: 01 August 2011

Abstract

We develop a novel knowledge-based Bayesian network (KBBN) that models our knowledge of the Mycobacterium tuberculosis complex (MTBC), obtained from expert-defined rules and large DNA fingerprint databases, to classify strains of MTBC into fifty-one genetic sublineages. The model represents strains using two high-throughput biomarkers, spacer oligonucleotide types (spoligotypes) and mycobacterial interspersed repetitive unit (MIRU) types, since these are routinely gathered from MTBC isolates of tuberculosis (TB) patients. KBBN provides an elegant and simple way to incorporate existing, widely accepted visual rules for MTBC sublineages into a classifier designed to capture known properties of the MTBC biomarkers. Unlike prior knowledge-based SVM approaches, which require rules expressed as polyhedral sets, KBBN directly incorporates the rules without any modification. Computational results show that KBBN achieves much higher accuracy than both purely rule-based methods and Bayesian networks trained on biomarker data alone.
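
The general idea of feeding expert rules into a probabilistic classifier without rewriting them can be illustrated with a minimal sketch. It is not the KBBN described in the abstract, whose network structure is not given here. It assumes a naive Bayes classifier (the simplest Bayesian network), spoligotypes written as 43-character presence/absence strings, and two hypothetical rules that serve only as placeholders for expert-defined visual rules. MIRU types are left out for brevity, and the class labels "A" and "B" and all training strings are invented toy data.

```python
import math

# Hypothetical expert rules (illustrative assumptions, not the paper's rules).
# Each takes a 43-character spoligotype string ('1' = spacer present,
# '0' = spacer absent) and returns True or False.
def rule_spacers_1_34_absent(spol):
    return "1" not in spol[:34]

def rule_spacers_33_36_absent(spol):
    return spol[32:36] == "0000"

RULES = [rule_spacers_1_34_absent, rule_spacers_33_36_absent]

def features(spol):
    """Binary feature vector: 43 spacer bits followed by one bit per rule."""
    bits = [int(c) for c in spol]
    return bits + [int(rule(spol)) for rule in RULES]

class RuleAugmentedNaiveBayes:
    """Bernoulli naive Bayes over spacer bits plus rule outcomes."""

    def fit(self, spoligotypes, labels):
        X = [features(s) for s in spoligotypes]
        self.classes = sorted(set(labels))
        n_feat = len(X[0])
        self.log_prior, self.log_p1, self.log_p0 = {}, {}, {}
        for c in self.classes:
            rows = [x for x, y in zip(X, labels) if y == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Laplace-smoothed estimate of P(feature = 1 | class)
            p1 = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                  for j in range(n_feat)]
            self.log_p1[c] = [math.log(p) for p in p1]
            self.log_p0[c] = [math.log(1 - p) for p in p1]
        return self

    def predict(self, spol):
        x = features(spol)

        def score(c):
            # Log prior plus the log-likelihood of every observed feature
            return self.log_prior[c] + sum(
                self.log_p1[c][j] if v else self.log_p0[c][j]
                for j, v in enumerate(x))

        return max(self.classes, key=score)

# Toy training data (invented for illustration only).
train = [
    ("0" * 34 + "1" * 9, "A"),
    ("0" * 34 + "110111111", "A"),
    ("1" * 32 + "0000" + "1" * 7, "B"),
    ("1" * 20 + "0" + "1" * 11 + "0000" + "1" * 7, "B"),
]
model = RuleAugmentedNaiveBayes().fit([s for s, _ in train],
                                      [y for _, y in train])
print(model.predict("0" * 34 + "111111011"))
```

In this sketch the rules are ordinary Python predicates evaluated on the raw biomarker, so they enter the model unchanged as extra evidence nodes and the training data determines how much weight each one carries.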

Cited By

  • (2012) Web tools for molecular epidemiology of tuberculosis. Infection, Genetics and Evolution 12:4 (767-781). DOI: 10.1016/j.meegid.2011.08.019. Online publication date: Jun-2012

      Published In

      BCB '11: Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
      August 2011
      688 pages
      ISBN:9781450307963
      DOI:10.1145/2147805
      General Chairs: Robert Grossman, Andrey Rzhetsky
      Program Chairs: Sun Kim, Wei Wang
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Bayesian networks
      2. multiple interspersed repetitive units
      3. spoligotype
      4. sublineages
      5. tuberculosis

      Conference

      BCB '11

      Acceptance Rates

      Overall Acceptance Rate 254 of 885 submissions, 29%
