ABSTRACT
Transposon mutagenesis experiments enable the identification of essential genes in bacteria. Deep sequencing of mutant libraries provides a large amount of high-resolution data on essentiality. Statistical methods developed to analyze these data have traditionally assumed that the probability of observing a transposon insertion is the same across the genome. This assumption, however, is inconsistent with the insertion frequencies observed in transposon mutant libraries of M. tuberculosis.
We propose a modified binomial model of essentiality that characterizes the insertion probability of individual genes while allowing the background insertion frequency to vary locally across different non-essential regions of the genome. Using the Metropolis-Hastings algorithm, we obtain samples of the posterior insertion probability for each gene and estimate the probability that each gene is essential. We compare our predictions to those of previous methods and show that, by taking local insertion frequencies into consideration, our method makes more conservative predictions that better match what is experimentally known about essential and non-essential genes.
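To make the sampling step concrete, the following is a minimal sketch of random-walk Metropolis-Hastings for a single gene's insertion probability under a plain binomial likelihood. It is not the modified model proposed here: the uniform Beta(1, 1) prior, the function names, the example counts, the `local_background` value, and the 25% cutoff are all assumptions chosen only for illustration.

```python
import math
import random


def log_posterior(theta, k, n, a=1.0, b=1.0):
    """Unnormalized log posterior of insertion probability theta.

    Binomial likelihood (k insertions among n candidate sites) with an
    assumed Beta(a, b) prior; the prior is an illustrative choice.
    """
    if not 0.0 < theta < 1.0:
        return float("-inf")
    return (k + a - 1.0) * math.log(theta) + (n - k + b - 1.0) * math.log(1.0 - theta)


def metropolis_hastings(k, n, n_samples=5000, burn_in=1000, step=0.1, seed=0):
    """Draw posterior samples of theta with a Gaussian random-walk proposal."""
    rng = random.Random(seed)
    theta = (k + 1.0) / (n + 2.0)  # start at the posterior mean under Beta(1, 1)
    current = log_posterior(theta, k, n)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, k, n)
        # The proposal is symmetric, so the acceptance ratio is the posterior ratio.
        log_ratio = candidate - current
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            theta, current = proposal, candidate
        if i >= burn_in:
            samples.append(theta)
    return samples


def prob_below(samples, threshold):
    """Fraction of posterior samples below a threshold."""
    return sum(s < threshold for s in samples) / len(samples)


# Hypothetical inputs: two genes with 20 candidate sites each, and an assumed
# local background insertion frequency for the surrounding region.
local_background = 0.5
cutoff = 0.25 * local_background  # illustrative cutoff, not from the paper

gene_a = metropolis_hastings(k=0, n=20)  # no insertions observed
gene_b = metropolis_hastings(k=9, n=20)  # insertions at about the background rate

print("gene A P(theta < cutoff):", prob_below(gene_a, cutoff))
print("gene B P(theta < cutoff):", prob_below(gene_b, cutoff))
```

Comparing each gene's posterior samples against a regional rather than genome-wide frequency is the part of this sketch that mirrors the idea of local variation. How the background is estimated and how essentiality is scored in the actual model are not specified in the abstract and are left open here.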
Index Terms
- Improving discrimination of essential genes by modeling local insertion frequencies in transposon mutagenesis data