Abstract
The content of the CRISPR locus has proven quite useful for studying the evolution of various bacteria. This locus has become famous because it enables simple and inexpensive genome editing. Bacteriologists are also used to studying it, through tools such as spoligotyping, in order to determine experimentally the lineage or even the sub-lineage of a given strain, and to deduce an optimal antibiotic cocktail. The problem is that analysing the content of this locus is very often delicate and difficult. We therefore propose in this paper a new representation of CRISPR loci that is biologically meaningful and allows a simplified and enriched study of their content. After explaining how to extract the locus from Whole Genome Sequencing data, we propose an embedding of the locus in a high-dimensional space, followed by a reduction to dimension 2 that makes the content interpretable. The method is applied to the Mycobacterium tuberculosis complex, and the advantages of the approach are discussed.
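The embed-then-reduce idea can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each locus is encoded as a spacer presence/absence string (as in a spoligotype), the toy 8-spacer strains are invented, and a plain principal component analysis is swapped in as the reduction to dimension 2, since the abstract does not name the technique actually used. The function names `embed` and `reduce_2d` are hypothetical.

```python
import math

def embed(pattern):
    """Map a presence/absence string such as '1101' to a 0/1 vector (assumed encoding)."""
    return [1.0 if c == "1" else 0.0 for c in pattern]

def _matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def _top_eigvec(cov, iters=200):
    """Leading eigenvector of a symmetric matrix by power iteration."""
    v = [float(i + 1) for i in range(len(cov))]  # fixed start, so the result is deterministic
    for _ in range(iters):
        w = _matvec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # no variance left in this matrix
            return [0.0] * len(cov)
        v = [x / norm for x in w]
    return v

def reduce_2d(vectors):
    """Project vectors onto their first two principal axes (PCA, an illustrative stand-in)."""
    n, d = len(vectors), len(vectors[0])
    means = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - means[j] for j in range(d)] for v in vectors]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    axes = []
    for _ in range(2):
        u = _top_eigvec(cov)
        lam = sum(a * b for a, b in zip(u, _matvec(cov, u)))
        axes.append(u)
        # Deflation: remove the component just found before looking for the next one.
        cov = [[cov[i][j] - lam * u[i] * u[j] for j in range(d)] for i in range(d)]
    return [[sum(a * b for a, b in zip(r, u)) for u in axes] for r in centered]

# Invented toy strains: two groups that differ by a block of four spacers.
strains = {
    "A1": "11111111",
    "A2": "11111110",
    "B1": "00001111",
    "B2": "00001110",
    "A1_dup": "11111111",
}
coords = dict(zip(strains, reduce_2d([embed(p) for p in strains.values()])))

def dist(a, b):
    return math.dist(coords[a], coords[b])

for name, (x, y) in coords.items():
    print(f"{name}: ({x:+.3f}, {y:+.3f})")
```

In this toy setting, strains with identical patterns land on the same point, and strains that differ by a single spacer sit closer together than strains that differ by a whole block, which is the kind of structure a two-dimensional map of CRISPR content is meant to expose.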
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Guyeux, C., Refrégier, G., Sola, C. (2022). Spolmap: An Enriched Visualization of CRISPR Diversity. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2022. Lecture Notes in Computer Science, vol 13347. Springer, Cham. https://doi.org/10.1007/978-3-031-07802-6_25
DOI: https://doi.org/10.1007/978-3-031-07802-6_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07801-9
Online ISBN: 978-3-031-07802-6
eBook Packages: Computer Science (R0)