Abstract
Clustered regularly interspaced short palindromic repeats (CRISPR) are structured regions in bacterial and archaeal genomes, which are part of an adaptive immune system against phages. Most of the automated tools that detect CRISPR loci rely on assembled genomes. However, many assemblers do not successfully handle repetitive regions. The first tool to work directly on raw sequence data is Crass, which requires that reads are long enough to contain two copies of the same repeat. We developed a method to identify CRISPR repeats from a raw sequence data of short reads. The algorithm is based on an observation differentiating CRISPR repeats from other types of repeats, and it involves a series of partial constructions of the overlap graph. A preliminary implementation of the algorithm shows good results and detects CRISPR repeats in cases where other tools fail to do so.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Sorek, R., Kunin, V., Hugenholtz, P.: CRISPR - a widespread system that provides acquired resistance against phages in bacteria and archaea. Nat. Rev. Microbiol. 6, 181–186 (2008)
Ishino, Y., Shinagawa, H., Makino, K., Amemura, M., Nakata, A.: Nucleotide sequence of the iap gene, responsible for alkaline phosphatase isozyme conversion in Escherichia coli, and identification of the gene product. J. Bacteriol. 169, 5429–5433 (1987)
Mojica, F.J., Diez-Villasenor, C., Garcia-Martinez, J., Soria, E.: Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J. Mol. Evol. 60, 174–182 (2005)
Horvath, P., Barrangou, R.: CRISPR-Cas, the immune system of bacteria and archaea. Science 327, 167–170 (2010)
Stern, A., Mick, E., Tirosh, I., Sagy, O., Sorek, R.: CRISPR targeting reveals a reservoir of common phages associated with the human gut microbiome. Genome Res. 22, 1985–1994 (2012)
Hu, W., et al.: RNA-directed gene editing specifically eradicates latent and prevents new HIV-1 infection. Proc. Natl. Acad Sci. USA 111(31), 11461–11466 (2014)
Edgar, R.C.: PILER-CR: fast and accurate identification of CRISPR repeats. BMC Bioinformatics 8, 18 (2007)
Bland, C., Ramsey, T.L., Sabree, F., Lowe, M., Brown, K., Kyrpides, N.C., Hugenholtz, P.: CRISPR Recognition Tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats. BMC Bioinformatics 8, 209 (2007)
Grissa, I., Vergnaud, G., Pourcel, C.: CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res. 35, W52–W57 (2007)
Skennerton, C.T., Imelfort, M., Tyson, G.W.: Crass: identification and reconstruction of CRISPR from unassembled metagenomic data. Nucleic Acids Res. 41, e105 (2012)
Myers, E.: Toward Simplifying and Accurately Formulating Fragment Assembly. Jornal of Computational Biology 2, 275–290 (1995)
Roy, R.S., Bhattacharya, D., Schliep, A.: Turtle: Identifying frequent k-mers with cache-efficient algorithms. Bioinformatics (2014). doi:10.1093/bioinformatics/btu132
CRISPRs web server. http://crispr.u-psud.fr/
Zerbino, D.R., Birney, E.: Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821–829 (2008)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ben-Bassat, I., Chor, B. (2015). CRISPR Detection from Short Reads Using Partial Overlap Graphs. In: Przytycka, T. (eds) Research in Computational Molecular Biology. RECOMB 2015. Lecture Notes in Computer Science(), vol 9029. Springer, Cham. https://doi.org/10.1007/978-3-319-16706-0_3
Download citation
DOI: https://doi.org/10.1007/978-3-319-16706-0_3
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16705-3
Online ISBN: 978-3-319-16706-0
eBook Packages: Computer ScienceComputer Science (R0)