Abstract
Mobile element variants are an important class of structural variants, accounting for a quarter of all structural variants, and are closely related to genetic diseases and species diversity. However, few algorithms have been developed to detect mobile element variants from third-generation sequencing data. We propose ricME, an algorithm that combines sequence realignment and identity calculation to detect mobile element variants. ricME first performs an initial detection to obtain the positions of insertions and deletions and extracts the variant sequences; it then applies sequence realignment and identity calculation to obtain the transposon classes related to the variant sequences; finally, it applies a multi-level judgment rule based on the transposon classes and identities to detect mobile element variants accurately. Compared with rMETL, a representative long-read based mobile element variant detection algorithm, ricME improves the F1-score by 11.5% and 21.7% on simulated and real datasets, respectively.
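The identity calculation and judgment steps can be illustrated with a minimal sketch. It is not the ricME implementation: the abstract does not give the alignment method, the rule levels, or any thresholds, so the function names (`alignment_identity`, `classify_variant`), the scoring values, the cutoffs `high`, `low`, and `min_len`, and the toy sequences are all assumptions made for illustration. A plain global alignment stands in for realignment, and identity is taken as matching columns divided by alignment length.

```python
from typing import Dict, Optional, Tuple


def alignment_identity(a: str, b: str, match: int = 1,
                       mismatch: int = -1, gap: int = -1) -> float:
    """Identity of a global (Needleman-Wunsch, linear gap) alignment:
    number of matching columns divided by the alignment length."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        return 0.0
    # Fill the dynamic-programming score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting matches and columns.
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1
    return matches / columns


def classify_variant(variant_seq: str, consensus: Dict[str, str],
                     high: float = 0.9, low: float = 0.7,
                     min_len: int = 50) -> Tuple[Optional[str], float]:
    """Hypothetical two-level rule (thresholds are assumptions):
    level 1 accepts the best class if identity >= high;
    level 2 accepts it if identity >= low and the variant is long enough."""
    best_class, best_identity = None, 0.0
    for name, seq in consensus.items():
        ident = alignment_identity(variant_seq, seq)
        if ident > best_identity:
            best_class, best_identity = name, ident
    if best_identity >= high:
        return best_class, best_identity
    if best_identity >= low and len(variant_seq) >= min_len:
        return best_class, best_identity
    return None, best_identity


# Toy consensus sequences, not real transposon consensuses.
toy_consensus = {"classA": "GGCCGGGCGCGGTGGCTCA", "classB": "TTTTTTTTTT"}
label, ident = classify_variant("GGCCGGGCGCGGTGGCTCA", toy_consensus)
```

In practice the variant sequences would be realigned against a transposon library with a long-read aligner rather than a quadratic-time dynamic program; the sketch only shows how a class label and an identity value can feed a tiered decision.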
Acknowledgement
This work is partly supported by the National Natural Science Foundation of China under Grant No. 61962004 and Guangxi Postgraduate Innovation Plan under Grant No. A30700211008.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ma, H., Zhong, C., Sun, H., Chen, D., Lin, H. (2023). ricME: Long-Read Based Mobile Element Variant Detection Using Sequence Realignment and Identity Calculation. In: Guo, X., Mangul, S., Patterson, M., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2023. Lecture Notes in Computer Science, vol 14248. Springer, Singapore. https://doi.org/10.1007/978-981-99-7074-2_13
DOI: https://doi.org/10.1007/978-981-99-7074-2_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7073-5
Online ISBN: 978-981-99-7074-2
eBook Packages: Computer Science, Computer Science (R0)