GcnSV: A Method Based on Deep Learning of Calling Structural Variations from the Third-Generation Sequencing Data

  • Conference paper
Computer Science and Education (ICCSE 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1813)

Abstract

Third-generation sequencing technology produces large volumes of long-read data for calling structural variations (SVs). However, existing calling tools for long-read data achieve high precision but low recall. To address this problem, this paper proposes a new method called GcnSV. First, GcnSV maps all reads in the genome sequencing data into corresponding graphs, which serve as the input to a graph neural network. Next, it trains the graph neural network on these graphs to learn the characteristics of the variations themselves and of their upstream and downstream regions. Finally, a clustering algorithm is designed to obtain the final calling results. We evaluate GcnSV and other calling tools on simulated and real data. The experimental results show that GcnSV achieves higher recall and F1-score across different coverage depths and variant lengths.
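The evaluation measures named in the abstract (precision, recall, and F1-score) follow their standard definitions, which can be sketched as below. This is a minimal illustration, not the authors' evaluation code: the function name `precision_recall_f1` and the example counts are hypothetical, and the sketch assumes that predicted calls have already been matched against a truth set to give true-positive, false-positive, and false-negative counts.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from matched SV call counts.

    tp: predicted calls that match a true variant
    fp: predicted calls with no matching true variant
    fn: true variants that no predicted call matched
    """
    # Precision: fraction of predicted calls that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of true variants that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for a caller with high precision but low recall.
p, r, f1 = precision_recall_f1(tp=60, fp=5, fn=40)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, a caller with high precision but low recall is still penalized, so gains in recall translate directly into a higher F1-score.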

Author information

Corresponding author

Correspondence to Jingyang Gao.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Huang, M., Wang, H., Gao, J. (2023). GcnSV: A Method Based on Deep Learning of Calling Structural Variations from the Third-Generation Sequencing Data. In: Hong, W., Weng, Y. (eds) Computer Science and Education. ICCSE 2022. Communications in Computer and Information Science, vol 1813. Springer, Singapore. https://doi.org/10.1007/978-981-99-2449-3_35

  • DOI: https://doi.org/10.1007/978-981-99-2449-3_35

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-2448-6

  • Online ISBN: 978-981-99-2449-3

  • eBook Packages: Computer Science, Computer Science (R0)
