Qualitative Comparison of Selected Indel Detection Methods for RNA-Seq Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 11465)

Abstract

RNA sequencing (RNA-Seq) provides both gene expression and sequence information, which can be exploited jointly to explore cell processes in general and diseases caused by genomic variants in particular. However, identifying insertions and deletions (indels) from RNA-Seq data still poses a challenge, even though indels play a significant role in, for instance, the development, detection, and treatment of cancer. In this paper, we present a qualitative comparison of selected methods for indel detection from RNA-Seq data. More specifically, we benchmarked two promising aligners and two filter methods on simulated as well as real RNA-Seq data. We conclude that in cases where reliable detection of indels is crucial, e.g., in a clinical setting, our pipeline setup is superior to other state-of-the-art approaches.

Notes

  1. http://bioinf.itmat.upenn.edu/BEERS/bp1/datasets.php

  2. https://github.com/khayer/aligner_benchmark

  3. https://github.com/tamslo/koala/tree/master/scripts/count_insertions_and_deletions

Acknowledgement

Parts of this work were generously supported by a grant of the German Federal Ministry of Education and Research (031A427B).

Author information

Corresponding author

Correspondence to Milena Kraus.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Slosarek, T., Kraus, M., Schapranow, MP., Boettinger, E. (2019). Qualitative Comparison of Selected Indel Detection Methods for RNA-Seq Data. In: Rojas, I., Valenzuela, O., Rojas, F., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2019. Lecture Notes in Computer Science, vol 11465. Springer, Cham. https://doi.org/10.1007/978-3-030-17938-0_16

  • DOI: https://doi.org/10.1007/978-3-030-17938-0_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-17937-3

  • Online ISBN: 978-3-030-17938-0

  • eBook Packages: Computer Science, Computer Science (R0)
