A Cloud-Native NGS Data Processing and Annotation Platform

  • Conference paper
Heterogeneous Data Management, Polystores, and Analytics for Healthcare (DMAH 2021, Poly 2021)

Abstract

Low-cost and widely available Next-Generation Sequencing (NGS) is revolutionizing clinical practice, paving the way for the realization of precision medicine. Applying NGS to clinical practice requires establishing a complex loop involving sample collection and sequencing, computational processing of the NGS outputs to identify variants, and the interpretation of the variants to establish their significance for the condition being treated. The computational tools that perform variant calling have been extensively used in bioinformatics, but there are few attempts to integrate them in a comprehensive, production-grade, Cloud-native infrastructure able to scale to national levels. Furthermore, there are no established interfaces for closing the loop between NGS machines, computational infrastructure, and variant interpretation experts.

We present here the platform developed for the Greek National Precision Medicine Network for Oncology. The platform integrates bioinformatics tools and their orchestration, makes provisions for both experimental and clinical usage of variant calling pipelines, provides programmatic interfaces for integration with NGS machines and for analytics, and provides user interfaces for supporting variant interpretation. We also present benchmarking results and discuss how these results confirm the soundness of our architectural and implementation choices.

The work described here has received funding from the Greek General Secretariat for Research and Innovation in the context of the Hellenic Network of Precision Medicine on Cancer. See also https://oncopmnet.gr for more details.

Notes

  1. Cf. https://www.elixir-europe.org.

  2. Cf. https://www.dnanexus.com.

  3. Cf. https://basespace.illumina.com.

  4. Cf. https://cromwell.readthedocs.io.

  5. Cf. https://kubernetes.io.

  6. https://github.com/ga4gh/task-execution-schemas.

  7. Cf. https://github.com/elixir-cloud-aai/TESK.

  8. Cf. https://docs.gitlab.com/charts.

  9. An exhaustive list of annotation databases used with VEP’s default configuration can be found here: https://www.ensembl.org/info/docs/tools/vep/script/VEP_script_documentation.pdf.

  10. This plugin retrieves LOVD variation data from http://www.lovd.nl.

  11. https://www.metabase.com.

  12. Cf. https://www.ohdsi.org/data-standardization.

Corresponding author

Correspondence to Stasinos Konstantopoulos.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Mouchakis, G. et al. (2021). A Cloud-Native NGS Data Processing and Annotation Platform. In: Rezig, E.K., et al. Heterogeneous Data Management, Polystores, and Analytics for Healthcare. DMAH 2021, Poly 2021. Lecture Notes in Computer Science, vol 12921. Springer, Cham. https://doi.org/10.1007/978-3-030-93663-1_10

  • DOI: https://doi.org/10.1007/978-3-030-93663-1_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93662-4

  • Online ISBN: 978-3-030-93663-1

  • eBook Packages: Computer Science, Computer Science (R0)
