EVE: Cloud-Based Annotation of Human Genetic Variants

  • Conference paper
Applications of Evolutionary Computation (EvoApplications 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10199)

Abstract

Annotation of human genetic variants enables genotype-phenotype association studies at the gene, pathway, and tissue level. Annotation results are difficult to reproduce across study sites because software versions shift and the sites lack a unified hardware interface. Cloud computing offers a promising solution by integrating hardware and software into reproducible virtual appliances that can be used on demand and shared across institutions. We developed ENSEMBL VEP on EC2 (EVE), a cloud-based virtual appliance for annotation of human genetic variants built around the ENSEMBL Variant Effect Predictor. We integrated virtual hardware infrastructure, open-source software, and publicly available genomic datasets to annotate genetic variants in the context of genes/transcripts, Gene Ontology pathways, tissue-specific expression from the Gene Expression Atlas, miRNA annotations, minor allele frequencies from the 1000 Genomes Project and the Exome Aggregation Consortium, and deleteriousness scores from Combined Annotation Dependent Depletion. We demonstrate the utility of EVE by annotating the genetic variants in a case-control study of glaucoma. Cloud computing can reduce the difficulty of replicating complex software pipelines, such as annotation pipelines, across study sites. We provide a publicly available CloudFormation template of the EVE virtual appliance that automatically provisions and deploys a parameterized, preconfigured hardware/software stack ready for annotation of human genetic variants (github.com/epistasislab/EVE). This approach offers increased reproducibility in human genetic studies by providing a unified appliance to researchers across the world.
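
As a rough illustration of what deploying a parameterized CloudFormation template can look like, the sketch below builds the request for the `create_stack` call of the boto3 CloudFormation client. It is a minimal sketch and not the authors' procedure: the stack name, the template file name `eve.template`, and the parameter names `InstanceType` and `KeyName` are hypothetical placeholders, and the parameters the EVE template actually accepts should be read from the repository.

```python
from typing import Any, Dict


def build_stack_request(stack_name: str, template_body: str,
                        parameters: Dict[str, str]) -> Dict[str, Any]:
    """Build keyword arguments for CloudFormation's create_stack call.

    The parameter names passed in are assumptions for illustration; they
    are not taken from the EVE template itself.
    """
    if not stack_name:
        raise ValueError("stack_name must be non-empty")
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        # CloudFormation expects a list of key/value records; sorting the
        # keys keeps the request deterministic.
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in sorted(parameters.items())
        ],
    }


def launch(request: Dict[str, Any]) -> str:
    """Submit the request to AWS and return the new stack's ID.

    Requires the third-party boto3 package and configured AWS credentials.
    """
    import boto3  # imported here so the builder above works without boto3

    client = boto3.client("cloudformation")
    return client.create_stack(**request)["StackId"]


if __name__ == "__main__":
    # Hypothetical local copy of the template and hypothetical parameters.
    with open("eve.template") as handle:
        body = handle.read()
    request = build_stack_request(
        "eve-demo", body, {"InstanceType": "m4.large", "KeyName": "my-key"}
    )
    print(launch(request))
```

Keeping the request builder separate from the AWS call means the parameter set for a given analysis can be recorded and compared across study sites before any resources are provisioned.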

Acknowledgements

This work is supported by an Amazon Web Services Cloud Credits for Research award to BSC and NIH AI116794 to JHM.

Corresponding author

Correspondence to Brian S. Cole.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Cole, B.S., Moore, J.H. (2017). EVE: Cloud-Based Annotation of Human Genetic Variants. In: Squillero, G., Sim, K. (eds) Applications of Evolutionary Computation. EvoApplications 2017. Lecture Notes in Computer Science, vol 10199. Springer, Cham. https://doi.org/10.1007/978-3-319-55849-3_6

  • DOI: https://doi.org/10.1007/978-3-319-55849-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-55848-6

  • Online ISBN: 978-3-319-55849-3

  • eBook Packages: Computer Science, Computer Science (R0)
