Addressing the Threats of Inference Attacks on Traits and Genotypes from Individual Genomic Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10330)

Abstract

The decreasing cost of DNA sequencing has made genetics-oriented services widely available, which in turn has put a growing number of individuals' genomes and traits online. These data are notoriously sensitive, and their exposure can lead to the leakage of further sensitive information. In this paper, we formulate the trait and genotype inference problem and develop an efficient inference method based on factor graphs and belief propagation. Using trait/SNP associations available from the GWAS Catalog, an adversary who observes part of a victim's data can then infer that victim's likely traits and genotypes. To protect against such inference attacks, we detail privacy and utility metrics and then propose a genomic data-sanitization method that effectively trades off genomic data openness against privacy.
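
As a rough illustration of the kind of inference the abstract describes, the sketch below runs sum-product message passing (belief propagation) on a toy, tree-shaped factor graph with one binary trait and two SNPs. It is not the paper's method: the graph shape, the names `PRIOR_T`, `P_G_GIVEN_T`, `snp_a`, `snp_b`, and `infer`, and every probability are hypothetical and chosen only to make the example run.

```python
# Toy sum-product (belief propagation) on a tree-shaped factor graph.
# All numbers are made up for illustration; they are NOT GWAS estimates.

def normalize(values):
    """Scale a list of non-negative weights so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]

# Hypothetical prior over a binary trait T: [P(T=0), P(T=1)].
PRIOR_T = [0.7, 0.3]

# Hypothetical factors P(G | T) for two SNPs. Genotype G in {0, 1, 2}
# counts copies of the risk allele. Row index is T, column index is G.
P_G_GIVEN_T = {
    "snp_a": [[0.60, 0.30, 0.10], [0.20, 0.40, 0.40]],
    "snp_b": [[0.50, 0.40, 0.10], [0.30, 0.40, 0.30]],
}

def infer(observed):
    """Return (posterior over T, posteriors over each unobserved SNP).

    `observed` maps a SNP name to its observed genotype (0, 1, or 2).
    """
    # Messages from each SNP factor to the trait variable T.
    msgs_to_t = {}
    for snp, table in P_G_GIVEN_T.items():
        if snp in observed:
            g = observed[snp]
            msgs_to_t[snp] = [table[t][g] for t in (0, 1)]
        else:
            # Unobserved leaf: marginalize the genotype out.
            msgs_to_t[snp] = [sum(table[t]) for t in (0, 1)]

    # Belief at T: prior times all incoming messages.
    belief_t = list(PRIOR_T)
    for msg in msgs_to_t.values():
        belief_t = [b * m for b, m in zip(belief_t, msg)]
    post_t = normalize(belief_t)

    # Message from T to each unobserved SNP excludes that SNP's own message.
    post_g = {}
    for snp, table in P_G_GIVEN_T.items():
        if snp in observed:
            continue
        msg = list(PRIOR_T)
        for other, m in msgs_to_t.items():
            if other != snp:
                msg = [a * b for a, b in zip(msg, m)]
        post_g[snp] = normalize(
            [sum(msg[t] * table[t][g] for t in (0, 1)) for g in range(3)]
        )
    return post_t, post_g

if __name__ == "__main__":
    trait_posterior, genotype_posteriors = infer({"snp_a": 2})
    print("P(T | snp_a=2):", trait_posterior)
    print("P(snp_b | snp_a=2):", genotype_posteriors["snp_b"])
```

Because this toy graph is a tree, a single pass of messages gives exact posteriors. A graph built from many overlapping trait/SNP associations would generally contain loops and need iterative message passing.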

Acknowledgments

This work is partly supported by the National Science Foundation (NSF) of China under grants 61632010 and 61602129.

Author information

Corresponding author

Correspondence to Yingshu Li.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

He, Z., Li, Y., Li, J., Yu, J., Gao, H., Wang, J. (2017). Addressing the Threats of Inference Attacks on Traits and Genotypes from Individual Genomic Data. In: Cai, Z., Daescu, O., Li, M. (eds) Bioinformatics Research and Applications. ISBRA 2017. Lecture Notes in Computer Science, vol 10330. Springer, Cham. https://doi.org/10.1007/978-3-319-59575-7_20

  • DOI: https://doi.org/10.1007/978-3-319-59575-7_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59574-0

  • Online ISBN: 978-3-319-59575-7

  • eBook Packages: Computer Science, Computer Science (R0)
