DOI: 10.1145/2046707.2046785
research-article

Countering GATTACA: efficient and secure testing of fully-sequenced human genomes

Published: 17 October 2011

ABSTRACT

Recent advances in DNA sequencing technologies have put ubiquitous availability of fully sequenced human genomes within reach. It is no longer hard to imagine the day when everyone will have the means to obtain and store their own DNA sequence. Widespread and affordable availability of fully sequenced genomes immediately opens up important opportunities in a number of health-related fields. In particular, common genomic applications and tests performed in vitro today will soon be conducted computationally, using digitized genomes. New applications will be developed as genome-enabled medicine becomes increasingly preventive and personalized. However, this progress also prompts significant privacy challenges associated with potential loss, theft, or misuse of genomic data. In this paper, we begin to address genomic privacy by focusing on three important applications: Paternity Tests, Personalized Medicine, and Genetic Compatibility Tests. After carefully analyzing these applications and their privacy requirements, we propose a set of efficient techniques based on private set operations. This allows us to implement in silico, in a secure fashion, some operations that are currently performed via in vitro methods. Experimental results demonstrate that the proposed techniques are both feasible and practical today.
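To make the notion of a private set operation concrete, the following is a minimal toy sketch of a set intersection cardinality computation built on commutative exponentiation modulo a prime. It is not the protocol proposed in the paper: the modulus, the helper names (`hash_to_group`, `random_exponent`, `mask`, `intersection_cardinality`) and the marker strings are all illustrative assumptions, both parties are simulated inside one function, and the parameters are far too small for real use.

```python
import hashlib
import math
import secrets

# Toy modulus (the Mersenne prime 2^127 - 1). Assumption for illustration only;
# a real deployment would use a vetted group of cryptographic size.
P = 2**127 - 1


def hash_to_group(item: str) -> int:
    """Map an item to an element of {1, ..., P-1} via SHA-256."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return 1 + int.from_bytes(digest, "big") % (P - 1)


def random_exponent() -> int:
    """Pick a secret exponent coprime to P-1, so v -> v^e mod P is a bijection."""
    while True:
        e = secrets.randbelow(P - 3) + 2  # uniform in [2, P-2]
        if math.gcd(e, P - 1) == 1:
            return e


def mask(values, exponent):
    """Raise every value to the given secret exponent modulo P."""
    return [pow(v, exponent, P) for v in values]


def intersection_cardinality(client_items, server_items) -> int:
    """Simulate both parties and return only the size of the intersection."""
    a = random_exponent()  # client secret
    b = random_exponent()  # server secret

    # Client -> server: H(x)^a for each client item.
    client_masked = mask([hash_to_group(x) for x in set(client_items)], a)

    # Server -> client: (H(x)^a)^b, shuffled so positions reveal nothing,
    # together with H(y)^b for each server item.
    client_double = mask(client_masked, b)
    secrets.SystemRandom().shuffle(client_double)
    server_masked = mask([hash_to_group(y) for y in set(server_items)], b)

    # Client: raise the server's values to a and count matching double masks,
    # since (H(x)^a)^b == (H(y)^b)^a exactly when H(x) == H(y).
    server_double = set(mask(server_masked, a))
    return sum(1 for v in client_double if v in server_double)


if __name__ == "__main__":
    # Hypothetical marker labels, not real genomic data.
    alice = ["marker1:A", "marker2:G", "marker3:T"]
    bob = ["marker2:G", "marker3:T", "marker4:C"]
    print(intersection_cardinality(alice, bob))
```

Because each party only ever sees the other's items under a secret exponent, the comparison reveals how many items match rather than which ones, which is the kind of property a privacy-preserving genomic test would rely on.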


Published in

CCS '11: Proceedings of the 18th ACM conference on Computer and communications security
October 2011
742 pages
ISBN: 9781450309486
DOI: 10.1145/2046707

Copyright © 2011 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

CCS '11 paper acceptance rate: 60 of 429 submissions, 14%. Overall acceptance rate: 1,261 of 6,999 submissions, 18%.
