Differentially Private Histogram and Synthetic Data Publication

Abstract

Differential privacy has recently emerged as one of the strongest privacy guarantees because it makes few assumptions about an attacker's background or external knowledge. Differentially private data analysis and publishing have received considerable attention in biomedical communities as promising approaches for sharing medical and health data while preserving the privacy of the individuals represented in the data records. In this chapter, we provide a broad survey of recent work on differentially private histogram and synthetic data publishing. We categorize the most recent and emerging techniques in this field along two major aspects: (a) data types (e.g., relational data, transaction data, and dynamic stream data), and (b) parametric and non-parametric techniques. We also present some challenges and future research directions for releasing differentially private histograms and synthetic data in the health and medical domain.

Notes

  1. |D| denotes the data cardinality, i.e., the number of tuples in database D.

  2. Data cube and histogram refer to the same object in this chapter.

  3. In linguistics, a copula is a word used to link the subject of a sentence with a predicate, such as the word “is” in the sentence “The sky is blue”. In probability and statistics, a copula is a multivariate probability distribution for which the marginal probability distribution of each variable is uniform on the interval [0, 1].

  4. Another definition is the removal or addition of a single data record.
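As a rough illustration of the statistical definition in note 3, the sketch below samples from a bivariate Gaussian copula and checks that each coordinate is approximately uniform while the two coordinates remain dependent. It is not taken from the chapter: the function name `sample_gaussian_copula`, the correlation value 0.8, and the sample size are assumptions chosen for illustration, and no privacy mechanism is applied.

```python
import random
from statistics import NormalDist, mean


def sample_gaussian_copula(n, rho, seed=0):
    """Draw n points (u, v) from a bivariate Gaussian copula.

    Hypothetical helper for illustration only: rho is the correlation of
    the underlying standard normal pair; no noise is added for privacy.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()  # standard normal, mean 0 and sigma 1
    points = []
    for _ in range(n):
        # Build a correlated pair of standard normal variables.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        # Applying the normal CDF maps each marginal to uniform on [0, 1].
        points.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return points


samples = sample_gaussian_copula(20000, rho=0.8)
u_values = [u for u, _ in samples]
v_values = [v for _, v in samples]

# Each marginal should look uniform: mean near 0.5, about 25% below 0.25.
u_mean = mean(u_values)
v_mean = mean(v_values)
u_quarter = sum(u < 0.25 for u in u_values) / len(u_values)

# The dependence between the coordinates survives the transformation.
covariance = mean(u * v for u, v in samples) - u_mean * v_mean

print(f"mean(u)={u_mean:.3f} mean(v)={v_mean:.3f}")
print(f"P(u<0.25)={u_quarter:.3f} cov(u,v)={covariance:.3f}")
```

The point of the construction is that the copula carries only the dependence structure: the marginals are uniform by design, so any desired marginal distributions can be attached afterwards through their inverse CDFs.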

Author information

Corresponding author

Correspondence to Haoran Li.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Li, H., Xiong, L., Jiang, X. (2015). Differentially Private Histogram and Synthetic Data Publication. In: Gkoulalas-Divanis, A., Loukides, G. (eds) Medical Data Privacy Handbook. Springer, Cham. https://doi.org/10.1007/978-3-319-23633-9_3

  • DOI: https://doi.org/10.1007/978-3-319-23633-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23632-2

  • Online ISBN: 978-3-319-23633-9

  • eBook Packages: Computer Science, Computer Science (R0)
