Privacy Through Data Recolouring

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12121)

Abstract

Current anonymization techniques for statistical databases exhibit significant limitations related to the utility-privacy trade-off, the introduction of artefacts, and vulnerability to correlation. We propose an anonymization technique based on the whitening/recolouring procedure, which treats the database as an instance of a random population and applies statistical signal processing methods to it. In response to a query, the technique estimates the covariance matrix of the true data and builds a linear transformation of the data, producing an output that has the same statistical characteristics as the true data up to the second order but is not directly linked to single records. The technique is applied to a real database containing the location data of taxi trips in New York. We show that the technique reduces the number of artefacts introduced by noise addition while preserving the first- and second-order statistical features of the true data (hence maintaining the utility of the query output).
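The whitening/recolouring idea summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, whose details the abstract does not give. It assumes a Cholesky factorisation for both the whitening and the recolouring step (other matrix square roots would serve equally well), Gaussian noise as the starting material, at least two numeric columns, and more rows than columns. The function name `recolour` and the demo data are hypothetical; the demo uses synthetic correlated points, not the New York taxi data.

```python
import numpy as np


def recolour(data, rng):
    """Return synthetic rows whose sample mean and covariance match `data`.

    Assumes `data` has shape (n, d) with d >= 2 and n > d, so that both
    sample covariance matrices are positive definite.
    """
    n, d = data.shape
    mean = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)  # d x d estimate from the true data

    # Start from noise that has no link to any individual record.
    noise = rng.standard_normal((n, d))
    noise -= noise.mean(axis=0)

    # Whitening: remove the noise's own sample covariance so that the
    # whitened rows have exactly the identity as sample covariance.
    l_noise = np.linalg.cholesky(np.cov(noise, rowvar=False))
    white = np.linalg.solve(l_noise, noise.T).T

    # Recolouring: impose the covariance and mean estimated from the data.
    l_data = np.linalg.cholesky(cov)
    return white @ l_data.T + mean


# Hypothetical stand-in for a query result: correlated 2-D coordinates.
rng = np.random.default_rng(0)
data = rng.multivariate_normal(
    mean=[40.75, -73.98],
    cov=[[0.010, 0.004], [0.004, 0.020]],
    size=500,
)
synthetic = recolour(data, rng)

print(np.cov(data, rowvar=False))
print(np.cov(synthetic, rowvar=False))
```

In this sketch the two printed covariance matrices agree up to floating-point error, while no output row is computed from any single input row; only the estimated mean and covariance of the data enter the transformation. Whether this construction offers the privacy properties claimed in the paper is not something the sketch demonstrates.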

Author information

Corresponding author

Correspondence to Maurizio Naldi.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

D’Acquisto, G., Mazzoccoli, A., Ciminelli, F., Naldi, M. (2020). Privacy Through Data Recolouring. In: Antunes, L., Naldi, M., Italiano, G., Rannenberg, K., Drogkaris, P. (eds) Privacy Technologies and Policy. APF 2020. Lecture Notes in Computer Science, vol 12121. Springer, Cham. https://doi.org/10.1007/978-3-030-55196-4_4

  • DOI: https://doi.org/10.1007/978-3-030-55196-4_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-55195-7

  • Online ISBN: 978-3-030-55196-4

  • eBook Packages: Computer Science, Computer Science (R0)
