Distributed Privacy Preserving Data Collection

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2011)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6587))

Abstract

We study the distributed privacy preserving data collection problem: an untrusted data collector (e.g., a medical research institute) wishes to collect data (e.g., medical records) from a group of respondents (e.g., patients). Each respondent owns a multi-attribute record that contains both non-sensitive information (e.g., quasi-identifiers) and sensitive information (e.g., a particular disease), and submits it to the data collector. Let T be the table formed by all the respondents' records. We say that the data collection process is privacy preserving if it allows the data collector to obtain a k-anonymized or l-diversified version of T without revealing the original records to the adversary.
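
As a rough illustration of the two privacy targets named above, the following sketch checks whether an already generalized table is k-anonymous and l-diverse. It is not taken from the paper: the attribute names, the sample rows, and the use of the simple "distinct values" reading of l-diversity are all assumptions made for the example.

```python
from collections import defaultdict

def group_by_qi(table, qi_attrs):
    """Group rows into equivalence classes by their quasi-identifier values."""
    groups = defaultdict(list)
    for row in table:
        key = tuple(row[a] for a in qi_attrs)
        groups[key].append(row)
    return groups

def is_k_anonymous(table, qi_attrs, k):
    """True if every equivalence class contains at least k rows."""
    return all(len(rows) >= k for rows in group_by_qi(table, qi_attrs).values())

def is_l_diverse(table, qi_attrs, sensitive_attr, l):
    """True if every equivalence class has at least l distinct sensitive values.

    This is the simple 'distinct l-diversity' variant, chosen here for brevity.
    """
    return all(
        len({row[sensitive_attr] for row in rows}) >= l
        for rows in group_by_qi(table, qi_attrs).values()
    )

# Hypothetical generalized table: age and zip are quasi-identifiers,
# disease is the sensitive attribute.
QI = ["age", "zip"]
table = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "asthma"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
]

print(is_k_anonymous(table, QI, 2))           # every class has 2 rows
print(is_l_diverse(table, QI, "disease", 2))  # second class has one disease only
```

In this sample the table satisfies 2-anonymity but not 2-diversity, because the second equivalence class contains a single sensitive value.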

We propose a distributed data collection protocol that outputs an anonymized table by generalizing quasi-identifier attributes. The protocol employs cryptographic techniques such as homomorphic encryption, private information retrieval and secure multiparty computation to meet the privacy goal during data collection. To remain practical and efficient, the protocol is designed to leak only limited, non-critical information. Experiments show that the utility of the anonymized table derived by our protocol is on par with the utility achieved by traditional anonymization techniques.
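
To make "generalization of quasi-identifier attributes" concrete, here is a minimal sketch that coarsens ages into ranges and masks trailing ZIP digits. It runs in the clear on one machine and only shows what a generalized output table can look like; it does not reproduce the paper's distributed, cryptographically protected protocol. The record fields, the 10-year bucket width, and the three-digit ZIP prefix are hypothetical choices.

```python
from collections import Counter

def generalize_age(age, width=10):
    """Replace an exact age with the range of the given width that contains it."""
    lo = (age // width) * width
    return f"{lo}-{lo + width - 1}"

def generalize_zip(zip_code, keep=3):
    """Keep the first `keep` characters of a ZIP code and mask the rest."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

def generalize(records):
    """Generalize the quasi-identifiers; the sensitive value is left unchanged."""
    return [
        {
            "age": generalize_age(r["age"]),
            "zip": generalize_zip(r["zip"]),
            "disease": r["disease"],
        }
        for r in records
    ]

# Hypothetical respondent records (not data from the paper).
records = [
    {"age": 23, "zip": "13053", "disease": "flu"},
    {"age": 27, "zip": "13068", "disease": "asthma"},
    {"age": 34, "zip": "14850", "disease": "flu"},
    {"age": 38, "zip": "14853", "disease": "diabetes"},
]

anonymized = generalize(records)

# Size of each equivalence class after generalization.
class_sizes = Counter((r["age"], r["zip"]) for r in anonymized)
smallest_class = min(class_sizes.values())
```

With these sample records, every combination of generalized age and ZIP is shared by two rows, so the output is 2-anonymous. Coarser buckets raise k at the cost of utility, which is the trade-off the experiments in the abstract refer to.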

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xue, M., Papadimitriou, P., Raïssi, C., Kalnis, P., Pung, H.K. (2011). Distributed Privacy Preserving Data Collection. In: Yu, J.X., Kim, M.H., Unland, R. (eds) Database Systems for Advanced Applications. DASFAA 2011. Lecture Notes in Computer Science, vol 6587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20149-3_9

  • DOI: https://doi.org/10.1007/978-3-642-20149-3_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20148-6

  • Online ISBN: 978-3-642-20149-3

  • eBook Packages: Computer Science, Computer Science (R0)
