Anonymizing bipartite graph data using safe groupings

  • Special Issue Paper
The VLDB Journal

Abstract

Private data often come in the form of associations between entities, such as customers and products bought from a pharmacy, which are naturally represented in the form of a large, sparse bipartite graph. As with tabular data, it is desirable to be able to publish anonymized versions of such data, to allow others to perform ad hoc analysis of aggregate graph properties. However, existing tabular anonymization techniques do not give useful or meaningful results when applied to graphs: small changes or masking of the edge structure can radically change aggregate graph properties. We introduce a new family of anonymizations for bipartite graph data, called (k, ℓ)-groupings. These groupings preserve the underlying graph structure perfectly, and instead anonymize the mapping from entities to nodes of the graph. We identify a class of “safe” (k, ℓ)-groupings that have provable guarantees to resist a variety of attacks, and show how to find such safe groupings. We perform experiments on real bipartite graph data to study the utility of the anonymized version, and the impact of publishing alternate groupings of the same graph data. Our experiments demonstrate that (k, ℓ)-groupings offer strong tradeoffs between privacy and utility.
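The idea of keeping the edge structure intact while hiding which entity sits at which node can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes that a (k, ℓ)-grouping partitions one side of the graph into groups of at least k entities and the other side into groups of at least ℓ, and that the published data consist of the edges over opaque node identifiers plus, for each group, the set of entities and the set of node identifiers it covers. The names `chunk`, `side_groups`, `publish_grouping` and `degree_sequence` are hypothetical, and the groups are formed at random, so the sketch does not check the "safe" condition the abstract refers to.

```python
import random
from collections import Counter


def chunk(items, size):
    """Split items into consecutive groups holding at least `size` members."""
    if size < 1 or len(items) < size:
        raise ValueError("need size >= 1 and at least `size` items")
    groups = [items[i:i + size] for i in range(0, len(items), size)]
    if len(groups[-1]) < size:
        # Fold an undersized tail into the previous group.
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def side_groups(entities, prefix, size, rng):
    """Assign opaque node ids; return (entity -> id map, published groups)."""
    order = sorted(entities)
    rng.shuffle(order)
    node_id = {e: f"{prefix}{i}" for i, e in enumerate(order)}
    published = [
        # Entities and node ids are sorted independently, so the published
        # group does not reveal which entity corresponds to which node.
        {"entities": sorted(g), "nodes": sorted(node_id[e] for e in g)}
        for g in chunk(order, size)
    ]
    return node_id, published


def publish_grouping(edges, k, ell, seed=0):
    """Hypothetical sketch: random (not necessarily safe) (k, ell)-grouping."""
    rng = random.Random(seed)
    left_id, left_groups = side_groups({v for v, _ in edges}, "v", k, rng)
    right_id, right_groups = side_groups({w for _, w in edges}, "w", ell, rng)
    # The edge structure is kept exactly; only the labels are replaced.
    anon_edges = sorted((left_id[v], right_id[w]) for v, w in edges)
    return anon_edges, left_groups, right_groups


def degree_sequence(edges, side):
    """Sorted degrees of the left (side=0) or right (side=1) nodes."""
    return sorted(Counter(e[side] for e in edges).values())


# Toy customer/product data, invented for illustration only.
purchases = [("alice", "insulin"), ("bob", "aspirin"), ("carol", "aspirin"),
             ("dave", "statin"), ("erin", "insulin")]
anon_edges, customer_groups, product_groups = publish_grouping(purchases, k=2, ell=1)
print(anon_edges)
print(customer_groups)
```

Because only the labels change, aggregate properties such as the degree sequence are identical before and after; an analyst sees the exact graph but can narrow each node down only to its group. With `ell=1` the product side is effectively unmasked, a setting that might suit cases where only the customers need protection.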

Author information

Corresponding author

Correspondence to Graham Cormode.

Additional information

T. Yu and Q. Zhang were partially sponsored by the NSF through grants IIS-0430166 and CNS-0747247.

About this article

Cite this article

Cormode, G., Srivastava, D., Yu, T. et al. Anonymizing bipartite graph data using safe groupings. The VLDB Journal 19, 115–139 (2010). https://doi.org/10.1007/s00778-009-0167-9

  • DOI: https://doi.org/10.1007/s00778-009-0167-9
