ABSTRACT
Information networks, such as social media and email networks, often contain sensitive information, and releasing such network data can seriously jeopardize individual privacy. Network data therefore need to be sanitized before release. In this paper, we present a novel data sanitization solution that infers a network's structure in a differentially private manner. We observe that, by estimating the connection probabilities between vertices instead of considering the observed edges directly, the noise scale enforced by differential privacy can be greatly reduced. Our proposed method infers the network structure using a statistical hierarchical random graph (HRG) model. The guarantee of differential privacy is achieved by sampling possible HRG structures from the model space via Markov chain Monte Carlo (MCMC). We theoretically prove that the sensitivity of such inference is only O(log n), where n is the number of vertices in the network; this bound implies that less noise needs to be injected than in existing works. We experimentally evaluate our approach on four real-life network datasets and show that our solution effectively preserves essential structural properties such as the degree distribution, the shortest-path-length distribution, and influential nodes.
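The sampling idea in the abstract can be illustrated with a toy sketch. Everything below is an illustrative assumption rather than the paper's actual algorithm: the graph data, the simplistic leaf-swap proposal move, and the `sens` placeholder are invented for demonstration (the paper uses proper subtree moves and formally derives the O(log n) sensitivity bound). The sketch only shows the shape of the approach: score each candidate HRG dendrogram by its likelihood, and accept MCMC proposals with an exponential-mechanism-style probability exp(ε·ΔL/(2Δ)).

```python
import math
import random

# Toy undirected graph (illustrative data, not from the paper's experiments).
EDGES = {frozenset(e) for e in [(0, 1), (0, 2), (1, 2), (2, 3),
                                (3, 4), (3, 5), (4, 5)]}
N = 6

def leaf_set(t):
    """Vertices under a dendrogram node (leaves are ints, internal nodes are pairs)."""
    return {t} if isinstance(t, int) else leaf_set(t[0]) | leaf_set(t[1])

def loglik(tree, edges):
    """HRG log-likelihood: each internal node r contributes
    E_r*log(p_r) + (L_r*R_r - E_r)*log(1 - p_r), with p_r = E_r / (L_r*R_r),
    where E_r counts edges crossing between r's left and right subtrees."""
    if isinstance(tree, int):
        return 0.0
    L, R = leaf_set(tree[0]), leaf_set(tree[1])
    e = sum(1 for u in L for v in R if frozenset((u, v)) in edges)
    n = len(L) * len(R)
    h = e * math.log(e / n) + (n - e) * math.log(1 - e / n) if 0 < e < n else 0.0
    return h + loglik(tree[0], edges) + loglik(tree[1], edges)

def swap_leaves(tree):
    """Propose a neighbouring dendrogram by exchanging two leaf labels
    (a deliberate simplification of the subtree moves used by HRG samplers)."""
    paths = []
    def collect(t, p):
        if isinstance(t, int):
            paths.append(p)
        else:
            collect(t[0], p + (0,))
            collect(t[1], p + (1,))
    collect(tree, ())
    a, b = random.sample(paths, 2)
    def get(t, p):
        for i in p:
            t = t[i]
        return t
    va, vb = get(tree, a), get(tree, b)
    def put(t, p, v):
        if not p:
            return v
        return (put(t[0], p[1:], v), t[1]) if p[0] == 0 else (t[0], put(t[1], p[1:], v))
    return put(put(tree, a, vb), b, va)

def private_mcmc(edges, tree, eps=1.0, steps=2000, sens=math.log(N)):
    """Sample a dendrogram, accepting a proposal with probability
    min(1, exp(eps * (L_new - L_cur) / (2 * sens)))  -- the exponential
    mechanism's acceptance rule, with `sens` standing in for sensitivity."""
    cur = loglik(tree, edges)
    for _ in range(steps):
        cand = swap_leaves(tree)
        cl = loglik(cand, edges)
        if random.random() < math.exp(min(0.0, eps * (cl - cur) / (2.0 * sens))):
            tree, cur = cand, cl
    return tree, cur
```

Raising `eps` concentrates the chain on high-likelihood dendrograms (better utility); lowering it flattens the acceptance rule toward uniform sampling over structures (stronger privacy).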
Index Terms
- Differentially private network data release via structural inference