skip to main content
10.1145/3133956.3134086acmconferencesArticle/Chapter ViewAbstractPublication PagesccsConference Proceedingsconference-collections
research-article

Generating Synthetic Decentralized Social Graphs with Local Differential Privacy

Published: 30 October 2017 Publication History

Abstract

A large amount of valuable information resides in decentralized social graphs, where no entity has access to the complete graph structure. Instead, each user maintains locally a limited view of the graph. For example, in a phone network, each user keeps a contact list locally in her phone, and does not have access to other users' contacts. The contact lists of all users form an implicit social graph that could be very useful to study the interaction patterns among different populations. However, due to privacy concerns, one could not simply collect the unfettered local views from users and reconstruct a decentralized social network.
In this paper, we investigate techniques to ensure local differential privacy of individuals while collecting structural information and generating representative synthetic social graphs. We show that existing local differential privacy and synthetic graph generation techniques are insufficient for preserving important graph properties, due to excessive noise injection, inability to retain important graph structure, or both. Motivated by this, we propose LDPGen, a novel multi-phase technique that incrementally clusters users based on their connections to different partitions of the whole population. Every time a user reports information, LDPGen carefully injects noise to ensure local differential privacy.We derive optimal parameters in this process to cluster structurally-similar users together. Once a good clustering of users is obtained, LDPGen adapts existing social graph generation models to construct a synthetic social graph.
We conduct comprehensive experiments over four real datasets to evaluate the quality of the obtained synthetic graphs, using a variety of metrics, including (i) important graph structural measures; (ii) quality of community discovery; and (iii) applicability in social recommendation. Our experiments show that the proposed technique produces high-quality synthetic graphs that well represent the original decentralized social graphs, and significantly outperform those from baseline approaches.

Supplemental Material

MP4 File

References

[1]
William Aiello, Fan Chung, and Linyuan Lu. 2000. A random graph model for massive graphs. In Proceedings of the thirty-second annual ACM symposium on Theory of computing. Acm, 171--180.
[2]
Raef Bassily and Adam D. Smith. 2015. Local, Private, Efficient Protocols for Succinct Histograms. In STOC. 127--135.
[3]
Thierry Bertin-Mahieux, Daniel P.W. Ellis, Brian Whitman, and Paul Lamere. 2011. The Million Song Dataset. In Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR 2011).
[4]
Jeremiah Blocki, Avrim Blum, Anupam Datta, and Or Sheffet. 2012. The johnsonlindenstrauss transform itself preserves differential privacy. In Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on. IEEE, 410--419.
[5]
Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. 2008. Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment 2008, 10 (2008), P10008.
[6]
Christian Braune, Christian Borgelt, and Rudolf Kruse. 2013. Behavioral clustering for point processes. In International Symposium on Intelligent Data Analysis. Springer, 127--137.
[7]
L da F Costa, Francisco A Rodrigues, Gonzalo Travieso, and Paulino Ribeiro Villas Boas. 2007. Characterization of complex networks: A survey of measurements. Advances in physics 56, 1 (2007), 167--242.
[8]
Wei-Yen Day, Ninghui Li, and Min Lyu. 2016. Publishing graph degree distribution with node differential privacy. In Proceedings of the 2016 International Conference on Management of Data. ACM, 123--138.
[9]
John C Duchi, Michael I Jordan, and Martin J Wainwright. 2013. Local privacy and statistical minimax rates. In FOCS. 429--438.
[10]
Cynthia Dwork. 2006. Differential privacy. In Automata, languages and programming. 1--12.
[11]
Cynthia Dwork and Jing Lei. 2009. Differential privacy and robust statistics. In Proceedings of the forty-first annual ACM symposium on Theory of computing. ACM, 371--380.
[12]
Cynthia Dwork, Moni Naor, Toniann Pitassi, Guy N Rothblum, and Sergey Yekhanin. 2010. Pan-Private Streaming Algorithms. In ICS. 66--80.
[13]
Cynthia Dwork, Aaron Roth, et al. 2014. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science 9, 3--4 (2014), 211--407.
[14]
Paul Erdös and Alfréd Rényi. 1959. On random graphs, I. Publicationes Mathematicae (Debrecen) 6 (1959), 290--297.
[15]
Úlfar Erlingsson et al. 2014. Rappor: Randomized aggregatable privacy-preserving ordinal response. In CCS. 1054--1067.
[16]
Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. 2014. Rappor: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC conference on computer and communications security. ACM, 1054--1067.
[17]
Giulia Fanti et al. 2016. Building a RAPPOR with the Unknown: PrivacyPreserving Learning of Associations and Data Dictionaries. PoPETS issue 3, 2016 (2016).
[18]
Neil Zhenqiang Gong, Wenchang Xu, Ling Huang, Prateek Mittal, Emil Stefanov, Vyas Sekar, and Dawn Song. 2012. Evolution of social-attribute networks: measurements, modeling, and implications using google+. In Proceedings of the 2012 ACM conference on Internet measurement conference. ACM, 131--144.
[19]
Roger Guimera, Marta Sales-Pardo, and Luís A Nunes Amaral. 2004. Modularity from fluctuations in random graphs and complex networks. Physical Review E 70, 2 (2004), 025101.
[20]
Justin Hsu, Sanjeev Khanna, and Aaron Roth. 2012. Distributed private heavy hitters. In Automata, Languages, and Programming. 461--472.
[21]
Mohsen Jamali and Martin Ester. 2010. A matrix factorization technique with trust propagation for recommendation in social networks. In Proceedings of the fourth ACM conference on Recommender systems. ACM, 135--142.
[22]
Kalervo Järvelin and Jaana Kekäläinen. 2002. Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems (TOIS) 20, 4 (2002), 422--446.
[23]
Zach Jorgensen, Ting Yu, and Graham Cormode. 2016. Publishing Attributed Social Graphs with Formal Privacy Guarantees. In SIGMOD. 107--122.
[24]
Vishesh Karwa and Aleksandra B Slavković. 2012. Differentially private graphical degree sequences and synthetic graphs. In Privacy in Statistical Databases. Springer, 273--285.
[25]
Shiva Prasad Kasiviswanathan et al. 2011. What can we learn privately? SICOMP 40, 3 (2011), 793--826.
[26]
Shiva Prasad Kasiviswanathan, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. 2013. Analyzing graphs with node differential privacy. In Theory of Cryptography. Springer, 457--476.
[27]
Dor Konforty, Yuval Adam, Daniel Estrada, and Lucius Gregory Meredith. 2015. Synereo: The Decentralized and Distributed Social Network. (2015).
[28]
Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutsos, and Zoubin Ghahramani. 2010. Kronecker graphs: An approach to modeling networks. Journal of Machine Learning Research 11, Feb (2010), 985--1042.
[29]
Jure Leskovec, Kevin J Lang, Anirban Dasgupta, and Michael W Mahoney. 2009. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics 6, 1 (2009), 29--123.
[30]
Jure Leskovec and Julian J Mcauley. 2012. Learning to discover social circles in ego networks. In Advances in neural information processing systems. 539--547.
[31]
Linyuan Lü and Tao Zhou. 2011. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications 390, 6 (2011), 1150--1170.
[32]
Wentian Lu and Gerome Miklau. 2014. Exponential Random Graph Estimation under Differential Privacy. In KDD. 921--930.
[33]
R Duncan Luce and Albert D Perry. 1949. A method of matrix analysis of group structure. Psychometrika 14, 2 (1949), 95--116.
[34]
Christopher D Manning, Prabhakar Raghavan, Hinrich Schütze, et al. 2008. Introduction to information retrieval. Vol. 1. Cambridge university press Cambridge.
[35]
Marina Meilă. 2003. Comparing clusterings by the variation of information. In Learning theory and kernel machines. Springer, 173--187.
[36]
Darakhshan Mir and Rebecca N Wright. 2012. A differentially private estimator for the stochastic kronecker graph model. In Proceedings of the 2012 Joint EDBT/ICDT Workshops. ACM, 167--176.
[37]
Thông T. Nguyên, Xiaokui Xiao, Yin Yang, Siu Cheung Hui, Hyejin Shin, and Junbum Shin. 2016. Collecting and Analyzing Data from Smart Device Users with Local Differential Privacy. CoRR abs/1606.05053 (2016). http://arxiv.org/abs/ 1606.05053
[38]
Derek de Solla Price. 1976. A general theory of bibliometric and other cumulative advantage processes. Journal of the American society for Information science 27, 5 (1976), 292--306.
[39]
Zhan Qin, Yin Yang, Ting Yu, Issa Khalil, Xiaokui Xiao, and Kui Ren. 2016. Heavy Hitter Estimation over Set-Valued Data with Local Differential Privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS '16). ACM, New York, NY, USA, 192--203. https://doi.org/10.1145/ 2976749.2978409
[40]
William M Rand. 1971. Objective criteria for the evaluation of clustering methods. Journal of the American Statistical association 66, 336 (1971), 846--850.
[41]
Garry Robins, Pip Pattison, Yuval Kalish, and Dean Lusher. 2007. An introduction to exponential random graph (p*) models for social networks. Social networks 29, 2 (2007), 173--191.
[42]
Alessandra Sala, Xiaohan Zhao, Christo Wilson, Haitao Zheng, and Ben Y Zhao. 2011. Sharing graphs using differentially private graph models. In IMC. ACM, 81--98.
[43]
Comandur Seshadhri, Tamara G Kolda, and Ali Pinar. 2012. Community structure and scale-free collections of Erd's-Rényi graphs. Physical Review E 85, 5 (2012), 056109.
[44]
Nguyen Xuan Vinh, Julien Epps, and James Bailey. 2010. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. Journal of Machine Learning Research 11, Oct (2010), 2837--2854.
[45]
Yue Wang and Xintao Wu. 2013. Preserving Differential Privacy in DegreeCorrelation Based Graph Generation. Trans. Data Privacy 6, 2 (Aug. 2013).
[46]
Yue Wang, Xintao Wu, and Leting Wu. 2013. Differential Privacy Preserving Spectral Graph Analysis. In PAKDD. 329--340.
[47]
Stanley L Warner. 1965. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the ASA 60, 309 (1965), 63--69.
[48]
Duncan J Watts and Steven H Strogatz. 1998. Collective dynamics of "smallworld" networks. nature 393, 6684 (1998), 440--442.
[49]
Sijie Xiong, Anand D. Sarwate, and Narayan B. Mandayam. 2016. Randomized requantization with local differential privacy. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20--25, 2016. 2189--2193. https://doi.org/10.1109/ICASSP.2016.7472065
[50]
Jun Zhang, Graham Cormode, Cecilia M. Procopiuc, Divesh Srivastava, and Xiaokui Xiao. 2015. Private Release of Graph Statistics using Ladder Functions. In SIGMOD. 731--745.

Cited By

View all
  • (2025)Privacy Passport: Privacy-Preserving Cross-Domain Data SharingIEEE Transactions on Information Forensics and Security10.1109/TIFS.2024.351579720(636-650)Online publication date: 2025
  • (2024)Publishing number of walks and katz centrality under local differential privacyProceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence10.5555/3702676.3702694(377-393)Online publication date: 15-Jul-2024
  • (2024)Differentially private exact recovery for stochastic block modelsProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3693606(37798-37839)Online publication date: 21-Jul-2024
  • Show More Cited By

Index Terms

  1. Generating Synthetic Decentralized Social Graphs with Local Differential Privacy

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    CCS '17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security
    October 2017
    2682 pages
    ISBN:9781450349468
    DOI:10.1145/3133956
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 30 October 2017

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. community discovery
    2. decentralized social networks
    3. local differential privacy
    4. synthetic graph generation

    Qualifiers

    • Research-article

    Funding Sources

    • QNRF
    • MOE

    Conference

    CCS '17
    Sponsor:

    Acceptance Rates

    CCS '17 Paper Acceptance Rate 151 of 836 submissions, 18%;
    Overall Acceptance Rate 1,261 of 6,999 submissions, 18%

    Upcoming Conference

    CCS '25

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)317
    • Downloads (Last 6 weeks)46
    Reflects downloads up to 07 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2025)Privacy Passport: Privacy-Preserving Cross-Domain Data SharingIEEE Transactions on Information Forensics and Security10.1109/TIFS.2024.351579720(636-650)Online publication date: 2025
    • (2024)Publishing number of walks and katz centrality under local differential privacyProceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence10.5555/3702676.3702694(377-393)Online publication date: 15-Jul-2024
    • (2024)Differentially private exact recovery for stochastic block modelsProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3693606(37798-37839)Online publication date: 21-Jul-2024
    • (2024)Pervasive User Data Collection from Cyberspace: Privacy Concerns and CountermeasuresCryptography10.3390/cryptography80100058:1(5)Online publication date: 31-Jan-2024
    • (2024)Toward Answering Federated Spatial Range Queries Under Local Differential PrivacyInternational Journal of Intelligent Systems10.1155/2024/24082702024:1Online publication date: 26-Oct-2024
    • (2024)Common Neighborhood Estimation over Bipartite Graphs under Local Differential PrivacyProceedings of the ACM on Management of Data10.1145/36988032:6(1-26)Online publication date: 20-Dec-2024
    • (2024)Privacy-Preserved Neural Graph DatabasesProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671678(1108-1118)Online publication date: 25-Aug-2024
    • (2024)DPCAG: A Community Affiliation Graph Generation Model for Preserving Group RelationshipsProceedings of the 33rd ACM International Conference on Information and Knowledge Management10.1145/3627673.3679657(3259-3268)Online publication date: 21-Oct-2024
    • (2024)Improving the Accuracy of Locally Differentially Private Community Detection by Order-consistent Data PerturbationProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657836(1743-1752)Online publication date: 10-Jul-2024
    • (2024)LinkGuard: Link Locally Privacy-Preserving Graph Neural Networks with Integrated Denoising and Private LearningCompanion Proceedings of the ACM Web Conference 202410.1145/3589335.3651533(593-596)Online publication date: 13-May-2024
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media