DOI: 10.1145/3465481.3465482
research-article

Enabling Privacy-Preserving Rule Mining in Decentralized Social Networks

Published: 17 August 2021

Abstract

Decentralized online social networks enhance users’ privacy by empowering them to control their data. However, these networks mostly lack practical solutions for building privacy-preserving recommender systems that help improve the network’s services. Association rule mining is one of the basic building blocks of many recommender systems. In this paper, we propose an efficient approach that enables rule mining on distributed data. We leverage Metropolis-Hastings random walk sampling and the distributed FP-Growth mining algorithm to maintain the users’ privacy. We evaluate our approach on three real-world datasets. Results reveal that the approach achieves high average precision scores for sample sizes as low as 1% in well-connected social networks, with a remarkable reduction in communication and computational costs.
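
To make the two ingredients named in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes an undirected, connected toy graph stored as adjacency lists and a hypothetical `items` mapping from users to item sets. The walk uses the standard Metropolis-Hastings acceptance rule for a uniform target distribution over nodes, and a brute-force support counter stands in for FP-Growth. All names (`mh_random_walk`, `frequent_itemsets`, `graph`, `items`) are illustrative.

```python
import random
from collections import Counter
from itertools import combinations


def mh_random_walk(graph, start, num_samples, rng):
    """Metropolis-Hastings random walk with a uniform target over nodes.

    A move from `current` to a random neighbour `candidate` is accepted
    with probability min(1, deg(current) / deg(candidate)); otherwise the
    walk stays put and the current node is sampled again.
    """
    current = start
    samples = []
    while len(samples) < num_samples:
        candidate = rng.choice(graph[current])
        if rng.random() < len(graph[current]) / len(graph[candidate]):
            current = candidate
        samples.append(current)
    return samples


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force support counting (a simple stand-in for FP-Growth)."""
    counts = Counter()
    for transaction in transactions:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(transaction), size):
                counts[itemset] += 1
    total = len(transactions)
    return {s: c / total for s, c in counts.items() if c / total >= min_support}


# Hypothetical toy network: undirected edges, one item set per user.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["a", "c", "e"],
    "e": ["d"],
}
items = {
    "a": {"music", "hiking"},
    "b": {"music", "films"},
    "c": {"music", "hiking"},
    "d": {"films"},
    "e": {"music", "hiking", "films"},
}

rng = random.Random(42)
visited = mh_random_walk(graph, start="a", num_samples=20, rng=rng)
sample_transactions = [items[user] for user in visited]
print(frequent_itemsets(sample_transactions, min_support=0.4))
```

Rejected moves repeat the current node in the sample, which is what removes the bias of a plain random walk towards high-degree users. In a decentralized setting each step would only require a node to know its own degree and that of the chosen neighbour, which is one reason this kind of walk is attractive there.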

Information & Contributors

Information

Published In

ARES '21: Proceedings of the 16th International Conference on Availability, Reliability and Security
August 2021
1447 pages
ISBN: 9781450390514
DOI: 10.1145/3465481
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. decentralized online social networks
  2. frequent itemset mining
  3. privacy-preserving rule mining

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ARES 2021

Acceptance Rates

Overall Acceptance Rate 228 of 451 submissions, 51%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 102
  • Downloads (Last 12 months): 11
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 23 Feb 2025
