Enabling Privacy-Preserving Rule Mining in Decentralized Social Networks

Authors:

Aidmar Wainakh,

Aleksej Strassheim,

Max Mühlhäuser

ARES '21: Proceedings of the 16th International Conference on Availability, Reliability and Security

Article No.: 27, Pages 1 - 11

https://doi.org/10.1145/3465481.3465482

Published: 17 August 2021

Abstract

Decentralized online social networks enhance users’ privacy by empowering them to control their data. However, these networks mostly lack practical solutions for building privacy-preserving recommender systems that help improve the network’s services. Association rule mining is one of the basic building blocks of many recommender systems. In this paper, we propose an efficient approach that enables rule mining on distributed data. We leverage Metropolis-Hastings random walk sampling and the distributed FP-Growth mining algorithm to maintain users’ privacy. We evaluate our approach on three real-world datasets. Results reveal that the approach achieves high average precision scores for sample sizes as low as 1% in well-connected social networks, with a remarkable reduction in communication and computational costs.
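To make the two ingredients named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes an undirected, connected friendship graph stored as a Python dictionary, and it replaces distributed FP-Growth with a brute-force count of item pairs so that the example stays short. All names (`mh_random_walk`, `pair_rules`, `graph`, `user_items`) and the thresholds are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def mh_random_walk(graph, start, num_steps, rng):
    """Metropolis-Hastings random walk with a uniform target over nodes.

    graph maps each node to a list of its neighbours (assumed undirected
    and connected). Returns every visited node, repeats included.
    """
    current = start
    visited = [current]
    for _ in range(num_steps):
        candidate = rng.choice(graph[current])
        # Moving to a higher-degree neighbour is accepted less often,
        # which cancels the degree bias of a plain random walk.
        accept = min(1.0, len(graph[current]) / len(graph[candidate]))
        if rng.random() < accept:
            current = candidate
        visited.append(current)  # a rejected move re-samples the current node
    return visited


def pair_rules(transactions, min_support, min_confidence):
    """Mine one-to-one rules (lhs -> rhs) by counting item pairs.

    This is a simple stand-in for FP-Growth, not FP-Growth itself.
    Returns tuples (lhs, rhs, support, confidence).
    """
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for items in transactions:
        unique = sorted(set(items))
        item_count.update(unique)
        pair_count.update(combinations(unique, 2))
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical toy network: each user holds their own item list locally.
graph = {
    "u1": ["u2", "u3", "u4"],
    "u2": ["u1", "u3"],
    "u3": ["u1", "u2", "u4"],
    "u4": ["u1", "u3", "u5"],
    "u5": ["u4"],
}
user_items = {
    "u1": ["music", "films"],
    "u2": ["music", "films", "books"],
    "u3": ["music", "books"],
    "u4": ["films", "music"],
    "u5": ["books"],
}

walk = mh_random_walk(graph, "u1", 200, random.Random(0))
sample = walk[50:]  # drop an arbitrary burn-in prefix (assumed length)
rules = pair_rules([user_items[u] for u in sample], 0.3, 0.6)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

The acceptance ratio `deg(current) / deg(candidate)` is the standard choice when the target distribution is uniform over nodes; the burn-in length and the thresholds here are arbitrary and would need tuning on real data.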

Published In

ARES '21: Proceedings of the 16th International Conference on Availability, Reliability and Security

August 2021

1447 pages

ISBN:9781450390514

DOI:10.1145/3465481

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Deutsche Forschungsgemeinschaft

Conference

ARES 2021

ARES 2021: The 16th International Conference on Availability, Reliability and Security

August 17 - 20, 2021

Vienna, Austria

Acceptance Rates

Overall Acceptance Rate 228 of 451 submissions, 51%
