DOI: 10.1145/3465481.3465482
research-article

Enabling Privacy-Preserving Rule Mining in Decentralized Social Networks

Published: 17 August 2021

ABSTRACT

Decentralized online social networks enhance users’ privacy by empowering them to control their data. However, these networks mostly lack practical solutions for building recommender systems in a privacy-preserving manner that help improve the network’s services. Association rule mining is one of the basic building blocks of many recommender systems. In this paper, we propose an efficient approach that enables rule mining on distributed data. We leverage Metropolis-Hastings random walk sampling and a distributed FP-Growth mining algorithm to maintain users’ privacy. We evaluate our approach on three real-world datasets. Results reveal that the approach achieves high average precision scores with sample sizes as low as 1% in well-connected social networks, along with a remarkable reduction in communication and computational costs.
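To make the two ingredients named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes an undirected, connected friendship graph stored as an adjacency dictionary, and it uses the standard Metropolis-Hastings random walk acceptance rule that targets a uniform distribution over nodes; the abstract does not state which variant the paper uses. For brevity, the mining step counts item pairs by brute force instead of running distributed FP-Growth. The names `mhrw_sample`, `mine_pair_rules`, `friends`, and `likes` are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def mhrw_sample(graph, start, n_samples, rng):
    """Metropolis-Hastings random walk over an undirected graph.

    A uniformly chosen neighbour is proposed at each step and accepted with
    probability min(1, deg(current) / deg(candidate)); otherwise the walk
    stays where it is. The node occupied after each step is recorded, so a
    node may appear several times in the result.
    """
    current = start
    samples = []
    while len(samples) < n_samples:
        candidate = rng.choice(graph[current])
        ratio = len(graph[current]) / len(graph[candidate])
        if rng.random() < ratio:  # always true when ratio >= 1
            current = candidate
        samples.append(current)
    return samples


def mine_pair_rules(transactions, min_support, min_confidence):
    """Brute-force mining of rules x -> y over item pairs.

    This is a stand-in for FP-Growth (an assumption made for brevity): it
    only considers itemsets of size two. Returns a list of tuples
    (antecedent, consequent, support, confidence).
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


# Hypothetical toy network: each user keeps a local list of liked items.
friends = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
likes = {1: ["jazz", "vinyl"], 2: ["jazz", "vinyl"], 3: ["jazz"], 4: ["vinyl"]}

walk = mhrw_sample(friends, start=1, n_samples=20, rng=random.Random(7))
sampled_transactions = [likes[user] for user in walk]
rules = mine_pair_rules(sampled_transactions, min_support=0.2, min_confidence=0.5)
```

In this sketch only the users visited by the walk contribute their item lists, which is the intuition behind mining on a small sample rather than on every user's data. How the paper distributes the mining and protects the contributed data is not described in the abstract and is not modeled here.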


  • Published in

    ARES '21: Proceedings of the 16th International Conference on Availability, Reliability and Security
    August 2021
    1447 pages
    ISBN:9781450390514
    DOI:10.1145/3465481

    Copyright © 2021 ACM


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 228 of 451 submissions, 51%