Privacy preservation for recommendation databases

  • Special Issue Paper
Service Oriented Computing and Applications

Abstract

Recommendation systems play an important role in the ongoing digital transformation, so the privacy of the individuals whose data these systems collect must be protected effectively. In this paper, we identify a vulnerability in the existing query framework for recommendation systems. To address it, we propose applying the well-known k-anonymity model to generalize a given recommendation database so that it satisfies the privacy preservation constraint. We show that the problem of generalizing the data while minimizing the impact on data utility is NP-hard. To tackle this problem, we propose an algorithm that preserves the privacy of the individuals in recommendation databases. The idea is to avoid excessive generalization by forming groups of similar tuples, so that the impact of generalizing each group on data utility is minimized. We evaluate our work through extensive experiments. The results show that our approach is highly effective, i.e., the impact quantified by the data utility metrics and the errors of the query results are lower than those of the compared algorithms, and highly efficient, i.e., its execution time is more than three times lower than that of the algorithm with comparable effectiveness.
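
The grouping idea described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper: it is a simple greedy heuristic, assuming purely numeric quasi-identifiers and a sum-of-absolute-differences distance. The attribute names (`age`, `zip_prefix`), the function names, and the sample records are all hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Record = Tuple[int, ...]  # hypothetical numeric quasi-identifiers, e.g. (age, zip_prefix)


def distance(a: Record, b: Record) -> int:
    """Sum of absolute differences; an assumed stand-in for tuple similarity."""
    return sum(abs(x - y) for x, y in zip(a, b))


def group_similar(records: Sequence[Record], k: int) -> List[List[Record]]:
    """Greedily form groups of at least k mutually similar records."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = sorted(records)
    groups: List[List[Record]] = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # Pick the k - 1 records closest to the seed.
        remaining.sort(key=lambda r: distance(seed, r))
        groups.append([seed] + remaining[:k - 1])
        remaining = remaining[k - 1:]
    # Fewer than k records are left: attach each to the group with the nearest seed.
    for r in remaining:
        min(groups, key=lambda g: distance(g[0], r)).append(r)
    return groups


def generalize(groups: List[List[Record]]) -> list:
    """Replace every value in a group by the (min, max) range of its attribute."""
    rows = []
    for group in groups:
        ranges = tuple((min(col), max(col)) for col in zip(*group))
        rows.extend([ranges] * len(group))
    return rows


def is_k_anonymous(rows: Sequence, k: int) -> bool:
    """True if every distinct row occurs at least k times."""
    return all(count >= k for count in Counter(rows).values())


# Hypothetical sample data: (age, zip_prefix)
sample = [(23, 501), (25, 501), (24, 502), (41, 620), (43, 621), (45, 620), (44, 622)]
generalized = generalize(group_similar(sample, k=3))
print(is_k_anonymous(generalized, 3))  # True
```

Because each group holds records that are already close to one another, the ranges that replace the original values stay narrow, which is the intuition behind limiting the loss of data utility.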

Notes

  1. The MAX and MIN functions each return a single value as the query result: MAX returns the maximum, and MIN the minimum, of the set of values that satisfy the query condition. They are designed to support various data domains such as \(\textit{numeric}\), \(\textit{character}\), \(\textit{unique-identifier}\), and \(\textit{date-time}\), as documented at https://msdn.microsoft.com/en-us/library/ms187751.aspx.
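
The behaviour described in this note can be tried out with a small Python sketch. It uses the standard-library `sqlite3` module as a stand-in for the database system documented at the link above, so type handling may differ from that system. The table `ratings`, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical ratings table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (user_name TEXT, age INTEGER, rating REAL)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [("alice", 25, 4.5), ("bob", 34, 3.0), ("carol", 41, 5.0), ("dave", 38, 2.5)],
)


def single_value(sql: str, params: tuple = ()):
    """Run an aggregate query and return its single result value."""
    return conn.execute(sql, params).fetchone()[0]


# MAX and MIN each return one value from the rows satisfying the condition.
max_rating = single_value("SELECT MAX(rating) FROM ratings WHERE age >= ?", (30,))
min_rating = single_value("SELECT MIN(rating) FROM ratings WHERE age >= ?", (30,))
# The same functions also apply to a character domain (lexicographic order).
first_name = single_value("SELECT MIN(user_name) FROM ratings WHERE age >= ?", (30,))

print(max_rating, min_rating, first_name)  # 5.0 2.5 bob
```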

Author information

Corresponding author

Correspondence to Juggapong Natwichai.

About this article

Cite this article

Riyana, S., Natwichai, J. Privacy preservation for recommendation databases. SOCA 12, 259–273 (2018). https://doi.org/10.1007/s11761-018-0248-y

  • DOI: https://doi.org/10.1007/s11761-018-0248-y
