Abstract
The problem of information overload on the Internet has increased the need for personalized information retrieval (PIR) systems capable of providing information that corresponds to the user’s interests. For most people, however, the word personalization comes with trust issues and privacy concerns, since giving users a personalized browsing experience usually comes at the cost of their privacy. As a result, most people are wary of using such applications. To address this issue, we propose a new model for privacy protection in PIR systems. Our model aims to achieve a trade-off between personalization quality and privacy risk that keeps the latter under control. We study the strengths and drawbacks of the existing profile-based PIR structures from a privacy protection perspective, along with the possible privacy threats in this field, using a threat modeling approach. The model we propose is based on the vector space model and targets profile-based PIR systems. It uses query expansion and re-ranking algorithms on the client side to ensure personalization quality, while privacy protection is ensured during the personalization process by taking the user’s privacy requirements into consideration and through encryption. We use the Advanced Encryption Standard (AES) algorithm to protect user data at rest and a fully homomorphic encryption (FHE) scheme to protect data in transit and in use. To demonstrate the feasibility and efficiency of our model, this paper includes a proof-of-concept implementation with experimental results.
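As a rough illustration of client-side re-ranking in the vector space model, the following minimal sketch scores result snippets against a user profile vector by cosine similarity. It is not the authors' algorithm: the raw term-frequency weighting, the function names (`to_vector`, `cosine`, `rerank`), and the sample data are all assumptions made for illustration, and the encryption layers (AES, FHE) are omitted entirely.

```python
import math
from collections import Counter


def to_vector(text):
    """Map text to a sparse term-frequency vector (assumed weighting scheme)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rerank(results, profile):
    """Order result snippets by descending similarity to the profile vector."""
    return sorted(results, key=lambda r: cosine(to_vector(r), profile), reverse=True)


# Hypothetical profile and search results, kept on the client.
profile = to_vector("python programming language tutorial")
results = [
    "python snake habitat and diet",
    "python programming tutorial for beginners",
    "monty python sketches",
]
ranked = rerank(results, profile)
```

Because the profile vector never leaves the client in this sketch, the search engine only sees the query, which is the intuition behind performing the re-ranking step locally.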
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
El-Ansari, A., Beni-Hssane, A., Saadi, M. et al. PAPIR: privacy-aware personalized information retrieval. J Ambient Intell Human Comput 12, 9891–9907 (2021). https://doi.org/10.1007/s12652-020-02736-y
DOI: https://doi.org/10.1007/s12652-020-02736-y