Privacy-aware smart city: A case study in collaborative filtering recommender systems

https://doi.org/10.1016/j.jpdc.2017.12.015

Highlights

  • Contribution to privacy-aware smart city research by introducing the data matrix as the underlying data format.

  • Accomplishing robust privacy preservation in CF with increased recommendation accuracy.

  • Presenting the parallel implementation of k-coRating and the corresponding empirical results.

Abstract

Ensuring privacy in recommender systems for smart cities remains a research challenge, and in this paper we study collaborative filtering recommender systems for privacy-aware smart cities. Specifically, we use the rating matrix to establish connections between a privacy-aware smart city and k-coRating, a novel privacy-preserving rating data publishing model. First, we model privacy concerns in a smart city as the problem of privacy-preserving collaborative filtering recommendation. Then, we introduce k-coRating to address privacy concerns in published rating matrices, by filling the null ratings with predicted scores. This allows us to mask the original ratings to preserve k-anonymity-like data privacy, and enhance data utility (quantified using prediction accuracy in this paper). We show that the optimal k-coRated mapping is an NP-hard problem and design an efficient greedy algorithm to achieve k-coRating. We then demonstrate the utility of our approach empirically.

Introduction

Building smart cities enhances the efficiency of e-government, business-to-business, and business-to-citizen services, improves the quality of life and experience, and facilitates decision making and service delivery [58]. Achieving these goals involves the integration of multiple information and communication technologies (ICT).

Recommendation technology plays an important role in implementing a smart city [[37], [49]]. It helps optimize the running of transportation systems, recommend attractions to tourists, search for the best services for community residents, match friends to users through social networks, and so forth [[21], [64]]. Collaborative Filtering (CF) provides recommendations to active users (e.g., users accepting recommendation services) based on the preferences of their like-minded neighbors. It has been one of the most successful recommendation technologies of the past decades [61].

With the rapid evolution of CF in conjunction with smart cities, concerns about data privacy are arising [[32], [41]]. In this paper, we study privacy issues in CF recommender systems in order to build privacy-aware smart cities.

This paper is an extension of our previous work [62]. Here, we extend the application scenario of k-coRating to smart cities, demonstrating how to use it to build a privacy-aware smart city. We also introduce the basic idea of k-coRating and the basic algorithms that implement it, including GeCom (Generate k-coRated Matrix) and its parallel version, PaGecom. In addition, we present the detailed proof of the NP-hardness of k-coRating matching, together with a more thorough privacy analysis and detailed proofs.

This is the first known study to accomplish the following:

(1) Contribution to privacy-aware smart city research by introducing the data matrix as the underlying data format.

(2) Accomplishing robust privacy preservation in CF with increased recommendation accuracy.

In Section 2, we introduce a matrix method to connect recommendations and smart cities. In Section 3, we briefly introduce collaborative filtering recommendation and the problem definitions. In Section 4, we introduce the trust derivation method and the k-coRated privacy-preserving algorithm. The experimental methods and results are presented in Section 5. Related work is discussed in Section 6. Finally, the last section concludes the paper and discusses future work.

Section snippets

Recommendations to improve the smartness of a city

CF recommendation predicts the best information for active users based on the tastes of like-minded users. A CF recommendation system works as follows. Based on the similarities between user–item rating vectors, it attempts to find the active user’s nearest neighbors. Then, it produces recommendations based on the nearest neighbors’ preferences. CF is widely applied in areas such as electronic commerce, social network services, music recommendation, intelligent transportation systems, and movie

Basics on collaborative filtering recommendation

Recommendation systems can be classified into three categories: content-based filtering, collaborative filtering, and hybrid filtering. Content-based filtering produces recommendations for active users according to the similarities among items. Collaborative filtering, however, provides recommendations based on the preferences of other like-minded users. Hybrid filtering combines the two methods to produce recommendations [5].
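As an illustration of the collaborative filtering category, the following is a minimal sketch of user-based CF, not the implementation evaluated in this paper. It assumes cosine similarity between sparse rating vectors and a similarity-weighted average over the nearest neighbors; the function names, the `n_neighbors` parameter, and the toy ratings are all hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating dicts {item: score}."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict(ratings, user, item, n_neighbors=2):
    """Similarity-weighted average of the nearest neighbors' scores for `item`.

    Returns None when no other user has rated the item.
    """
    candidates = [
        (cosine_similarity(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    # Keep only the most similar users (the "like-minded neighborhood").
    nearest = sorted(candidates, key=lambda p: p[0], reverse=True)[:n_neighbors]
    total = sum(sim for sim, _ in nearest)
    if total == 0:
        return None
    return sum(sim * score for sim, score in nearest) / total


# Hypothetical toy data: three users rating city attractions on a 1-5 scale.
ratings = {
    "alice": {"museum": 5, "park": 4},
    "bob": {"museum": 5, "park": 4, "zoo": 2},
    "carol": {"museum": 1, "park": 2, "zoo": 5},
}
prediction = predict(ratings, "alice", "zoo")
```

In this toy data, alice's ratings resemble bob's far more than carol's, so the predicted score for the zoo is pulled toward bob's low rating rather than carol's high one.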

The underlying rationale behind CF is the assumption that users like

Methodologies of PPCF

This section first gives formal definitions for the PPCF problem and proves that the optimal k-coRated mapping is NP-hard. It then examines how to derive trusts to generate filling data: additional ratings which help to disguise the original ratings. Finally and most importantly, since the optimal k-coRated mapping is NP-hard, a greedy algorithm based on the trust-based prediction to achieve k-coRating is introduced.
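The idea of filling null ratings until groups of users share the same rated items can be sketched as follows. This is a simplified stand-in and not GeCom itself: it assumes a matrix counts as k-coRated when every user's set of rated items is shared by at least k users in total (our reading of the k-anonymity-like property; the formal definition is in the full text), it groups users with a naive sort-and-chunk heuristic, and it uses the user's mean score as a placeholder for the trust-based prediction. All function names are hypothetical.

```python
def is_k_corated(ratings, k):
    """True if every user's rated-item set is shared by at least k users."""
    counts = {}
    for items in ratings.values():
        key = frozenset(items)
        counts[key] = counts.get(key, 0) + 1
    return all(c >= k for c in counts.values())


def fill_to_k_corated(ratings, k, predict):
    """Greedy sketch: chunk users into groups of >= k and fill each group's nulls.

    Assumes there are at least k users; otherwise no grouping can succeed.
    """
    # Heuristic: sorting by rated items tends to put similar users together.
    users = sorted(ratings, key=lambda u: sorted(ratings[u]))
    groups = [users[i:i + k] for i in range(0, len(users), k)]
    # A short trailing group is merged into the previous one.
    if len(groups) > 1 and len(groups[-1]) < k:
        groups[-2].extend(groups.pop())
    filled = {}
    for group in groups:
        union = set().union(*(ratings[u] for u in group))
        for u in group:
            row = dict(ratings[u])  # original ratings are kept unchanged
            for item in union - set(row):
                row[item] = predict(ratings, u, item)  # filling data
            filled[u] = row
    return filled


def user_mean(ratings, user, item):
    """Placeholder predictor (assumption): the user's own mean score."""
    scores = list(ratings[user].values())
    return sum(scores) / len(scores)


# Hypothetical toy matrix: no two users rated the same set of items.
ratings = {
    "u1": {"a": 4, "b": 2},
    "u2": {"a": 5, "c": 3},
    "u3": {"b": 1},
    "u4": {"c": 4, "d": 2},
}
filled = fill_to_k_corated(ratings, 2, user_mean)
```

After filling, each user in a group has rated exactly the same items as the others in that group, so the set of rated items alone no longer singles out an individual. The choice of grouping determines how much filling data is needed, which is where the greedy algorithm in this section comes in.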

Experiments

This section reports experimental results to demonstrate that the proposed method can conduct privacy-preserving CF recommendation while improving accuracy. We attribute the retained privacy and the improved recommendation accuracy to the filling data (i.e., the predicted rating scores).
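Prediction accuracy of this kind is commonly measured on held-out ratings. The sketch below assumes mean absolute error (MAE) as the metric, which this excerpt does not confirm; the function name and the sample numbers are hypothetical and are not results from the paper.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and held-out ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical held-out ratings and the scores a CF model predicted for them.
held_out = [4, 2, 5, 3]
predicted = [3.5, 2.5, 4.0, 3.0]
mae = mean_absolute_error(predicted, held_out)
```

A lower MAE on the filled matrix than on the original matrix would correspond to the accuracy improvement described above.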

Related work

A smart city is the key to transforming a traditional city into one that is more efficient, more reliable, and more secure. A smart city is characterized by environmental sustainability, social sustainability, regional competitiveness, natural resources management, cybersecurity, and an improved quality of life [[29], [58]].

The recommender system is one key technology to implement smart cities. Luberg et al. designed a rule-based tourist recommendation system in the context of smart cities [37]. Negre

Conclusions

In this paper, we demonstrate the research connection between privacy-preserving CF and privacy-aware smart cities, and challenge the traditional assumption that accuracy and privacy are conflicting goals [[9], [34]].

Although the goals of the study were achieved, future research opportunities also became apparent. Privacy-preserving CF is just one channel to enter privacy-aware smart city research, and there are definitely other channels to accomplish the task. k-coRating is an elegant

Acknowledgments

The study is partially supported by the National Natural Science Foundation of China under Grant Nos. U1711266, U1711267, and 61672029, the Natural Science Foundation of Hubei Province under Grant No. 2015CFB450, and the Fundamental Research Funds for National University under Grant No. 1610491B22, China University of Geosciences (Wuhan).

Feng Zhang received the Ph.D. degree in computer science from Sun Yat-sen University, China, in 2008. He worked at Kent State University as a visiting scholar in 2012. Currently, he is an associate professor at China University of Geosciences, Wuhan, China. His research interests include HPC, privacy-preserving data mining, geoscience data processing, and big data.

References (66)

  • R. Agrawal, R. Srikant, Privacy preserving data mining, in: Proceedings of the 2000 ACM SIGMOD International Conference...
  • R. Andersen, C. Borgs, J. Chayes, U. Feige, A. Flaxman, A. Kalai, V. Mirrokni, M. Tennenholtz, Trust-based...
  • S. Ansari, R. Kohavi, L. Mason, Z. Zheng, Integrating E-commerce and data mining: Architecture and challenges, in:...
  • A. Basu, J. Vaidya, H. Kikuchi, Perturbation based privacy preserving slope one predictors for collaborative filtering,...
  • S. Berkovsky, Y. Eytani, T. Kuflik, F. Ricci, Enhancing privacy and preserving accuracy of a distributed collaborative...
  • J.S. Breese, D. Heckerman, C. Kadie, Empirical analysis of predictive algorithms for collaborative filtering, in:...
  • J. Brickell, V. Shmatikov, The cost of privacy: destruction of data-mining utility in anonymized data publishing, in:...
  • J. Canny, Collaborative filtering with privacy, in: Proceedings of the 2002 IEEE Symposium on Security and Privacy,...
  • J. Canny, Collaborative filtering with privacy via factor analysis, in: Proceedings of the 25th Annual International...
  • G. Carullo et al., A triadic closure and homophily-based recommendation system for online social networks, World Wide Web (2015)
  • G. Carullo, A. Castiglione, A.D. Santis, Friendship recommendations in online social networks, in: 2014 International...
  • C. Cheng et al., Securing internet of things in a quantum world, IEEE Commun. Mag. (2017)
  • Z. Deng et al., Parallel processing of dynamic continuous queries over streaming data flows, IEEE Trans. Parallel Distrib. Syst. (2015)
  • C. Dwork, Differential privacy, in: Proceedings of 2006 International Colloquium on Automata, Languages and...
  • C. Dwork, F. McSherry, K. Nissim, A. Smith, Calibrating noise to sensitivity in private data analysis, in: Proceedings...
  • J.A. Golbeck, Computing and Applying Trust in Web-based Social Networks (2005)
  • M. Hay, V. Rastogi, G. Miklau, D. Suciu, Boosting the accuracy of differentially private histograms through...
  • C.-S. Hwang, Y.-P. Chen, Using trust in collaborative filtering recommendation, in: Proceedings of the 20th...
  • M. Jamali, M. Ester, TrustWalker: A random walk model for combining trust-based and item-based recommendation, in:...
  • Z. Ji et al., Differential privacy based on importance weighting, Mach. Learn. (2013)
  • H. Kargupta et al., Random data perturbation techniques and privacy preserving data mining, Knowl. Inf. Syst. J. (2005)
  • S.P. Kasiviswanathan, K. Nissim, S. Raskhodnikova, A. Smith, Analyzing graphs with node differential privacy, in:...
  • A. Khoshkbarforoushha et al., Distribution based workload modelling of continuous queries in clouds, IEEE Trans. Emerg. Top. Comput. (2017)

    Victor E. Lee received the B.S. degree in computer science and engineering from the University of California, Berkeley, the M.S. degree in electrical engineering from Stanford University, and the Ph.D. degree in computer science from Kent State University in 2012. Currently, he is a senior researcher at GraphSQL Inc., Mountain View, CA, USA. His general research areas include social and complex networks, big data analytics, and high performance computing.

    Ruoming Jin received the Ph.D. degree in computer science and engineering from the Ohio State University in 2005. Currently, he is an associate professor at Kent State University. His general research areas are data mining, databases, and big data. He has published more than 90 technical papers in these areas, most appearing in the top conferences and journals. He is a recipient of the prestigious US NSF CAREER award.

    Saurabh Garg is a lecturer at the University of Tasmania, Australia. He is one of the few Ph.D. students who completed the degree in less than three years at the University of Melbourne. He has published more than 40 papers in highly cited journals and conferences. During his Ph.D. candidature, he was awarded various special scholarships. His research interests include resource management, scheduling, utility and grid computing, Cloud computing, green computing, wireless networks, and ad hoc networks.

    Kim-Kwang Raymond Choo received the Ph.D. in Information Security in 2006 from Queensland University of Technology, Australia. He currently holds the Cloud Technology Endowed Professorship at The University of Texas at San Antonio. He serves on the editorial board of Computers & Electrical Engineering, Cluster Computing, Digital Investigation, IEEE Access, IEEE Cloud Computing, IEEE Communications Magazine, Future Generation Computer Systems, Journal of Network and Computer Applications, PLoS ONE, Soft Computing, etc. He is also a Fellow of the Australian Computer Society, and an IEEE Senior Member.

    Michele Maasberg holds the Herbert H. McElveen Endowed Professorship in the Department of Computer Information Systems at Louisiana Tech University. She received a Ph.D. in Business Administration with a concentration in Information Technology and an M.S. in Information Technology – Information Assurance from The University of Texas at San Antonio, USA, and a B.S. from the U.S. Naval Academy, Annapolis, MD, USA. Her research covers cybersecurity issues in insider threats (behavioral and technical), malware (threat metrics, prediction of higher order behaviors), and data breaches. She has published in International Journal of Human–Computer Studies, Information Systems Frontiers, and Journal of Computer Information Systems, among others. She is a Certified Information Systems Security Professional (CISSP).

    Lijun Dong received the B.Sc. degree in Mechatronic Engineering from Nanjing University of Science & Technology, China, in 1999, and the Ph.D. degree in Computer Science and Technology from Huazhong University of Science and Technology, China, in 2008. He is currently an associate professor in the School of Computer Science, China University of Geosciences, China. His research interests include network sciences and applications, satellite networks, and information system security.

    Chi Cheng received the B.S. and M.S. degrees in Mathematics from Hubei University, Wuhan, P.R. China, in 2003 and 2006, respectively, and the Ph.D. degree in information and communication engineering from Huazhong University of Science and Technology, Wuhan, P.R. China, in December 2013. From November 2014 to November 2016, he was a postdoctoral researcher of the Japan Society for the Promotion of Science (JSPS) at Kyushu University, Japan. He is currently an associate professor in the School of Computer Science, China University of Geosciences (Wuhan), China. His research interests focus on applied cryptography and network security.

    This is an extended version of the conference paper Zhang et al. (2014), with more than 50% new content.