Abstract
One-class collaborative filtering (OCCF) is an emerging setup in collaborative filtering in which only positive examples or implicit feedback can be observed. Compared with the traditional collaborative filtering setting, where the data include ratings, OCCF is more realistic in the many scenarios where no ratings are available. In this paper, we propose to improve OCCF accuracy by exploiting the rich user information that is often naturally available in community-based interactive information systems, including a user’s search query history and purchasing and browsing activities. We propose two major strategies for incorporating such user information into OCCF models: one linearly combines scores from different sources, and the other embeds user information into collaborative filtering. Furthermore, we employ the MapReduce framework for similarity computation over millions of users and items. Experimental results on two large-scale retail datasets from a major e-commerce company show that the proposed methods are effective and improve OCCF performance over baseline methods by leveraging rich user information.
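As a rough illustration of the first strategy, linearly combining scores from different sources, the following minimal sketch may help. It is not the paper's implementation: the min-max normalization step, the function names (`min_max_normalize`, `combine_scores`, `rank_items`), the example weights, and the toy scores are all assumptions made for illustration.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one source's scores to [0, 1] (assumed normalization).

    If every score is identical, all items map to 0.0.
    """
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 0.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def combine_scores(
    sources: List[Dict[str, float]], weights: List[float]
) -> Dict[str, float]:
    """Weighted sum of normalized per-item scores across sources.

    An item missing from a source contributes 0 for that source.
    """
    if len(sources) != len(weights):
        raise ValueError("need exactly one weight per score source")
    combined: Dict[str, float] = {}
    for source, weight in zip(sources, weights):
        for item, score in min_max_normalize(source).items():
            combined[item] = combined.get(item, 0.0) + weight * score
    return combined


def rank_items(combined: Dict[str, float]) -> List[str]:
    """Sort items by descending combined score, breaking ties by item id."""
    return sorted(combined, key=lambda item: (-combined[item], item))


# Hypothetical scores for one user: one set from a collaborative filtering
# model, one from matching the user's search query history against items.
cf_scores = {"a": 4.0, "b": 0.0, "c": 2.0}
query_scores = {"b": 3.0, "c": 1.0, "d": 2.0}

combined = combine_scores([cf_scores, query_scores], weights=[0.75, 0.25])
ranking = rank_items(combined)
```

In this sketch the weights are fixed by hand; in practice they would presumably be tuned on held-out data, and other normalization schemes could be substituted without changing the overall structure.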




Acknowledgments
This paper is based upon work supported in part by the National Science Foundation under grant CNS-0834709.
Cite this article
Li, Y., Zhai, C. & Chen, Y. Exploiting rich user information for one-class collaborative filtering. Knowl Inf Syst 38, 277–301 (2014). https://doi.org/10.1007/s10115-012-0583-9
DOI: https://doi.org/10.1007/s10115-012-0583-9