ABSTRACT
Recommendations from the long tail of the popularity distribution of items are generally considered to be particularly valuable. On the other hand, recommendation accuracy tends to decrease towards the long tail. In this paper, we quantitatively examine this trade-off between item popularity and recommendation accuracy. To this end, we assume that there is a selection bias towards popular items in the available data. This allows us to define a new accuracy measure that can be gradually tuned towards the long tail. We show that, under this assumption, this measure has the desirable property of providing nearly unbiased estimates concerning recommendation accuracy. In turn, this also motivates a refinement for training collaborative-filtering approaches. In various experiments with real-world data, including a user study, the empirical evidence suggests that users appreciate only a small bias, if any, of the recommendations towards less popular items.
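The idea of an accuracy measure that can be tuned towards the long tail can be illustrated with a popularity-weighted variant of recall. The following is only a minimal sketch, not the measure as defined in the paper: the function name, the `popularity` counts, and the weighting 1 / popularity**beta with a tunable exponent `beta` are assumptions made for illustration, since the abstract does not state the exact form.

```python
def popularity_weighted_recall(top_k, relevant, popularity, beta=0.0):
    """Recall@k in which each relevant item i counts with weight
    1 / popularity[i]**beta (an assumed, illustrative weighting).

    beta = 0 gives ordinary recall; a larger beta shifts the weight
    towards less popular (long-tail) items.
    """
    # Weight of every relevant item; popularity counts must be positive.
    weights = {item: 1.0 / popularity[item] ** beta for item in relevant}
    total = sum(weights.values())
    if total == 0:
        return 0.0
    # Sum the weights of the relevant items that were actually recommended.
    hits = sum(weights[item] for item in set(top_k) if item in weights)
    return hits / total


# Hypothetical toy data: one popular and one long-tail item are relevant,
# but only the popular one is recommended.
popularity = {"blockbuster": 100, "niche": 4}
relevant = {"blockbuster", "niche"}
top_k = ["blockbuster"]

plain = popularity_weighted_recall(top_k, relevant, popularity, beta=0.0)
tail = popularity_weighted_recall(top_k, relevant, popularity, beta=0.5)
print(plain, tail)
```

With `beta = 0` the recommended popular item yields a recall of one half. With `beta = 0.5` the missed long-tail item carries more weight, so the same recommendation list scores lower.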