Evaluating Recommender Systems

Chapter in: Recommender Systems Handbook

Abstract

Recommender systems are now popular both commercially and in the research community, where many approaches have been suggested for providing recommendations. In many cases a system designer who wishes to employ a recommender system must choose among a set of candidate approaches. A first step towards selecting an appropriate algorithm is to decide which properties of the application to focus upon when making this choice. Indeed, recommender systems have a variety of properties that may affect user experience, such as accuracy, robustness, scalability, and so forth. In this chapter we discuss how to compare recommenders based on a set of properties that are relevant for the application. We focus on comparative studies, where a few algorithms are compared using some evaluation metric, rather than on absolute benchmarking of algorithms. We describe experimental settings appropriate for making choices between algorithms. We review three types of experiments: an offline setting, where recommendation approaches are compared without user interaction; user studies, where a small group of subjects experiments with the system and reports on the experience; and large-scale online experiments, where real user populations interact with the system. In each of these settings we describe the types of questions that can be answered and suggest protocols for experimentation. We also discuss how to draw trustworthy conclusions from the conducted experiments. We then review a large set of properties and explain how to evaluate systems with respect to the relevant ones. Finally, we survey a large set of evaluation metrics in the context of the property that each evaluates.
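
To make the offline setting described above concrete, here is a minimal, self-contained sketch, not taken from the chapter, of how two candidate recommenders might be compared on held-out data using per-user precision@k and a paired sign test. The recommenders, the synthetic test data, and all helper names (rec_random, rec_biased, evaluate, paired_sign_test) are hypothetical stand-ins for illustration, not the chapter's protocol.

```python
# Illustrative sketch only (not from the chapter): compare two hypothetical
# recommenders offline on held-out items, using per-user precision@k and a
# crude paired sign test. All data and recommenders below are synthetic.
import random
from math import comb
from statistics import mean
from typing import Callable, Dict, List, Set


def precision_at_k(recommended: List[int], relevant: Set[int], k: int = 10) -> float:
    """Fraction of the top-k recommended items that appear in the held-out set."""
    top_k = recommended[:k]
    return sum(item in relevant for item in top_k) / k if top_k else 0.0


def evaluate(recommend: Callable[[int], List[int]],
             test_items: Dict[int, Set[int]], k: int = 10) -> List[float]:
    """Per-user precision@k over the hidden test items (dict order fixes the pairing)."""
    return [precision_at_k(recommend(user), relevant, k)
            for user, relevant in test_items.items()]


def paired_sign_test(scores_a: List[float], scores_b: List[float]) -> float:
    """Two-sided sign-test p-value for paired per-user scores; ties are dropped."""
    wins_a = sum(a > b for a, b in zip(scores_a, scores_b))
    wins_b = sum(b > a for a, b in zip(scores_a, scores_b))
    n = wins_a + wins_b
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(max(wins_a, wins_b), n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    random.seed(0)
    catalog = list(range(100))
    # Hidden "relevant" items per user; in a real study these come from a train/test split.
    test_items = {user: set(random.sample(catalog, 5)) for user in range(200)}

    def rec_random(user: int) -> List[int]:
        """Baseline: recommend 10 items uniformly at random."""
        return random.sample(catalog, 10)

    def rec_biased(user: int) -> List[int]:
        """Toy 'better' system: cheats by peeking at two held-out items so the two
        candidates differ; a real recommender would only see training data."""
        return random.sample(sorted(test_items[user]), 2) + random.sample(catalog, 8)

    scores_a = evaluate(rec_random, test_items)
    scores_b = evaluate(rec_biased, test_items)
    print(f"precision@10  A={mean(scores_a):.3f}  B={mean(scores_b):.3f}  "
          f"sign-test p={paired_sign_test(scores_a, scores_b):.2e}")
```

Scoring each user separately is what makes a paired test possible; in practice one would replace the toy recommenders with real candidates trained on the non-held-out data and interpret the resulting significance level alongside the other properties the chapter discusses.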


Notes

  1. www.Netflix.com.
  2. www.amazon.com.
  3. https://www.icaps-conference.org/competitions/.
  4. A reference to their origins in signal detection theory.
  5. Not to be confused with trust in social network research, which measures how much one user believes another. Some literature on recommender systems uses such trust measurements to filter similar users [64].

References

  1. R. Bailey, Design of Comparative Experiments, vol. 25 (Cambridge University Press, Cambridge, 2008)
  2. D. Bamber, The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J. Math. Psychol. 12, 387–415 (1975)
  3. J. Beel, S. Langer, A comparison of offline evaluations, online evaluations, and user studies in the context of research-paper recommender systems, in International Conference on Theory and Practice of Digital Libraries (Springer, New York, 2015), pp. 153–168
  4. Y. Benjamini, Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B (Methodological) 57, 289–300 (1995)
  5. P.J. Bickel, K.A. Doksum, Mathematical Statistics: Ideas and Concepts (Holden-Day, San Francisco, 1977)
  6. M. Boland, Native ads will drive 74% of all ad revenue by 2021. Business Insider 14, 2016
  7. P. Bonhard, C. Harries, J. McCarthy, M.A. Sasse, Accounting for taste: using profile similarity to improve recommender systems, in CHI ’06: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, New York, NY (ACM, New York, 2006), pp. 1057–1066
  8. C. Boutilier, R.S. Zemel, Online queries for collaborative filtering, in Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics (2002)
  9. G.E.P. Box, W.G. Hunter, J.S. Hunter, Statistics for Experimenters (Wiley, New York, 1978)
  10. K. Bradley, B. Smyth, Improving recommendation diversity, in Twelfth Irish Conference on Artificial Intelligence and Cognitive Science (2001), pp. 85–94
  11. D. Braziunas, C. Boutilier, Local utility elicitation in GAI models, in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence, Edinburgh (2005), pp. 42–49
  12. J.S. Breese, D. Heckerman, C.M. Kadie, Empirical analysis of predictive algorithms for collaborative filtering, in UAI (1998)
  13. R. Burke, Evaluating the dynamic properties of recommendation algorithms, in Proceedings of the Fourth ACM Conference on Recommender Systems, RecSys ’10, New York (ACM, New York, 2010), pp. 225–228
  14. Ò. Celma, P. Herrera, A new approach to evaluating novel recommendations, in RecSys ’08: Proceedings of the 2008 ACM Conference on Recommender Systems, New York, NY (ACM, New York, 2008), pp. 179–186
  15. P.-A. Chirita, W. Nejdl, C. Zamfir, Preventing shilling attacks in online recommender systems, in WIDM ’05: Proceedings of the 7th Annual ACM International Workshop on Web Information and Data Management, New York, NY (ACM, New York, 2005), pp. 67–74
  16. H. Cramer, V. Evers, S. Ramlal, M. Someren, L. Rutledge, N. Stash, L. Aroyo, B. Wielinga, The effects of transparency on trust in and acceptance of a content-based art recommender. User Model. User-Adapt. Interact. 18(5), 455–496 (2008)
  17. P. Cremonesi, Y. Koren, R. Turrin, Performance of recommender algorithms on top-n recommendation tasks, in Proceedings of the Fourth ACM Conference on Recommender Systems (2010), pp. 39–46
  18. M.F. Dacrema, P. Cremonesi, D. Jannach, Are we really making much progress? A worrying analysis of recent neural recommendation approaches, in Proceedings of the 13th ACM Conference on Recommender Systems (2019), pp. 101–109
  19. A.S. Das, M. Datar, A. Garg, S. Rajaram, Google news personalization: scalable online collaborative filtering, in WWW ’07: Proceedings of the 16th International Conference on World Wide Web, New York, NY (ACM, New York, 2007), pp. 271–280
  20. O. Dekel, C.D. Manning, Y. Singer, Log-linear models for label ranking, in NIPS ’03 (2003)
  21. J. Demšar, Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
  22. M. Deshpande, G. Karypis, Item-based top-N recommendation algorithms. ACM Trans. Inf. Syst. 22(1), 143–177 (2004)
  23. G. Fischer, User modeling in human-computer interaction. User Model. User-Adapt. Interact. 11(1–2), 65–86 (2001)
  24. D.M. Fleder, K. Hosanagar, Recommender systems and their impact on sales diversity, in EC ’07: Proceedings of the 8th ACM Conference on Electronic Commerce, New York, NY (ACM, New York, 2007), pp. 192–199
  25. D. Frankowski, D. Cosley, S. Sen, L. Terveen, J. Riedl, You are what you say: privacy risks of public mentions, in SIGIR ’06: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, NY (ACM, New York, 2006), pp. 565–572
  26. G.A. Fredricks, R.B. Nelsen, On the relationship between Spearman’s rho and Kendall’s tau for pairs of continuous random variables. J. Stat. Plan. Infer. 137(7), 2143–2150 (2007)
  27. S. Frumerman, G. Shani, B. Shapira, O. Sar Shalom, Are all rejected recommendations equally bad? Towards analysing rejected recommendations, in Proceedings of the 27th ACM Conference on User Modeling, Adaptation and Personalization (2019), pp. 157–165
  28. Z. Gantner, S. Rendle, C. Freudenthaler, L. Schmidt-Thieme, MyMediaLite: a free recommender system library, in Proceedings of the Fifth ACM Conference on Recommender Systems (2011), pp. 305–308
  29. F. Garcin, B. Faltings, O. Donatsch, A. Alazzawi, C. Bruttin, A. Huber, Offline and online evaluation of news recommender systems at swissinfo, in Proceedings of the 8th ACM Conference on Recommender Systems (2014), pp. 169–176
  30. T. George, A scalable collaborative filtering framework based on co-clustering, in Fifth IEEE International Conference on Data Mining (2005), pp. 625–628
  31. A.G. Greenwald, Within-subjects designs: To use or not to use? Psychol. Bull. 83, 216–229 (1976)
  32. G. Guo, J. Zhang, Z. Sun, N. Yorke-Smith, LibRec: a Java library for recommender systems, in UMAP Workshops, vol. 4 (Citeseer, 2015)
  33. P. Haddawy, V. Ha, A. Restificar, B. Geisler, J. Miyamoto, Preference elicitation via theory refinement. J. Mach. Learn. Res. 4, 317–337 (2003)
  34. C. Hayes, P. Cunningham, An on-line evaluation framework for recommender systems. Technical report, Trinity College Dublin, Department of Computer Science (2002)
  35. X. He, J. Pan, O. Jin, T. Xu, B. Liu, T. Xu, Y. Shi, A. Atallah, R. Herbrich, S. Bowers, et al., Practical lessons from predicting clicks on ads at Facebook, in Proceedings of the Eighth International Workshop on Data Mining for Online Advertising (2014), pp. 1–9
  36. J.L. Herlocker, J.A. Konstan, J.T. Riedl, Explaining collaborative filtering recommendations, in CSCW ’00: Proceedings of the 2000 ACM Conference on Computer Supported Cooperative Work, New York, NY (ACM, New York, 2000), pp. 241–250
  37. J.L. Herlocker, J.A. Konstan, J.T. Riedl, An empirical analysis of design choices in neighborhood-based collaborative filtering algorithms. Inf. Retr. 5(4), 287–310 (2002). http://dx.doi.org/10.1023/A:1020443909834
  38. J.L. Herlocker, J.A. Konstan, L.G. Terveen, J.T. Riedl, Evaluating collaborative filtering recommender systems. ACM Trans. Inf. Syst. 22(1), 5–53 (2004). http://doi.acm.org/10.1145/963770.963772
  39. Y. Hijikata, T. Shimizu, S. Nishida, Discovery-oriented collaborative filtering for improving user satisfaction, in IUI ’09: Proceedings of the 13th International Conference on Intelligent User Interfaces, New York, NY (ACM, New York, 2009), pp. 67–76
  40. R. Hu, P. Pu, A comparative user study on rating vs. personality quiz based preference elicitation methods, in IUI ’09: Proceedings of the 13th International Conference on Intelligent User Interfaces, New York, NY (ACM, New York, 2009), pp. 367–372
  41. R. Hu, P. Pu, A comparative user study on rating vs. personality quiz based preference elicitation methods, in IUI (2009), pp. 367–372
  42. R. Hu, P. Pu, A study on user perception of personality-based recommender systems, in UMAP (2010), pp. 291–302
  43. N. Hug, Surprise: a Python library for recommender systems. J. Open Source Softw. 5(52), 2174 (2020)
  44. A. Iovine, F. Narducci, G. Semeraro, Conversational recommender systems and natural language: a study through the ConveRSE framework. Decis. Support Syst. 131, 113250 (2020)
  45. K. Järvelin, J. Kekäläinen, Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20(4), 422–446 (2002). http://doi.acm.org/10.1145/582415.582418
  46. N. Jones, P. Pu, User technology adoption issues in recommender systems, in Networking and Electronic Conference (2007)
  47. M. Jugovac, D. Jannach, M. Karimi, StreamingRec: a framework for benchmarking stream-based news recommenders, in Proceedings of the 12th ACM Conference on Recommender Systems (2018), pp. 269–273
  48. S. Jung, J.L. Herlocker, J. Webster, Click data as implicit relevance feedback in web search. Inf. Process. Manage. 43(3), 791–807 (2007)
  49. G. Karypis, Evaluation of item-based top-n recommendation algorithms, in CIKM ’01: Proceedings of the Tenth International Conference on Information and Knowledge Management, New York, NY (ACM, New York, 2001), pp. 247–254
  50. M.G. Kendall, A new measure of rank correlation. Biometrika 30(1–2), 81–93 (1938)
  51. M.G. Kendall, The treatment of ties in ranking problems. Biometrika 33(3), 239–251 (1945)
  52. R. Kohavi, R. Longbotham, D. Sommerfield, R.M. Henne, Controlled experiments on the web: survey and practical guide. Data Min. Knowl. Discov. 18(1), 140–181 (2009)
  53. R. Kohavi, A. Deng, B. Frasca, T. Walker, Y. Xu, N. Pohlmann, Online controlled experiments at large scale, in Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’13, New York, NY (ACM, New York, 2013), pp. 1168–1176
  54. J.A. Konstan, S.M. McNee, C.-N. Ziegler, R. Torres, N. Kapoor, J. Riedl, Lessons on applying automated recommender systems to information-seeking tasks, in AAAI (2006)
  55. Y. Koren, R. Bell, C. Volinsky, Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009)
  56. I. Koychev, I. Schwab, Adaptation to drifting user’s interests, in Proceedings of ECML2000 Workshop: Machine Learning in New Information Age (2000), pp. 39–46
  57. S.K. Lam, J. Riedl, Shilling recommender systems for fun and profit, in WWW ’04: Proceedings of the 13th International Conference on World Wide Web, New York, NY (ACM, New York, 2004), pp. 393–402
  58. S.K. Lam, D. Frankowski, J. Riedl, Do you trust your recommendations? An exploration of security and privacy issues in recommender systems, in Proceedings of the 2006 International Conference on Emerging Trends in Information and Communication Security (ETRICS) (2006)
  59. E.L. Lehmann, J.P. Romano, Testing Statistical Hypotheses, 3rd edn. Springer Texts in Statistics (Springer, New York, 2005)
  60. R. Lempel, Personalization is a two-way street, in Proceedings of the Eleventh ACM Conference on Recommender Systems (2017), pp. 3–3
  61. T. Mahmood, F. Ricci, Learning and adaptivity in interactive recommender systems, in ICEC ’07: Proceedings of the Ninth International Conference on Electronic Commerce, New York, NY (ACM, New York, 2007), pp. 75–84
  62. A. Maksai, F. Garcin, B. Faltings, Predicting online performance of news recommender systems through richer evaluation metrics, in Proceedings of the 9th ACM Conference on Recommender Systems (2015), pp. 179–186
  63. B.M. Marlin, R.S. Zemel, Collaborative prediction and ranking with non-random missing data, in Proceedings of the 2009 ACM Conference on Recommender Systems, RecSys 2009, New York, NY, October 23–25 (2009), pp. 5–12
  64. P. Massa, B. Bhattacharjee, Using trust in recommender systems: an experimental analysis, in Proceedings of iTrust2004 International Conference (2004), pp. 221–235
  65. M.R. McLaughlin, J.L. Herlocker, A collaborative filtering algorithm and evaluation metric that accurately model the user experience, in SIGIR ’04: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, NY (ACM, New York, 2004), pp. 329–336
  66. H.B. McMahan, G. Holt, D. Sculley, M. Young, D. Ebner, J. Grady, L. Nie, T. Phillips, E. Davydov, D. Golovin, et al., Ad click prediction: a view from the trenches, in Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2013), pp. 1222–1230
  67. S.M. McNee, J. Riedl, J.A. Konstan, Making recommendations better: an analytic model for human-recommender interaction, in CHI ’06 Extended Abstracts on Human Factors in Computing Systems, New York, NY (ACM, New York, 2006), pp. 1103–1108
  68. F. McSherry, I. Mironov, Differentially private recommender systems: building privacy into the Netflix Prize contenders, in KDD ’09: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY (ACM, New York, 2009), pp. 627–636
  69. B. Mobasher, R. Burke, R. Bhaumik, C. Williams, Toward trustworthy recommender systems: an analysis of attack models and algorithm robustness. ACM Trans. Internet Technol. 7(4), 23 (2007)
  70. T. Murakami, K. Mori, R. Orihara, Metrics for evaluating the serendipity of recommendation lists. New Front. Artif. Intell. 4914, 40–46 (2008)
  71. T.T. Nguyen, D. Kluver, T.-Y. Wang, P.-M. Hui, M.D. Ekstrand, M.C. Willemsen, J. Riedl, Rating support interfaces to improve user experience and recommender accuracy, in Proceedings of the 7th ACM Conference on Recommender Systems, RecSys ’13, New York, NY (ACM, New York, 2013), pp. 149–156
  72. M. O’Mahony, N. Hurley, N. Kushmerick, G. Silvestre, Collaborative recommendation: a robustness analysis. ACM Trans. Internet Technol. 4(4), 344–377 (2004)
  73. S.L. Pfleeger, B.A. Kitchenham, Principles of survey research. SIGSOFT Softw. Eng. Notes 26(6), 16–18 (2001)
  74. P. Pu, L. Chen, Trust building with explanation interfaces, in IUI ’06: Proceedings of the 11th International Conference on Intelligent User Interfaces, New York, NY (ACM, New York, 2006), pp. 93–100
  75. P. Pu, L. Chen, R. Hu, A user-centric evaluation framework for recommender systems, in Proceedings of the Fifth ACM Conference on Recommender Systems, RecSys ’11, New York, NY (ACM, New York, 2011), pp. 157–164
  76. P. Pu, L. Chen, R. Hu, A user-centric evaluation framework for recommender systems, in Proceedings of the Fifth ACM Conference on Recommender Systems (2011), pp. 157–164
  77. S. Queiroz, Adaptive preference elicitation for top-k recommendation tasks using GAI-networks, in AIAP ’07: Proceedings of the 25th IASTED International Multi-Conference, Anaheim, CA (ACTA Press, Calgary, 2007), pp. 579–584
  78. S. Rendle, C. Freudenthaler, Z. Gantner, L. Schmidt-Thieme, BPR: Bayesian personalized ranking from implicit feedback, in UAI ’09: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (2009)
  79. F. Ricci, Recommender systems in tourism, in Handbook of e-Tourism (Springer, Cham, 2020), pp. 1–18
  80. M. Rossetti, F. Stella, M. Zanker, Contrasting offline and online results when evaluating recommendation algorithms, in Proceedings of the 10th ACM Conference on Recommender Systems (2016), pp. 31–34
  81. M.L. Russell, D.G. Moralejo, E.D. Burgess, Paying research subjects: participants’ perspectives. J. Med. Ethics 26(2), 126–130 (2000)
  82. A. Said, A short history of the RecSys challenge. AI Mag. 37(4), 102–104 (2017)
  83. A. Said, A. Bellogín, Comparative recommender system evaluation: benchmarking recommendation frameworks, in Proceedings of the 8th ACM Conference on Recommender Systems (2014), pp. 129–136
  84. A. Said, A. Bellogín, Rival: a toolkit to foster reproducibility in recommender system evaluation, in Proceedings of the 8th ACM Conference on Recommender Systems (2014), pp. 371–372
  85. S.L. Salzberg, On comparing classifiers: pitfalls to avoid and a recommended approach. Data Min. Knowl. Discov. 1(3), 317–328 (1997)
  86. M.R. Santana, L.C. Melo, F.H.F. Camargo, B. Brandão, A. Soares, R.M. Oliveira, S. Caetano, Mars-gym: a gym framework to model, train, and evaluate recommender systems for marketplaces (2020). Preprint. arXiv:2010.07035
  87. B. Sarwar, G. Karypis, J. Konstan, J. Riedl, Analysis of recommendation algorithms for e-commerce, in EC ’00: Proceedings of the 2nd ACM Conference on Electronic Commerce, New York, NY (ACM, New York, 2000), pp. 158–167
  88. B. Sarwar, G. Karypis, J. Konstan, J. Riedl, Item-based collaborative filtering recommendation algorithms, in WWW ’01: Proceedings of the 10th International Conference on World Wide Web, New York, NY (ACM, New York, 2001), pp. 285–295
  89. B. Sarwar, G. Karypis, J. Konstan, J. Riedl, Item-based collaborative filtering recommendation algorithms, in Proceedings of the 10th International Conference on World Wide Web (2001), pp. 285–295
  90. A.I. Schein, A. Popescul, L.H. Ungar, D.M. Pennock, Methods and metrics for cold-start recommendations, in SIGIR ’02: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, NY (ACM, New York, 2002), pp. 253–260
  91. S. Sedhain, A.K. Menon, S. Sanner, L. Xie, AutoRec: autoencoders meet collaborative filtering, in Proceedings of the 24th International Conference on World Wide Web (2015), pp. 111–112
  92. G. Shani, D. Heckerman, R.I. Brafman, An MDP-based recommender system. J. Mach. Learn. Res. 6, 1265–1295 (2005)
  93. G. Shani, D.M. Chickering, C. Meek, Mining recommendations from the web, in RecSys ’08: Proceedings of the 2008 ACM Conference on Recommender Systems (2008), pp. 35–42
  94. G. Shani, L. Rokach, B. Shapira, S. Hadash, M. Tangi, Investigating confidence displays for top-n recommendations. JASIST 64(12), 2548–2563 (2013)
  95. N. Silberstein, O. Somekh, Y. Koren, M. Aharon, D. Porat, A. Shahar, T. Wu, Ad close mitigation for improved user experience in native advertisements, in Proceedings of the 13th International Conference on Web Search and Data Mining (2020), pp. 546–554
  96. B. Smyth, P. McClave, Similarity vs. diversity, in ICCBR (2001), pp. 347–361
  97. W.J. Spillman, E. Lang, The Law of Diminishing Returns (World Book Company, New York, 1924)
  98. H. Steck, Item popularity and recommendation accuracy, in Proceedings of the Fifth ACM Conference on Recommender Systems, RecSys ’11, New York, NY (ACM, New York, 2011), pp. 125–132
  99. H. Steck, Evaluation of recommendations: rating-prediction and ranking, in Seventh ACM Conference on Recommender Systems, RecSys ’13, Hong Kong, China, October 12–16 (2013), pp. 213–220
  100. K. Swearingen, R. Sinha, Beyond algorithms: an HCI perspective on recommender systems, in ACM SIGIR 2001 Workshop on Recommender Systems (2001)
  101. C.J. Van Rijsbergen, Information Retrieval (Butterworth-Heinemann, Newton, MA, 1979)
  102. E.M. Voorhees, The philosophy of information retrieval evaluation, in CLEF ’01: Revised Papers from the Second Workshop of the Cross-Language Evaluation Forum on Evaluation of Cross-Language Information Retrieval Systems (Springer, London, 2002), pp. 355–370
  103. E.M. Voorhees, Overview of TREC 2002, in Proceedings of the 11th Text Retrieval Conference (TREC 2002), NIST Special Publication 500-251 (2002), pp. 1–15
  104. Y.Y. Yao, Measuring retrieval effectiveness based on user preference of documents. J. Am. Soc. Inf. Syst. 46(2), 133–145 (1995)
  105. E. Yilmaz, J.A. Aslam, S. Robertson, A new rank correlation coefficient for information retrieval, in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’08, New York, NY (ACM, New York, 2008), pp. 587–594
  106. Y. Zeldes, S. Theodorakis, E. Solodnik, A. Rotman, G. Chamiel, D. Friedman, Deep density networks and uncertainty in recommender systems (2017). Preprint. arXiv:1711.02487
  107. M. Zhang, N. Hurley, Avoiding monotony: improving the diversity of recommendation lists, in RecSys ’08: Proceedings of the 2008 ACM Conference on Recommender Systems (ACM, New York, 2008), pp. 123–130
  108. S. Zhang, L. Yao, A. Sun, Y. Tay, Deep learning based recommender system: a survey and new perspectives. ACM Comput. Surv. (CSUR) 52(1), 1–38 (2019)
  109. Y. Zhang, J. Callan, T. Minka, Novelty and redundancy detection in adaptive filtering, in SIGIR ’02: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM, New York, 2002), pp. 81–88
  110. C.-N. Ziegler, S.M. McNee, J.A. Konstan, G. Lausen, Improving recommendation lists through topic diversification, in WWW ’05: Proceedings of the 14th International Conference on World Wide Web (ACM, New York, 2005), pp. 22–32

Author information

Correspondence to Guy Shani.

Copyright information

© 2022 Springer Science+Business Media, LLC, part of Springer Nature

About this chapter

Gunawardana, A., Shani, G., Yogev, S. (2022). Evaluating Recommender Systems. In: Ricci, F., Rokach, L., Shapira, B. (eds.) Recommender Systems Handbook. Springer, New York, NY. https://doi.org/10.1007/978-1-0716-2197-4_15

  • Print ISBN: 978-1-0716-2196-7
  • Online ISBN: 978-1-0716-2197-4
