Data quality in recommender systems: the impact of completeness of item content data on prediction accuracy of recommender systems

Heinrich, Bernd; Hopf, Marcus; Lohninger, Daniel; Schiller, Alexander; Szubartowicz, Michael

doi:10.1007/s12525-019-00366-7

Research Paper
Published: 29 August 2019

Electronic Markets, Volume 31, pages 389–409 (2021)

Bernd Heinrich ORCID: orcid.org/0000-0003-4193-0100¹,
Marcus Hopf¹,
Daniel Lohninger¹,
Alexander Schiller¹ &
Michael Szubartowicz¹

Abstract

Recommender systems strive to guide users, especially in the field of e-commerce, to their individually best choice when a large number of alternatives is available. In general, the literature suggests that the quality of the data on which a recommender system is based may have an important impact on recommendation quality. In this paper, we focus on the data quality dimension completeness of item content data (i.e., features of items and their feature values) and investigate its impact on the prediction accuracy of recommender systems. In particular, we examine the increase in completeness per item, per user and per feature as moderators for this impact. To this end, we present a theoretical model based on the literature and derive ten hypotheses. We test these hypotheses on two real-world data sets, one from two leading web portals for restaurant reviews and another one from a movie review portal. The results strongly support that, in general, the prediction accuracy is positively influenced by increased completeness. However, the results also reveal, contrary to existing literature, that, among other findings, increasing completeness by adding features which differ significantly from already existing features (i.e., a high diversity) does not positively influence the prediction accuracy of recommender systems.
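
To make the notion of completeness of item content data more concrete, the following is a minimal sketch that assumes completeness is measured as the share of non-missing feature values. This ratio is a common convention and is not necessarily the metric used in the paper, which is not shown in this preview. The function names, feature names, and sample restaurant records are hypothetical.

```python
from typing import Dict, List, Optional

# An item maps feature names to values; None or "" marks a missing value.
Item = Dict[str, Optional[str]]


def is_filled(value: Optional[str]) -> bool:
    """Treat None and the empty string as missing (an assumption)."""
    return value not in (None, "")


def completeness_per_item(item: Item, features: List[str]) -> float:
    """Share of the given features that have a value for one item."""
    if not features:
        return 0.0
    filled = sum(1 for f in features if is_filled(item.get(f)))
    return filled / len(features)


def completeness_per_feature(items: List[Item], feature: str) -> float:
    """Share of items that have a value for one feature."""
    if not items:
        return 0.0
    filled = sum(1 for it in items if is_filled(it.get(feature)))
    return filled / len(items)


# Hypothetical item content data for two restaurants.
features = ["cuisine", "price_range", "parking", "wifi"]
items: List[Item] = [
    {"cuisine": "italian", "price_range": None, "parking": "yes"},
    {"cuisine": "thai", "price_range": "low", "parking": None},
]

for item in items:
    print(item["cuisine"], completeness_per_item(item, features))
for feature in features:
    print(feature, completeness_per_feature(items, feature))
```

Under this assumed definition, adding a value for a previously empty feature raises the per-item and per-feature ratios, which corresponds to the kind of increase in completeness whose effect on prediction accuracy the abstract describes.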

Author information

Authors and Affiliations

Department of Management Information Systems, University of Regensburg, Universitätsstraße 31, 93053, Regensburg, Germany
Bernd Heinrich, Marcus Hopf, Daniel Lohninger, Alexander Schiller & Michael Szubartowicz

Corresponding author

Correspondence to Bernd Heinrich.

Additional information

Responsible Editor: Steven Bellman

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Heinrich, B., Hopf, M., Lohninger, D. et al. Data quality in recommender systems: the impact of completeness of item content data on prediction accuracy of recommender systems. Electron Markets 31, 389–409 (2021). https://doi.org/10.1007/s12525-019-00366-7

Received: 24 October 2018
Accepted: 09 August 2019
Published: 29 August 2019
Issue Date: June 2021
DOI: https://doi.org/10.1007/s12525-019-00366-7
