A decision-analytic framework for interpretable recommendation systems with multiple input data sources: a case study for a European e-tailer

  • S.I.: Business Analytics and Operations Research
Annals of Operations Research

Abstract

Recommendation systems help companies construct online personalization strategies for customers who are often overwhelmed by the abundance of product choices available. To extend existing operations research literature on recommendation systems, this article proposes a decision-analytic framework for interpretable recommendation systems with multiple input data sources for e-commerce settings. The impact of multiple data sources on recommendation performance is investigated, and two hybridization data fusion strategies, i.e., a posteriori weighting and input data source combination using factorization machines, are benchmarked. Furthermore, a new importance score mechanism is introduced to provide insight into the input data sources’ and underlying variables’ impact on recommendation performance. The framework is empirically validated on 164,338 customers and 51,367 products across eight real-life data sets with four input data sources (product, customer, raw behavioral, and aggregated behavioral data) obtained from a large European e-commerce company. With this new decision-analytic framework, e-commerce companies can open their recommendation system’s black box and identify the most predictive input data sources and the best hybridization strategy for their business context.
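As an illustration of the first hybridization strategy named above, a posteriori weighting can be read as blending the recommendation scores produced separately from each input data source. The following is a minimal sketch under that reading, not the authors' implementation: the function names, the source labels and the weights are hypothetical, and the abstract does not specify how the weights are actually chosen.

```python
import numpy as np


def a_posteriori_weighting(scores_by_source, weights):
    """Blend per-source score matrices (customers x products) by a weighted sum.

    Both arguments are dicts keyed by a (hypothetical) data source name.
    The weights are normalised so that they sum to 1.
    """
    names = sorted(scores_by_source)
    w = np.array([weights[name] for name in names], dtype=float)
    w = w / w.sum()
    stacked = np.stack(
        [np.asarray(scores_by_source[name], dtype=float) for name in names]
    )
    # Contract the source axis: (sources,) x (sources, customers, products)
    return np.tensordot(w, stacked, axes=1)


def top_n(scores, n=5):
    """Return the indices of the n highest-scoring products per customer."""
    return np.argsort(-scores, axis=1, kind="stable")[:, :n]


# Toy example: one customer, three products, two assumed data sources.
scores = {
    "product": [[0.9, 0.1, 0.5]],
    "behavioral": [[0.1, 0.9, 0.5]],
}
blended = a_posteriori_weighting(scores, {"product": 0.25, "behavioral": 0.75})
recommended = top_n(blended, n=2)
```

In this toy case the behavioral source carries three times the weight of the product source, so the second product ranks first in the blended list. In practice the weights would have to be tuned on held-out data.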

Author information

Corresponding author

Correspondence to K. Coussement.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Appendix

Fig. 5. Post-hoc test results for precision@5 on RQ1

Fig. 6. Post-hoc test results for recall@5 on RQ1

Fig. 7. Post-hoc test results for F1@5 on RQ2

Fig. 8. Post-hoc test results for precision@5 on RQ2

Fig. 9. Post-hoc test results for recall@5 on RQ2

Fig. 10. Post-hoc test results for F1@5 on RQ3 with 2 input data sources

Fig. 11. Post-hoc test results for precision@5 on RQ3 with 2 input data sources

Fig. 12. Post-hoc test results for recall@5 on RQ3 with 2 input data sources

Fig. 13. Friedman test results for F1@5 on RQ3 with 3 input data sources

Fig. 14. Friedman test results for precision@5 on RQ3 with 3 input data sources

Fig. 15. Friedman test results for recall@5 on RQ3 with 3 input data sources

Fig. 16. Friedman test results for F1@5 on RQ4

Fig. 17. Friedman test results for precision@5 on RQ4

Fig. 18. Friedman test results for recall@5 on RQ4

Fig. 19. Aggregated importance scores per input data source

Fig. 20. Importance scores per variable

Fig. 21. Importance scores per variable
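
The appendix captions report results for precision@5, recall@5 and F1@5. For readers unfamiliar with these top-N measures, the sketch below shows one common way to compute them for a single customer. It assumes the standard definitions (hits divided by k for precision, hits divided by the number of purchased items for recall); the preview does not confirm the exact variant used in the study, and the function name and item identifiers are hypothetical.

```python
def precision_recall_f1_at_k(recommended, purchased, k=5):
    """Compute precision@k, recall@k and F1@k for one customer.

    recommended: ranked list of product ids, best first, without duplicates.
    purchased:   collection of product ids the customer actually bought.
    """
    top_k = list(recommended)[:k]
    relevant = set(purchased)
    hits = sum(1 for item in top_k if item in relevant)

    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy example: 2 of the 5 recommended items were bought, out of 4 purchases.
p, r, f1 = precision_recall_f1_at_k(
    ["a", "b", "c", "d", "e"], {"b", "e", "y", "z"}, k=5
)
```

Averaging these per-customer values over all customers in a data set gives one score per method and data set, which is the kind of input a Friedman test with post-hoc comparisons operates on.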

Cite this article

Coussement, K., De Bock, K.W. & Geuens, S. A decision-analytic framework for interpretable recommendation systems with multiple input data sources: a case study for a European e-tailer. Ann Oper Res 315, 671–694 (2022). https://doi.org/10.1007/s10479-021-03979-4

  • DOI: https://doi.org/10.1007/s10479-021-03979-4
