Mend the Learning Approach, Not the Data: Insights for Ranking E-Commerce Products

  • Conference paper
  • In: Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track (ECML PKDD 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12461)


Abstract

Improved search quality enhances users’ satisfaction, which directly impacts the sales growth of an E-Commerce (E-Com) platform. Traditional Learning to Rank (LTR) algorithms require relevance judgments on products, and obtaining such judgments in E-Com poses an immense challenge. In the literature, it is proposed to employ user feedback (such as clicks, add-to-basket (AtB) clicks and orders) to generate relevance judgments. This is done in two steps: first, query-product pair data are aggregated from the logs, and then statistics such as the order rate are calculated for each pair. In this paper, we advocate a counterfactual risk minimization (CRM) approach which circumvents the need for relevance judgments and data aggregation, and is better suited to learning from logged data, i.e. contextual bandit feedback. Due to the unavailability of a public E-Com LTR dataset, we provide the Mercateo dataset from our platform. It contains more than 10 million AtB click logs and 1 million order logs from a catalogue of about 3.5 million products associated with 3060 queries. To the best of our knowledge, this is the first work which examines the effectiveness of the CRM approach in learning a ranking model from real-world logged data. Our empirical evaluation shows that our CRM approach learns effectively from logged data and beats a strong baseline ranker (\(\lambda\)-MART) by a large margin. Our method also outperforms full-information losses (e.g. cross-entropy) on various deep neural network models. These findings demonstrate that by adopting the CRM approach, E-Com platforms can achieve better product search quality compared to the full-information approach.

Notes

  1. https://www.statista.com/statistics/379046/.

  2. Available at: https://github.com/ecom-research/CRM-LTR.

  3. Available at: https://github.com/usnistgov/trec_eval.

  4. Available at: https://sourceforge.net/p/lemur/wiki/RankLib/.

References

  1. Agrawal, R., Halverson, A., Kenthapadi, K., Mishra, N., Tsaparas, P.: Generating labels from clicks. In: WSDM 2009, pp. 172–181. ACM (2009). https://doi.org/10.1145/1498759.1498824

  2. Bendersky, M., Wang, X., Najork, M., Metzler, D.: Learning with sparse and biased feedback for personal search. In: IJCAI 2018, pp. 5219–5223. AAAI Press (2018)

  3. Bi, K., Teo, C.H., Dattatreya, Y., Mohan, V., Croft, W.B.: Leverage implicit feedback for context-aware product search. In: eCOM@SIGIR (2019)

  4. Borisov, A., Kiseleva, J., Markov, I., de Rijke, M.: Calibration: a simple way to improve click models. In: CIKM 2018 (2018)

  5. Brenner, E.P., Zhao, J., Kutiyanawala, A., Yan, Z.: End-to-end neural ranking for ecommerce product search. In: SIGIR eCom, vol. 18 (2018)

  6. Chapelle, O., Chang, Y.: Yahoo! Learning to rank challenge overview. In: Proceedings of the Learning to Rank Challenge, pp. 1–24 (2011)

  7. Chen, D.: Data mining for the online retail industry: a case study of RFM model-based customer segmentation using data mining. J. Database Market. Customer Strategy Manag. 19(3), 197–208 (2012). https://doi.org/10.1057/dbm.2012.17

  8. Dai, Z., Xiong, C., Callan, J., Liu, Z.: Convolutional neural networks for soft-matching N-grams in ad-hoc search. In: WSDM 2018, pp. 126–134. ACM, New York (2018). https://doi.org/10.1145/3159652.3159659

  9. Dheeru, D., Taniskidou, E.: UCI machine learning repository (2017)

  10. Alonso, O., et al.: Relevance criteria for e-commerce: a crowdsourcing-based experimental analysis. In: SIGIR 2009, pp. 760–761. ACM (2009)

  11. Guo, J., Fan, Y., Ji, X., Cheng, X.: MatchZoo: a learning, practicing, and developing system for neural text matching. In: SIGIR 2019 (2019). https://doi.org/10.1145/3331184.3331403

  12. Hu, Y., Da, Q., Zeng, A., Yu, Y., Xu, Y.: Reinforcement learning to rank in e-commerce search engine: formalization, analysis, and application. In: KDD 2018, New York, NY, USA (2018). https://doi.org/10.1145/3219819.3219846

  13. Jiang, S., et al.: Learning query and document relevance from a web-scale click graph. In: SIGIR 2016 (2016)

  14. Joachims, T.: Optimizing search engines using clickthrough data. In: KDD 2002. ACM (2002). https://doi.org/10.1145/775047.775067

  15. Joachims, T., Granka, L., Pan, B., Hembrooke, H., Radlinski, F., Gay, G.: Evaluating the accuracy of implicit feedback from clicks and query reformulations in web search. ACM Trans. Inf. Syst. 25(2), 7-es (2007). https://doi.org/10.1145/1229179.1229181

  16. Joachims, T., Swaminathan, A., de Rijke, M.: Deep learning with logged bandit feedback. In: ICLR 2018 (2018)

  17. Joachims, T., Swaminathan, A., Schnabel, T.: Unbiased learning-to-rank with biased feedback. In: WSDM 2017. ACM (2017). https://doi.org/10.1145/3018661.3018699

  18. Lucchese, C., Nardini, F.M., Orlando, S., Perego, R., Tonellotto, N.: Speeding up document ranking with rank-based features. In: SIGIR 2015, New York, NY, USA (2015). https://doi.org/10.1145/2766462.2767776

  19. Mitra, B., Diaz, F., Craswell, N.: Learning to match using local and distributed representations of text for web search. CoRR (2016)

  20. Pang, L., Lan, Y., Guo, J., Xu, J., Xu, J., Cheng, X.: DeepRank: a new deep architecture for relevance ranking in information retrieval. CoRR abs/1710.05649 (2017)

  21. Qi, Y., Wu, Q., Wang, H., Tang, J., Sun, M.: Bandit learning with implicit feedback. In: NIPS 2018, pp. 7287–7297. Curran Associates Inc., Red Hook (2018)

  22. Qin, T., Liu, T.Y., Xu, J., Li, H.: LETOR: a benchmark collection for research on learning to rank for information retrieval. Inf. Retrieval 13(4), 346–374 (2010). https://doi.org/10.1007/s10791-009-9123-y

  23. Santu, S.K.K., Sondhi, P., Zhai, C.: On application of learning to rank for e-commerce search. In: SIGIR 2017 (2017)

  24. Schuth, A., Hofmann, K., Whiteson, S., de Rijke, M.: Lerot: an online learning to rank framework. In: Proceedings of the 2013 Workshop on Living Labs for Information Retrieval Evaluation, pp. 23–26. ACM (2013)

  25. Severyn, A., Moschitti, A.: Learning to rank short text pairs with convolutional deep neural networks. In: SIGIR 2015, pp. 373–382. ACM, New York (2015)

  26. Sidana, S., Laclau, C., Amini, M.R., Vandelle, G., Bois-Crettez, A.: KASANDR: a large-scale dataset with implicit feedback for recommendation. In: SIGIR 2017, pp. 1245–1248 (2017)

  27. Swaminathan, A., Joachims, T.: Batch learning from logged bandit feedback through counterfactual risk minimization. JMLR 16, 1731–1755 (2015)

  28. Swaminathan, A., Joachims, T.: The self-normalized estimator for counterfactual learning. In: NIPS 2015, pp. 3231–3239. MIT Press, Cambridge (2015)

  29. Wan, S., Lan, Y., Guo, J., Xu, J., Pang, L., Cheng, X.: A deep architecture for semantic matching with multiple positional sentence representations. CoRR abs/1511.08277 (2015). http://arxiv.org/abs/1511.08277

  30. Wu, Q., Burges, C.J., Svore, K.M., Gao, J.: Adapting boosting for information retrieval measures. Inf. Retrieval 13(3), 254–270 (2010)

  31. Xu, J., Li, H.: AdaRank: a boosting algorithm for information retrieval. In: SIGIR 2007, pp. 391–398. ACM, New York (2007)

  32. Yang, Z., et al.: A deep top-K relevance matching model for ad-hoc retrieval. In: Zhang, S., Liu, T.-Y., Li, X., Guo, J., Li, C. (eds.) CCIR 2018. LNCS, vol. 11168, pp. 16–27. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01012-6_2


Acknowledgments

We would like to thank Alan Schelten, Till Brychcy and Rudolf Sailer for insightful discussions which helped in improving the quality of this work. This work has been supported by the Bavarian Ministry of Economic Affairs, Regional Development and Energy through the WoWNet project IUK-1902-003// IUK625/002.

Author information

Corresponding author

Correspondence to Muhammad Umer Anwaar.


Appendices

A Comparison of Counterfactual Risk Estimators

We compare the performance of the SNIPS estimator with two baseline estimators of the counterfactual risk. We conduct the experiments on the AtB click training data of the Mercateo dataset. The inverse propensity scoring (IPS) estimator is calculated as:

$$\begin{aligned} \hat{R}_{IPS}(\pi_w) = \frac{1}{n}\sum_{i=1}^n \delta_i \, \frac{\pi_w(a_i|c_i)}{\pi_0(a_i|c_i)}. \end{aligned}$$
(4)

The second estimator is the empirical average (EA) estimator, defined as follows:

$$\begin{aligned} \hat{R}_{EA}(\pi_w) = \sum_{(c,a) \in (\mathcal{C},\mathcal{A})} \overline{\delta}(c,a) \, \pi_w(a|c), \end{aligned}$$
(5)

where \(\overline{\delta}(c,a)\) is the empirical average of the losses for a given context-action pair. The results for these estimators are reported in Table 5. Compared to SNIPS, both IPS and EA perform significantly worse on all evaluated metrics. These results confirm the importance of the equivariance of the counterfactual estimator and show the advantages of the SNIPS estimator.

Table 5. Results on Mercateo dataset with AtB click relevance for IPS and empirical average estimators
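
For concreteness, the three estimators compared here can be written in a few lines of NumPy. This is a minimal sketch under our own naming, not the released implementation: `delta`, `p_new` and `p_log` are assumed to be per-example arrays of logged losses and propensities, and `delta_bar`/`p_new_pairs` the per-pair quantities used by EA.

```python
import numpy as np

def ips_estimate(delta, p_new, p_log):
    """IPS estimator of Eq. (4): mean logged loss reweighted by the
    importance weights pi_w(a_i|c_i) / pi_0(a_i|c_i)."""
    w = p_new / p_log
    return np.mean(delta * w)

def snips_estimate(delta, p_new, p_log):
    """Self-normalized IPS (SNIPS): the importance-weighted loss
    normalized by the sum of the weights, which makes the estimate
    invariant to additive shifts of the loss (equivariance)."""
    w = p_new / p_log
    return np.sum(delta * w) / np.sum(w)

def ea_estimate(delta_bar, p_new_pairs):
    """Empirical-average estimator of Eq. (5): the average logged loss
    of each distinct (context, action) pair, weighted by the new
    policy's probability of taking that action."""
    return np.sum(delta_bar * p_new_pairs)
```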
Fig. 3. SNIPS denominator vs \(\lambda\) on order logs (training set)

Fig. 4. Performance on the orders test set of rankers trained with different \(\lambda\)

B Choosing Hyperparameter \(\lambda \)

One major drawback of the SNIPS estimator is that, being a ratio estimator, it does not admit direct stochastic optimization [16]. Given the success of stochastic gradient descent (SGD) training of deep neural networks in related applications, this is quite disadvantageous, as one cannot employ SGD for training.

To overcome this limitation, Joachims et al. [16] fix the value of the denominator in Eq. 3. They denote the denominator by S and solve multiple constrained optimization problems for different values of S. Each of these problems can be reformulated via the Lagrangian of the constrained optimization problem as:

$$\begin{aligned} \hat{w}_j = \mathop{\text{argmin}}\limits_{w} \, \frac{1}{n} \sum_{i=1}^n (\delta_i - \lambda_j) \, \frac{\pi_w(a_i|c_i)}{\pi_0(a_i|c_i)}, \end{aligned}$$
(6)

where \(\lambda_j\) corresponds to a fixed denominator \(S_j\).
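
Because each subproblem in Eq. 6 is a plain importance-weighted average, it can be optimized with SGD. Below is a minimal sketch of the corresponding batch loss in PyTorch, with our own naming (the paper's code may differ):

```python
import torch

def crm_batch_loss(delta, log_p_new, log_p_log, lam):
    """Monte-Carlo estimate of the objective in Eq. (6): the mean of
    (delta_i - lambda) * pi_w(a_i|c_i) / pi_0(a_i|c_i) over the batch.
    Propensities enter in log-space for numerical stability; the
    logging propensities are constants read from the logs, so
    gradients flow only through log_p_new (the model under training)."""
    weights = torch.exp(log_p_new - log_p_log.detach())
    return torch.mean((delta - lam) * weights)
```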

The main difficulty in applying the CRM method to learn from logged data is the need to choose the hyperparameter \(\lambda\). We discuss below our heuristic for selecting it. We also evaluate the dependence of \(\lambda\) on the SNIPS denominator S, which can be used to guide the search for \(\lambda\). To achieve good performance with the CRM loss, one has to tune the hyperparameter \(\lambda \in [0, 1]\). Instead of doing a grid search, we follow a smarter way to find a suitable \(\lambda\). Building on the observations in [16], we can guide the search for \(\lambda\) based on the value of the SNIPS denominator S. It was shown in [16] that the value of S increases monotonically as \(\lambda\) is increased. Secondly, it is straightforward to note that the expectation of S is 1. This implies that, as the amount of bandit feedback grows, the optimal \(\lambda\) should be selected such that its corresponding value of S concentrates around 1. In our experiments, we first select a random \(\lambda \in [0, 1]\) and train the model for two epochs with this \(\lambda\). We then calculate S for the trained model; if S is greater than 1, we decrease \(\lambda\) by 10%, otherwise we increase it by 10%. The final value of \(\lambda\) is chosen based on the best performance on the validation set.
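
The whole heuristic fits in a short loop. The sketch below assumes hypothetical helper callables (`train_for_two_epochs`, `snips_denominator`, `validate`); only the 10% adjustment rule and the validation-based selection come from the procedure described above.

```python
def search_lambda(train_for_two_epochs, snips_denominator, validate,
                  lam=0.5, rounds=8):
    """Guided search for lambda: train briefly, inspect the SNIPS
    denominator S on the training logs, and nudge lambda by 10% so
    that S moves toward its expectation of 1. Returns the lambda with
    the best validation performance among the trials."""
    trials = []
    for _ in range(rounds):
        model = train_for_two_epochs(lam)   # two epochs with current lambda
        S = snips_denominator(model)        # S computed on the training logs
        trials.append((validate(model), lam))
        # S increases monotonically with lambda, so S > 1 means decrease.
        lam = lam * 0.9 if S > 1.0 else lam * 1.1
    return max(trials)[1]
```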

In Fig. 3, we plot the values of the denominator S on the order logs (training set) of the Mercateo dataset for different values of the hyperparameter \(\lambda\). In Fig. 4, we plot the performance on the orders test set, in terms of MAP and NDCG@5 scores, of rankers trained with these values of \(\lambda\). Note that the values of the SNIPS denominator S increase monotonically with increasing \(\lambda\). MAP and NDCG@5 reach their highest values for \(\lambda = 0.4\) and decrease only slightly for larger values of \(\lambda\). Furthermore, these two figures show that the \(\lambda\) values with good performance on the test set have corresponding SNIPS denominator values close to 1.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Anwaar, M.U., Rybalko, D., Kleinsteuber, M. (2021). Mend the Learning Approach, Not the Data: Insights for Ranking E-Commerce Products. In: Dong, Y., Ifrim, G., Mladenić, D., Saunders, C., Van Hoecke, S. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12461. Springer, Cham. https://doi.org/10.1007/978-3-030-67670-4_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-67670-4_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67669-8

  • Online ISBN: 978-3-030-67670-4

  • eBook Packages: Computer Science, Computer Science (R0)
