Abstract
E-Commerce (E-Com) search is an emerging problem that poses several new challenges. A primary challenge is optimizing search for relevance and revenue while simultaneously maintaining a discovery strategy: the system must systematically “discover” promising items in the inventory that have not received sufficient exposure in search results, while minimizing the resulting loss of relevance and revenue. To this end, we develop a formal framework for optimizing E-Com search and propose a novel epsilon-explore Learning to Rank (eLTR) paradigm that can be integrated with the traditional learning to rank (LTR) framework to explore new or under-exposed items. The key idea is to decompose the ranking function into (1) a function of content-based features and (2) a function of behavioral features, and to introduce a parameter epsilon that regulates their relative contributions. We further propose novel eLTR-based algorithms that improve on the traditional LTR used in current E-Com search engines by “forcing” exploration of a fixed number of items while limiting the drop in relevance. We also show that the eLTR objective is monotone submodular, which allows us to design a greedy approximation algorithm with a theoretical guarantee. In experiments on synthetic data, we compare eLTR with a random-selection baseline and an upper confidence bound (UCB) based exploration strategy, and show that eLTR is an efficient algorithm for such exploration. We expect the formalization presented in this paper to lead to new research on ranking problems for E-Com marketplaces.
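The two ideas in the abstract can be sketched in a few lines. The sketch below assumes a simple linear blend of the two score components and the standard greedy rule for monotone submodular maximization; `eltr_score`, `greedy_explore`, and the marginal-gain callback are illustrative names and forms, not the paper's actual decomposition or API.

```python
def eltr_score(content_score, behavioral_score, eps):
    """Blend content-based and behavioral signals.

    eps regulates their relative contributions: eps = 0 reduces to a
    purely behavioral ranker, eps = 1 to a purely content-based one.
    (Hypothetical linear form; the paper's exact decomposition may differ.)
    """
    return eps * content_score + (1.0 - eps) * behavioral_score


def greedy_explore(items, k, gain):
    """Greedily pick k items to 'force' into exposure.

    gain(selected, item) is the marginal gain of adding `item` to the
    already-selected set. For a monotone submodular objective this greedy
    rule achieves the classic (1 - 1/e) approximation guarantee.
    """
    selected = []
    candidates = list(items)
    for _ in range(min(k, len(candidates))):
        best = max(candidates, key=lambda it: gain(selected, it))
        selected.append(best)
        candidates.remove(best)
    return selected
```

For example, `eltr_score(1.0, 0.0, 0.25)` gives 0.25, and with a toy marginal gain that simply returns an item's own value, `greedy_explore` picks the top-k items in decreasing order.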
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Goswami, A., Zhai, C., Mohapatra, P. (2018). Learning to Rank and Discover for E-Commerce Search. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2018. Lecture Notes in Computer Science(), vol 10935. Springer, Cham. https://doi.org/10.1007/978-3-319-96133-0_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96132-3
Online ISBN: 978-3-319-96133-0
eBook Packages: Computer Science, Computer Science (R0)