ABSTRACT
With the explosive growth of e-commerce for services, tens of millions of orders are generated every day on the Meituan platform. By allocating bonuses to new customers when they pay, the platform encourages them to adopt its own payment service for a better experience in the future. This bonus allocation can be formulated as a multi-choice knapsack problem (MCKP), and the mainstream solution is a two-stage method. The first stage, user intent detection, predicts the effect of each bonus treatment. These predictions then serve as the objective of the MCKP, which is solved in the second stage to obtain the optimal allocation strategy. However, this solution usually faces three challenges: (1) In the user intent detection stage, because interactions are sparse and noisy, traditional multi-treatment effect estimation methods lack interpretability and may violate the domain knowledge from economic theory that the marginal gain is non-negative as the bonus amount increases. (2) There is an optimality gap between the two stages, which limits the upper bound of the optimal value obtained in the second stage. (3) Because the distribution of online orders changes over time, the actual cost often violates the given budget limit. To address these challenges, we propose a framework that consists of three modules: the User Intent Detection Module, the Online Allocation Module, and the Feedback Control Module. In the User Intent Detection Module, we implicitly model the treatment increment with deep representation learning and constrain it to be non-negative, which enforces monotonicity. To reduce the optimality gap, we further propose a convex constrained model that increases the upper bound of the optimal value. To cope with fluctuations in online bonus consumption, we add a feedback control strategy to the framework so that the actual cost tracks the given budget limit more accurately. Finally, we conduct extensive offline and online experiments that demonstrate the superiority of the proposed framework, which reduced customer acquisition costs by 5.07% and is still running online.
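As a rough illustration of how the three modules described in the abstract might fit together, the sketch below builds a monotone response curve from non-negative increments, picks one bonus level per user with a Lagrangian-style rule for the MCKP, and nudges the multiplier with a proportional feedback step when spend drifts from the budget. Everything here is an assumption made for illustration: the function names, the softplus parameterization, the argmax allocation rule, the proportional controller, and the gain `kp` are hypothetical and are not taken from the paper.

```python
import numpy as np


def monotone_response(base_logit, raw_increments):
    """Conversion probability per bonus level, non-decreasing in the level.

    base_logit: shape (n_users,), logit for the zero-bonus level.
    raw_increments: shape (n_users, n_levels - 1), unconstrained scores.
    Assumption: softplus makes each increment positive, so the cumulative
    sum (and the sigmoid of it) grows with the bonus level.
    """
    increments = np.logaddexp(0.0, raw_increments)  # softplus(x) > 0
    stacked = np.concatenate(
        [base_logit[:, None], base_logit[:, None] + np.cumsum(increments, axis=1)],
        axis=1,
    )
    return 1.0 / (1.0 + np.exp(-stacked))


def allocate(response, costs, lam):
    """Pick one bonus level per user by maximizing response - lam * cost.

    This is the usual Lagrangian relaxation of a multi-choice knapsack:
    lam is the price of one unit of budget (hypothetical rule).
    """
    scores = response - lam * costs[None, :]
    return np.argmax(scores, axis=1)


def update_multiplier(lam, spent, target, kp=0.01):
    """Proportional feedback step: raise lam when overspending, lower it
    when underspending, and never let it go below zero."""
    error = (spent - target) / target
    return max(0.0, lam + kp * error)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    costs = np.array([0.0, 1.0, 2.0, 3.0])  # hypothetical bonus amounts
    response = monotone_response(
        rng.normal(-1.0, 0.5, size=1000), rng.normal(-2.0, 1.0, size=(1000, 3))
    )
    lam, budget_per_batch = 0.0, 500.0  # hypothetical budget per batch
    for _ in range(20):
        choice = allocate(response, costs, lam)
        lam = update_multiplier(lam, costs[choice].sum(), budget_per_batch)
    print("final multiplier:", lam)
```

Because each user chooses among discrete bonus levels, a purely proportional step like this one can oscillate around the budget rather than settle on it; a production controller would likely need tuned gains and smoothing.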
Index Terms
- A Multi-stage Framework for Online Bonus Allocation Based on Constrained User Intent Detection