Relation-aware Meta-learning for E-commerce Market Segment Demand Prediction with Limited Records

Published: 08 March 2021

Abstract

E-commerce business is revolutionizing our shopping experiences by providing convenient and straightforward services. One of the most fundamental problems is how to balance demand and supply in market segments to build an efficient platform. While conventional machine learning models have achieved great success on data-sufficient segments, they may fail in a large portion of segments on E-commerce platforms, where records are too few to train a model well. In this paper, we tackle this problem in the context of market segment demand prediction. The goal is to facilitate the learning process in the target segments by leveraging the knowledge learned from data-sufficient source segments. Specifically, we propose a novel algorithm, RMLDP, which incorporates a multi-pattern fusion network (MPFN) into a meta-learning paradigm. The multi-pattern fusion network considers both local and seasonal temporal patterns for segment demand prediction. In the meta-learning paradigm, transferable knowledge is regarded as the model parameter initialization of MPFN, which is learned from diverse source segments. Furthermore, we capture segment relations by combining a data-driven segment representation with a segment knowledge graph representation, and we tailor the segment-specific relations to customize the transferable model parameter initialization. Thus, even with limited data, the target segment can quickly find the most relevant transferred knowledge and adapt to the optimal parameters. We conduct extensive experiments on two large-scale industrial datasets. The results show that RMLDP outperforms a set of state-of-the-art baselines. In addition, RMLDP has been deployed in Taobao, a real-world E-commerce platform. The online A/B testing results further demonstrate the practicality of RMLDP.
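The idea of treating transferable knowledge as a parameter initialization can be illustrated with a minimal sketch. It is not the RMLDP algorithm: it uses a generic first-order approximation of model-agnostic meta-learning, a two-parameter linear model in place of MPFN, and no segment relations or knowledge graph. The toy segments (lines with slope and offset near 2 and 1), the function names, and all learning rates are assumptions made for illustration.

```python
import random

def mse_grad(params, data):
    """Return (loss, gradient) of the mean squared error for y ~ w*x + b."""
    w, b = params
    n = len(data)
    loss = gw = gb = 0.0
    for x, y in data:
        err = w * x + b - y
        loss += err * err / n
        gw += 2 * err * x / n
        gb += 2 * err / n
    return loss, (gw, gb)

def adapt(init, support, lr=0.1, steps=3):
    """Inner loop: a few gradient steps starting from the shared initialization."""
    params = init
    for _ in range(steps):
        _, (gw, gb) = mse_grad(params, support)
        params = (params[0] - lr * gw, params[1] - lr * gb)
    return params

def sample_segment(rng, n=10):
    """Hypothetical toy segment: a line whose slope and offset vary around (2, 1)."""
    w = 2.0 + rng.uniform(-0.3, 0.3)
    b = 1.0 + rng.uniform(-0.3, 0.3)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    points = [(x, w * x + b) for x in xs]
    return points[: n // 2], points[n // 2:]  # (support, query)

def meta_train(rng, iterations=2000, meta_lr=0.05):
    """Outer loop: learn an initialization from many source segments.

    First-order approximation: the query-set gradient is evaluated at the
    adapted parameters and applied directly to the initialization.
    """
    init = (0.0, 0.0)
    for _ in range(iterations):
        support, query = sample_segment(rng)
        adapted = adapt(init, support)
        _, (gw, gb) = mse_grad(adapted, query)
        init = (init[0] - meta_lr * gw, init[1] - meta_lr * gb)
    return init
```

Calling `meta_train(random.Random(0))` returns an initialization, and `adapt(init, support)` then fine-tunes it on the few records of a new segment. In this toy setting the learned initialization should land near the centre of the segment family, so a handful of gradient steps is enough to fit a new segment, whereas the same number of steps from a zero initialization leaves a larger error.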

Published In

WSDM '21: Proceedings of the 14th ACM International Conference on Web Search and Data Mining
March 2021
1192 pages
ISBN: 9781450382977
DOI: 10.1145/3437963
Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. market segment demand prediction
2. periodicity
3. segment relation extraction

Qualifiers

• Research-article

Conference

WSDM '21

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%

Cited By

• (2024) MulSTE: A Multi-view Spatio-temporal Learning Framework with Heterogeneous Event Fusion for Demand-supply Prediction. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 1781-1792. DOI: 10.1145/3637528.3672030. Online publication date: 25-Aug-2024
• (2024) Variational Autoencoder Based Distributed Unsupervised Meta-learning Framework. 2024 International Conference on Interactive Intelligent Systems and Techniques (IIST), 708-713. DOI: 10.1109/IIST62526.2024.00106. Online publication date: 4-Mar-2024
• (2024) Distributed unsupervised meta-learning algorithm over multi-agent systems. Digital Communications and Networks. DOI: 10.1016/j.dcan.2024.08.006. Online publication date: Aug-2024
• (2023) Multi-Agent Chronological Planning with Model-Agnostic Meta Reinforcement Learning. Applied Sciences 13:16 (9174). DOI: 10.3390/app13169174. Online publication date: 11-Aug-2023
• (2023) Multifaceted Relation-aware Meta-learning with Dual Customization for User Cold-start Recommendation. ACM Transactions on Knowledge Discovery from Data 17:9 (1-27). DOI: 10.1145/3597458. Online publication date: 18-Jul-2023
• (2023) Adaptive Cross-Scenario Few-Shot Learning Framework for Structural Damage Detection in Civil Infrastructure. Journal of Construction Engineering and Management 149:5. DOI: 10.1061/JCEMD4.COENG-13196. Online publication date: May-2023
• (2023) A multiple long short-term model for product sales forecasting based on stage future vision with prior knowledge. Information Sciences 625 (97-124). DOI: 10.1016/j.ins.2022.12.099. Online publication date: May-2023
• (2022) Applications of Fusion Techniques in E-Commerce Environments: A Literature Review. Sensors 22:11 (3998). DOI: 10.3390/s22113998. Online publication date: 25-May-2022
• (2022) What Can Knowledge Bring to Machine Learning?—A Survey of Low-shot Learning for Structured Data. ACM Transactions on Intelligent Systems and Technology 13:3 (1-45). DOI: 10.1145/3510030. Online publication date: 3-Mar-2022
• (2022) ROSE: Robust Caches for Amazon Product Search. Companion Proceedings of the Web Conference 2022, 89-93. DOI: 10.1145/3487553.3524213. Online publication date: 25-Apr-2022
