research-article

Package Arrival Time Prediction via Knowledge Distillation Graph Neural Network

Published: 28 February 2024

Abstract

Accurately estimating packages’ arrival time in e-commerce can enhance users’ shopping experience and improve the placement rate of products. This problem is often formalized as an Origin-Destination (OD)-based ETA (i.e., estimated time of arrival) prediction task, where the delivery time is estimated mainly from the sender and receiver addresses and other context information. One inherent challenge of the OD-based ETA problem is that the delivery time depends heavily on the actual delivery trajectory, which is unknown at the time of prediction. In this article, we tackle this challenge by exploiting historical delivery trajectories. We propose a novel Knowledge Distillation Graph neural network-based package ETA prediction (KDG-ETA) model, which uses knowledge distillation in the training phase to distill the knowledge of historical trajectories into OD pair embeddings. In KDG-ETA, a multi-level trajectory graph representation model is proposed to exploit trajectory information at the node level, edge level, and path level. Then, an adaptive attention module combines the OD representations embedded with trajectory knowledge with context embeddings from a feature extraction module to predict the delivery time. In an extensive empirical evaluation on three real-world Alibaba datasets, KDG-ETA consistently outperforms existing state-of-the-art OD-based ETA prediction methods, reducing the Mean Absolute Error (MAE) by 3.0%–39.1%.
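
To make the training objective and the reported metric concrete, the following is a minimal sketch, not the authors' implementation. It assumes a feature-based form of distillation in which an OD-only "student" embedding is pulled toward a trajectory-aware "teacher" embedding while the prediction itself is scored with MAE. The function names, the squared-error distillation term, and the `weight` parameter are all illustrative assumptions; the abstract does not specify the actual loss.

```python
from typing import Sequence


def mean_absolute_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """MAE between predicted and actual delivery times (same unit for both)."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def embedding_mse(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Mean squared distance between a student and a teacher embedding.

    Assumption: squared error is used only as a stand-in distillation term.
    """
    if len(student) != len(teacher) or len(student) == 0:
        raise ValueError("embeddings must be non-empty and of equal length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def distillation_loss(
    pred: Sequence[float],
    target: Sequence[float],
    student_emb: Sequence[float],
    teacher_emb: Sequence[float],
    weight: float = 0.5,  # hypothetical trade-off coefficient
) -> float:
    """Prediction loss plus a weighted embedding-matching term."""
    return mean_absolute_error(pred, target) + weight * embedding_mse(
        student_emb, teacher_emb
    )


def relative_mae_reduction(baseline_mae: float, model_mae: float) -> float:
    """Percentage by which model_mae improves on baseline_mae."""
    if baseline_mae <= 0:
        raise ValueError("baseline_mae must be positive")
    return 100.0 * (baseline_mae - model_mae) / baseline_mae
```

Under this sketch, the teacher embedding is needed only during training; at prediction time, when the trajectory is unknown, only the student embedding and the context features would be used. A reduction such as the reported 3.0%–39.1% corresponds to the value returned by `relative_mae_reduction` for a given baseline.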

Cited By

  • (2024) Efficient algorithms to mine concise representations of frequent high utility occupancy patterns. Applied Intelligence 54:5 (4012-4042). DOI: 10.1007/s10489-024-05296-2. Online publication date: 18-Mar-2024

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 5
June 2024
699 pages
EISSN: 1556-472X
DOI: 10.1145/3613659

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 February 2024
Online AM: 24 January 2024
Accepted: 16 January 2024
Revised: 19 November 2023
Received: 27 April 2023
Published in TKDD Volume 18, Issue 5

Author Tags

  1. Package arrival time prediction
  2. graph neural network
  3. trajectory data mining
  4. knowledge distillation

Qualifiers

  • Research-article

Funding Sources

  • National Key R&D Program of China
  • NSFC
  • Shandong Provincial Key Research and Development Program
  • Shandong Province Outstanding Youth Science Foundation
  • Shandong Province Science and Technology-based Small and Medium Enterprises Innovation Capacity Enhancement Project
  • Fundamental Research Funds of Shandong University China-Singapore International Joint Research Project
  • Alibaba Innovative Research (AIR) Program and Alibaba-NTU Singapore Joint Research Institute
  • Nanyang Technological University, Singapore

Article Metrics

  • Downloads (Last 12 months): 435
  • Downloads (Last 6 weeks): 32
Reflects downloads up to 05 Mar 2025
