Order dispatching for an ultra-fast delivery service via deep reinforcement learning

Applied Intelligence

Abstract

This paper proposes a real-life application of deep reinforcement learning to address the order dispatching problem of a Turkish ultra-fast delivery company, Getir. Before applying off-the-shelf reinforcement learning methods, we define the specific problem at Getir and one of the solutions the company has implemented. We discuss the novel aspects of Getir’s problem compared to the state-of-the-art order dispatching studies and highlight the limitations of Getir’s solution. The overall aim of the company is to deliver to as many customers as possible within 10 minutes. The orders arrive throughout the day, and centralized warehouses in the regions decide whether an incoming order should be served or canceled depending on their couriers’ shifts and status. We use Deep Q-networks to learn the actions of warehouses, i.e., accepting or canceling an order, directly from state dimensions using reinforcement learning. We design the networks with two different rewards. We conduct empirical analyses using real-life data provided by Getir to generate training samples and to assess the models’ performance during a selected 30-day period with a total of 9880 orders. The results indicate that our proposed models are able to generate policies that outperform the rule-based heuristic employed in practice.
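The accept-or-cancel decision described above can be illustrated with the two basic ingredients of Q-learning that Deep Q-networks build on: epsilon-greedy action selection and the one-step learning target. The following is a minimal sketch, not the authors' implementation. The action encoding (`ACCEPT`, `CANCEL`), the function names, and all numeric values are assumptions chosen for illustration; the abstract does not specify the state features, the two reward designs, or the network architecture, so plain lists of Q-values stand in for the network's output.

```python
import random

# Hypothetical action encoding (an assumption; not taken from the paper).
ACCEPT, CANCEL = 0, 1


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, gamma, done):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if done:
        # No future value is added once the episode has ended.
        return reward
    return reward + gamma * max(next_q_values)


rng = random.Random(0)
q_now = [0.4, 0.1]   # illustrative Q(s, ACCEPT), Q(s, CANCEL)
q_next = [0.5, 0.2]  # illustrative Q-values for the next state

# epsilon=0.0 disables exploration, so the greedy action is chosen.
action = epsilon_greedy(q_now, epsilon=0.0, rng=rng)
target = td_target(reward=1.0, next_q_values=q_next, gamma=0.9, done=False)
```

In a Deep Q-network the lists `q_now` and `q_next` would be produced by a neural network evaluated on the warehouse state, and the network would be trained to move its estimate for the chosen action toward `target`.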

Notes

  1. Due to privacy agreements with the company, we do not disclose the exact numbers for the queue size and delivery time limits in this paper.

  2. https://kovan.itu.edu.tr/index.php/s/bG1VPCovocpnKyU

Acknowledgments

This research is partly funded by Getir Perakende Lojistik A.S., Istanbul, Turkey.

Author information

Corresponding author

Correspondence to Eray Mert Kavuk.

About this article

Cite this article

Kavuk, E.M., Tosun, A., Cevik, M. et al. Order dispatching for an ultra-fast delivery service via deep reinforcement learning. Appl Intell 52, 4274–4299 (2022). https://doi.org/10.1007/s10489-021-02610-0

  • DOI: https://doi.org/10.1007/s10489-021-02610-0
