
DHL: Deep reinforcement learning-based approach for emergency supply distribution in humanitarian logistics

Published in Peer-to-Peer Networking and Applications

Abstract

Alleviating human suffering in disasters is one of the main objectives of humanitarian logistics. The lack of emergency rescue materials is the root cause of this suffering and must be considered when making emergency supply distribution decisions. Large-scale disasters often cause varying degrees of damage to the affected areas, which leads to differences both in human suffering and in the demand for emergency supplies across those areas. This paper considers a novel emergency supply distribution scenario in humanitarian logistics that takes these differences into account. In this scenario, besides economic goals such as minimizing cost, the humanitarian goal of alleviating the suffering of survivors is treated as one of the main bases for emergency supply distribution decision making. We first apply a Markov Decision Process to formulate the emergency supply distribution problem. Then, to obtain an allocation policy that reduces economic cost while decreasing the suffering of survivors, a Deep Q-Network-based approach for emergency supply distribution in Humanitarian Logistics (DHL) is developed. Numerical results demonstrate that DHL achieves better performance and lower time complexity than the baseline methods.
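The abstract only outlines the approach, so the sketch below is a minimal illustrative Deep Q-Network allocation agent in the spirit of DHL, not the authors' implementation: the action is assumed to be the choice of which affected area receives the next shipment, and the state layout (per-area unmet demand and deprivation time), network sizes, and hyperparameters are all assumptions made for illustration.

```python
# Minimal illustrative sketch (not the authors' DHL code): a DQN agent that
# picks which affected area receives the next unit of emergency supply.
# State layout, reward shape, and hyperparameters are assumptions.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim

N_AREAS = 4                 # hypothetical number of affected areas
STATE_DIM = 2 * N_AREAS     # assumed: per-area unmet demand and deprivation time

class QNetwork(nn.Module):
    """Maps a state vector to one Q-value per allocation action (one per area)."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)

q_net = QNetwork(STATE_DIM, N_AREAS)
target_net = QNetwork(STATE_DIM, N_AREAS)
target_net.load_state_dict(q_net.state_dict())
optimizer = optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)       # stores (state, action, reward, next_state, done)
gamma, epsilon = 0.99, 0.1

def select_action(state: torch.Tensor) -> int:
    """Epsilon-greedy choice of the area to supply next."""
    if random.random() < epsilon:
        return random.randrange(N_AREAS)
    with torch.no_grad():
        return int(q_net(state).argmax().item())

def train_step(batch_size: int = 32):
    """One gradient step on the standard DQN target r + gamma * max_a' Q_target(s', a')."""
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)
    states, actions, rewards, next_states, dones = zip(*batch)
    states = torch.stack(states)
    next_states = torch.stack(next_states)
    actions = torch.tensor(actions).unsqueeze(1)
    rewards = torch.tensor(rewards, dtype=torch.float32)
    dones = torch.tensor(dones, dtype=torch.float32)

    q_sa = q_net(states).gather(1, actions).squeeze(1)
    with torch.no_grad():
        q_next = target_net(next_states).max(1).values
    target = rewards + gamma * (1.0 - dones) * q_next
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In the paper the reward would presumably trade off logistics cost against a deprivation-based measure of survivor suffering; in this sketch the reward is simply whatever scalar the surrounding environment stores in the replay buffer.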




Availability of data and material

Not applicable.

Code availability

Not applicable.


Funding

We would like to thank the editors and anonymous reviewers for their helpful comments. The work of J. Fan, X. Chang and H. Kang was supported by the Fundamental Research Funds for the Central Universities under Grant 2020YJS042 and by the Beijing Municipal Natural Science Foundation (No. M22037). The work of J. Mišić and V. B. Mišić was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada through their respective Discovery Grants.

Author information


Corresponding author

Correspondence to Xiaolin Chang.

Ethics declarations

Conflicts of interest

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Fan, J., Chang, X., Mišić, J. et al. DHL: Deep reinforcement learning-based approach for emergency supply distribution in humanitarian logistics. Peer-to-Peer Netw. Appl. 15, 2376–2389 (2022). https://doi.org/10.1007/s12083-022-01353-0

