Gradual task complexity scaling (GTCS-DRL): a deep reinforcement learning approach for training automated guided vehicle system

  • Original Paper
  • Published:
Evolving Systems

Abstract

In Automated Guided Vehicle (AGV) systems, training Deep Reinforcement Learning (DRL) models presents significant challenges due to the complexity of tasks and environments and the large state and action spaces involved. This paper introduces the Gradual Task Complexity Scaling (GTCS) approach, a learning procedure for DRL that addresses these challenges. Unlike existing methods, which aim directly at the final objective, GTCS incrementally upgrades the agent’s objectives while keeping the environmental context unchanged, enabling a more efficient balance between exploration and exploitation during training. The GTCS procedure has four key components: gradually expanding the initially reduced effective space within the warehouse, increasing the number of products the agent must deliver, enhancing the capabilities of the AGVs represented as DRL agents, and reducing the maximum number of steps allowed for task completion. GTCS outperforms previous approaches by stabilizing the learning process, optimizing the delivery workflow, and achieving more efficient learning outcomes.
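To make the scaling procedure concrete, the sketch below shows one way a GTCS-style curriculum could wrap a standard DRL training loop: the same warehouse environment is reused at every stage while the four quantities named in the abstract are scheduled. This is a minimal illustration, not the authors' implementation; the stage values, the Stage, make_env, and make_agent names, and the success-rate promotion rule are all assumptions made for the example.

```python
# Minimal GTCS-style curriculum sketch, based only on the four components
# listed in the abstract. Every name and value below (Stage, make_env,
# make_agent, the schedule itself) is an illustrative assumption, not the
# authors' code.
from dataclasses import dataclass


@dataclass
class Stage:
    grid_size: int     # effective warehouse area opened up to the agent
    n_products: int    # number of products the agent must deliver
    agv_capacity: int  # capability of the AGV agent (e.g. carrying capacity)
    max_steps: int     # step budget per episode, tightened as training advances


# Same environment throughout; only the objective is scaled stage by stage.
SCHEDULE = [
    Stage(grid_size=5,  n_products=1, agv_capacity=1, max_steps=200),
    Stage(grid_size=10, n_products=2, agv_capacity=1, max_steps=160),
    Stage(grid_size=15, n_products=4, agv_capacity=2, max_steps=120),
    Stage(grid_size=20, n_products=6, agv_capacity=2, max_steps=100),
]


def run_episode(env, agent, max_steps):
    """Run one episode; return 1 if all deliveries were completed in time."""
    state = env.reset()
    for _ in range(max_steps):
        action = agent.act(state)                  # e.g. epsilon-greedy over Q-values
        next_state, reward, done = env.step(action)
        agent.learn(state, action, reward, next_state, done)
        state = next_state
        if done:
            return 1
    return 0


def train_with_gtcs(make_env, make_agent, success_rate=0.9, eval_episodes=50):
    """Promote the agent to the next stage once it reliably solves the current one."""
    agent = make_agent()                           # one agent persists across all stages
    for stage in SCHEDULE:
        env = make_env(stage)                      # same layout, scaled objective
        solved = 0.0
        while solved < success_rate:
            wins = sum(run_episode(env, agent, stage.max_steps)
                       for _ in range(eval_episodes))
            solved = wins / eval_episodes
    return agent
```

In this sketch the agent persists across stages, so experience gained on the easier objectives carries over as the task is scaled up, mirroring the abstract's point that the environmental context is kept fixed while only the objective is upgraded.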


Data Availability

Not provided


Acknowledgements

The authors have no individuals or organizations to acknowledge for contributions to this paper.

Funding

This research study was conducted without any external funding or financial support.

Author information

Contributions

The authors contributed equally to this work.

Corresponding author

Correspondence to Tawfik Masrour.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethics Approval

Not Applicable

Consent to Participate

Not Applicable

Consent for Publication

Not Applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Rhazzaf, M., Masrour, T. Gradual task complexity scaling (GTCS-DRL): a deep reinforcement learning approach for training automated guided vehicle system. Evolving Systems 16, 38 (2025). https://doi.org/10.1007/s12530-025-09660-6


  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s12530-025-09660-6
