Q-Learning Based Framework for Solving the Stochastic E-waste Collection Problem

  • Conference paper
Evolutionary Computation in Combinatorial Optimization (EvoCOP 2024)

Abstract

Electrical and Electronic Equipment (EEE) has evolved into a gateway for accessing technological innovations. However, EEE imposes substantial pressure on the environment because of its shortened life cycles. E-waste comprises discarded EEE and components that are no longer in use. This study focuses on the e-waste collection problem and models it as a multi-period Vehicle Routing Problem with a heterogeneous fleet, time windows, and stochastic travel times. Two Q-learning-based methods are designed to enhance the search for solutions. The first uses state-action values to determine the order in which multiple improvement operators are applied within the GRASP framework. The second is a hyperheuristic that extracts a stochastic policy to select heuristic operators during the search. Computational experiments demonstrate that both methods perform competitively with state-of-the-art methods on newly generated small-sized instances, while the performance gap widens as the size of the problem instances increases.
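The two ideas in the abstract can be illustrated with a minimal tabular sketch. It is not the authors' implementation: the operator names, the choice of state (here, the previously applied operator), the reward, and the softmax rule used to turn values into a stochastic policy are all assumptions made for illustration. Only the update rule itself is the standard Q-learning update.

```python
import math
import random

# Hypothetical operator names; the abstract does not list the actual operators.
OPERATORS = ["swap", "relocate", "two_opt"]


def make_q_table(states, actions):
    """Create a table of state-action values initialised to zero."""
    return {s: {a: 0.0 for a in actions} for s in states}


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def operator_order(q, state):
    """First idea: rank operators by descending state-action value."""
    return sorted(q[state], key=q[state].get, reverse=True)


def softmax_policy(q, state, temperature=1.0):
    """Second idea (assumed form): a stochastic policy derived from the values."""
    prefs = {a: math.exp(v / temperature) for a, v in q[state].items()}
    total = sum(prefs.values())
    return {a: p / total for a, p in prefs.items()}


def sample_operator(q, state, rng, temperature=1.0):
    """Draw one operator according to the softmax policy."""
    probs = softmax_policy(q, state, temperature)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


# Usage: assume "two_opt" improved the solution once (reward 1.0, an assumption).
q = make_q_table(["start"] + OPERATORS, OPERATORS)
q_update(q, "start", "two_opt", reward=1.0, next_state="two_opt")
order = operator_order(q, "start")    # operators ranked for the next pass
probs = softmax_policy(q, "start")    # selection probabilities
choice = sample_operator(q, "start", random.Random(0))
```

In a full method, the reward would presumably reflect the change in solution cost under sampled travel times, and the update would run after every operator application inside the search loop.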

Acknowledgement

This research was supported by the Singapore Ministry of Education (MOE) Academic Research Fund (AcRF) Tier 1 grant.

Author information

Corresponding author

Correspondence to Aldy Gunawan.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Nguyen, D.V.A., Gunawan, A., Misir, M., Vansteenwegen, P. (2024). Q-Learning Based Framework for Solving the Stochastic E-waste Collection Problem. In: Stützle, T., Wagner, M. (eds) Evolutionary Computation in Combinatorial Optimization. EvoCOP 2024. Lecture Notes in Computer Science, vol 14632. Springer, Cham. https://doi.org/10.1007/978-3-031-57712-3_4

  • DOI: https://doi.org/10.1007/978-3-031-57712-3_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-57711-6

  • Online ISBN: 978-3-031-57712-3

  • eBook Packages: Computer Science, Computer Science (R0)
