An improved transformer model with multi-head attention and attention to attention for low-carbon multi-depot vehicle routing problem

Zou, Yang; Wu, Hecheng; Yin, Yunqiang; Dhamotharan, Lalitha; Chen, Daqiang; Tiwari, Aviral Kumar

doi:10.1007/s10479-022-04788-z


Original Research
Published: 20 June 2022

Volume 339, pages 517–536, (2024)

Annals of Operations Research

Yang Zou¹,
Hecheng Wu¹,
Yunqiang Yin² (ORCID: orcid.org/0000-0001-5761-6680),
Lalitha Dhamotharan³,
Daqiang Chen⁴ &
Aviral Kumar Tiwari⁵


Abstract

Low-carbon logistics is an emerging, sustainable industry in the era of the low-carbon economy. End-to-end deep reinforcement learning (DRL) with an encoder-decoder framework has proven effective for solving logistics problems. In most cases, however, the encoders and decoders rely on recurrent neural networks (RNNs) and attention mechanisms, which can suffer from the long-distance dependence problem and neglect the correlation between query vectors. To circumvent this problem, we propose an improved transformer model (TAOA) that combines a multi-head attention mechanism (MHA) with an attention to attention mechanism (AOA), and apply it to the low-carbon multi-depot vehicle routing problem (MDVRP). In this model, the MHA and AOA are used in the encoder and decoder to compute the selection probabilities of route nodes. The MHA processes different parts of the input sequence and can be computed in parallel, while the AOA addresses the lack of correlation between the query results and the query vectors in the MHA. An actor-critic framework based on the policy gradient is constructed to train the model parameters, and the 2-opt operator is then used to further improve the resulting routes. Finally, extensive numerical studies are carried out to verify the effectiveness and computational efficiency of the proposed TAOA. The results show that TAOA solves the MDVRP better than the traditional transformer model (Kools), a genetic algorithm (GA), and Google OR-Tools (Ortools).
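The abstract names the 2-opt operator as the post-processing step applied to the routes produced by the model. The sketch below illustrates the textbook 2-opt local search on a single route and is not taken from the paper. It assumes plain Euclidean distance as the cost (the paper's low-carbon objective is not visible on this page), a route that starts and ends at the same depot, and hypothetical helper names `route_length` and `two_opt`.

```python
import math


def route_length(route, coords):
    """Total Euclidean length of a route given as a list of node ids."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(route, route[1:]))


def two_opt(route, coords):
    """Textbook 2-opt: reverse a segment whenever that shortens the route.

    The first and last entries (assumed to be the depot) stay fixed.
    Euclidean distance is an assumption, not the paper's cost function.
    """
    best = list(route)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 2):
            for j in range(i + 1, len(best) - 1):
                a, b = best[i - 1], best[i]
                c, d = best[j], best[j + 1]
                # Length change if edges (a,b) and (c,d) become (a,c) and (b,d).
                delta = (
                    math.dist(coords[a], coords[c])
                    + math.dist(coords[b], coords[d])
                    - math.dist(coords[a], coords[b])
                    - math.dist(coords[c], coords[d])
                )
                if delta < -1e-12:
                    # Reversing best[i..j] performs exactly that edge swap.
                    best[i:j + 1] = reversed(best[i:j + 1])
                    improved = True
    return best


# Toy instance: depot 0 and three customers on the corners of a unit square.
coords = {0: (0.0, 0.0), 1: (0.0, 1.0), 2: (1.0, 1.0), 3: (1.0, 0.0)}
crossing_route = [0, 2, 1, 3, 0]  # visits the corners in a self-crossing order
print(two_opt(crossing_route, coords))  # → [0, 1, 2, 3, 0]
```

In a multi-depot setting, a routine like this would be applied to each vehicle's route separately, since every route keeps its own depot fixed at both ends.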


References

Bello, I., Pham, H., Le, Q.V. et al. (2017). Neural combinatorial optimization with reinforcement learning. In 5th International conference on learning representations, ICLR 2017—Workshop track proceedings.
Bock, S., & Wei, M. G. (2019, July). A proof of local convergence for the Adam optimizer. In 2019 International joint conference on neural networks (IJCNN) (pp. 1–8).
Brandão de Oliveira, H. C., & Vasconcelos, G. C. (2010). A hybrid search method for the vehicle routing problem with time windows. Annals of Operations Research, 180, 125–144.
Bresson, X., & Laurent, T. (2021). The transformer network for the traveling salesman problem. arXiv preprint arXiv:2103.03012.
Camacho-Vallejo, J. F., López-Vera, L., et al. (2021). A tabu search algorithm to solve a green logistics bi-objective bi-level problem. Annals of Operations Research, 12(4), 1–27.
Deudon, M., Cournut, P., Lacoste, A. et al. (2018). Learning heuristics for the tsp by policy gradient. In International conference on the integration of constraint programming, artificial intelligence, and operations research (pp. 170–181).
Eggleston, H. S., Buendia, L., Miwa, K. et al. (2006). 2006 IPCC guidelines for national greenhouse gas inventories.
Facts, E. (2005). Average carbon dioxide emissions resulting from gasoline and diesel fuel, United States Environmental Protection Agency, Seattle, Wash., USA.
Galindres-Guancha, L. F., Toro-Ocampo, E. M., & Rendón, R. A. (2018). Multi-objective MDVRP solution considering route balance and cost using the ILS metaheuristic. International Journal of Industrial Engineering Computations, 9(1), 33–46.
Gillett, B. E., & Johnson, J. G. (1976). Multi-terminal vehicle-dispatch algorithm. Omega, 4(6), 711–718.
Haarnoja, T., Zhou, A. et al. (2018). Soft actor-critic algorithms and applications. arXiv preprint arXiv:1812.05905.
Huang, L. et al. (2019). Attention on attention for image captioning. In Proceedings of the IEEE/CVF international conference on computer vision.
Huang, Z., Liang, D., Xu, P. et al. (2020). Improve transformer models with better relative position embeddings. arXiv preprint arXiv:2009.13658.
Kalaivaani, P., Sathishkumar, V. E., Hatamleh, W. A., et al. (2021). Advanced lightweight feature interaction in deep neural networks for improving the prediction in click through rate. Annals of Operations Research, 11, 1–15.
Kool, W., Hoof, H. V., & Welling, M. (2019). Attention, learn to solve routing problems! In 7th International conference on learning representations.
Kruk, S. (2018). Practical python AI projects: Mathematical models of optimization problems with Google OR-tools, Apress.
Kuo, Y., & Wang, C. (2011). A variable neighborhood search for the multi-depot vehicle routing problem with loading cost. Expert Systems with Applications, 39(8), 6949–6954.
Kurbiel, T., & Khaleghian, S. (2017). Training of deep neural networks based on distance measures using RMSProp. arXiv preprint arXiv:1708.01911.
Li, J., Wang, R., Li, T., et al. (2018). Benefit analysis of shared depot resources for multi-depot vehicle routing problem with fuel consumption. Transportation Research Part D: Transport and Environment, 59, 417–432.
Ma, Q., Ge, S., He, D. et al. (2019). Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning. arXiv preprint arXiv:1911.04936.
Marrekchi, E., Besbes, W., Dhouib, D., et al. (2021). A review of recent advances in the operations research literature on the green routing problem and its variants. Annals of Operations Research, 304, 529–574.
Mizutani, E., & Dreyfus, S. (2017). Totally model-free actor-critic recurrent neural-network reinforcement learning in non-Markovian domains. Annals of Operations Research, 258, 107–131.
Nazari, M., Oroojlooy, A., Takáč, M. et al. (2018). Reinforcement learning for solving the vehicle routing problem. In Advances in neural information processing systems.
Penna, P. H. V., Subramanian, A., Ochi, L. S., et al. (2019). A hybrid heuristic for a broad class of vehicle routing problems with heterogeneous fleet. Annals of Operations Research, 273, 5–74.
Potvin, J. Y. (1996). Genetic algorithms for the traveling salesman problem. Annals of Operations Research, 63, 337–370.
Powell, W. B. (2016). Perspectives of approximate dynamic programming. Annals of Operations Research, 241, 319–356.
Roy, J., Pamučar, D., & Kar, S. (2020). Evaluation and selection of third party logistics provider under sustainability perspectives: An interval valued fuzzy-rough approach. Annals of Operations Research, 293, 669–714.
Sahin, B., Yilmaz, H., Ust, Y., et al. (2009). An approach for analysing transportation costs and a case study. European Journal of Operational Research, 193(1), 1–11.
Salhi, S., Imran, A., & Wassan, N. A. (2014). The multi-depot vehicle routing problem with heterogeneous vehicle fleet: Formulation and a variable neighborhood search implementation. Computers & Operations Research, 52, 315–325.
Sayli, M., & Yılmaz, E. (2017). Anti-periodic solutions for state-dependent impulsive recurrent neural networks with time-varying and continuously distributed delays. Annals of Operations Research, 258, 159–185.
Sbihi, A., & Eglese, R. W. (2010). Combinatorial optimization and green logistics. Annals of Operations Research, 175(1), 159–175.
Vaswani, A., Shazeer, N., Parmar, N., et al. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998–6008).
Vinyals, O., Fortunato, M., & Jaitly, N. (2015). Pointer networks. arXiv preprint arXiv:1506.03134.
Ward, R., Wu, X., & Bottou, L. (2018). Adagrad stepsizes: Sharp convergence over nonconvex landscapes, from any initialization. arXiv preprint arXiv:1806.01811.
Xiao, Y., Zhao, Q., et al. (2012). Development of a fuel consumption optimization model for the capacitated vehicle routing problem. Computers & Operations Research, 39(7), 1419–1431.
Yang, H. (2021). Extended attention mechanism for TSP problem. In 2021 International joint conference on neural networks (IJCNN).


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under grant number 71971041; in part by the Outstanding Young Scientific and Technological Talents Foundation of Sichuan Province under grant number 2020JDJQ0035; and in part by the Major Program of the National Social Science Foundation of China under grant number 20&ZD084.

Author information

Authors and Affiliations

College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China
Yang Zou & Hecheng Wu
School of Economics and Management, University of Electronic Science and Technology of China, Chengdu, 611731, China
Yunqiang Yin
University of Exeter Business School, University of Exeter, Exeter, EX4 4PU, UK
Lalitha Dhamotharan
School of Management and E-Business, Zhejiang Gongshang University, Hangzhou, 310018, China
Daqiang Chen
Rajagiri Business School (RBS), Kochi, 682039, India
Aviral Kumar Tiwari


Corresponding author

Correspondence to Yunqiang Yin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Zou, Y., Wu, H., Yin, Y. et al. An improved transformer model with multi-head attention and attention to attention for low-carbon multi-depot vehicle routing problem. Ann Oper Res 339, 517–536 (2024). https://doi.org/10.1007/s10479-022-04788-z


Accepted: 17 May 2022
Published: 20 June 2022
Issue Date: August 2024
DOI: https://doi.org/10.1007/s10479-022-04788-z

