An improved transformer model with multi-head attention and attention to attention for low-carbon multi-depot vehicle routing problem

  • Original Research
Annals of Operations Research

Abstract

Low-carbon logistics is an emerging, sustainable industry in the era of the low-carbon economy. End-to-end deep reinforcement learning (DRL) with an encoder-decoder framework has proven effective for solving logistics problems. In most cases, however, the encoders and decoders rely on recurrent neural networks (RNN) and attention mechanisms, which may lead to the long-distance dependence problem and neglect the correlation between query vectors. To overcome these problems, we propose an improved transformer model (TAOA) that combines a multi-head attention mechanism (MHA) with an attention to attention mechanism (AOA), and apply it to the low-carbon multi-depot vehicle routing problem (MDVRP). In this model, the MHA and AOA are used in the encoder and decoder to compute the selection probabilities of route nodes. The MHA processes different parts of the input sequence and can be computed in parallel, while the AOA addresses the lack of correlation between query results and query vectors in the MHA. An actor-critic framework based on the policy gradient is constructed to train the model parameters, and the 2-opt operator is then used to further optimize the resulting routes. Finally, extensive numerical studies are carried out to verify the effectiveness and computational efficiency of the proposed TAOA. The results show that TAOA performs better in solving the MDVRP than the traditional transformer model (Kools), a genetic algorithm (GA), and Google OR-Tools (Ortools).
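The abstract mentions a 2-opt operator that refines the routes produced by the model. As a rough illustration only, the following is a minimal sketch of a generic 2-opt local search on a single closed route, assuming Euclidean distances and a fixed depot at position 0. The names `route_length` and `two_opt` are hypothetical and the code is not taken from the paper, whose objective also accounts for carbon emissions and multiple depots.

```python
import math


def route_length(route, coords):
    """Total Euclidean length of a closed route (it returns to its first node)."""
    n = len(route)
    return sum(
        math.dist(coords[route[i]], coords[route[(i + 1) % n]])
        for i in range(n)
    )


def two_opt(route, coords):
    """Generic 2-opt: reverse segments as long as doing so shortens the route.

    Assumption: route[0] is the depot and is kept in place.
    """
    best = list(route)
    best_len = route_length(best, coords)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reverse the segment between positions i and j (inclusive).
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = route_length(candidate, coords)
                if cand_len < best_len - 1e-12:
                    best, best_len = candidate, cand_len
                    improved = True
    return best


# Usage: a unit square visited in an order whose edges cross.
coords = {0: (0.0, 0.0), 1: (1.0, 1.0), 2: (1.0, 0.0), 3: (0.0, 1.0)}
improved_route = two_opt([0, 1, 2, 3], coords)
print(improved_route, route_length(improved_route, coords))
```

In this toy instance the initial route uses both diagonals of the square, and the search removes the crossing so that the route follows the perimeter. A practical implementation would usually evaluate only the change in the two affected edges instead of recomputing the full route length for every candidate.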

References

  • Bello, I., Pham, H., Le, Q.V. et al. (2019). Neural combinatorial optimization with reinforcement learning. In 5th International conference on learning representations, ICLR 2017—Workshop track proceedings.

  • Bock, S., & Wei, M. G. (2019, July). A proof of local convergence for the Adam optimizer. In 2019 International joint conference on neural networks (IJCNN) (pp. 1–8).

  • Brandão de Oliveira, H. C., & Vasconcelos, G. C. (2010). A hybrid search method for the vehicle routing problem with time windows. Annals of Operations Research, 180, 125–144.

  • Bresson, X., & Laurent, T. (2021). The transformer network for the traveling salesman problem. arXiv preprint arXiv:2103.03012.

  • Camacho-Vallejo, J. F., López-Vera, L., et al. (2021). A tabu search algorithm to solve a green logistics bi-objective bi-level problem. Annals of Operations Research, 12(4), 1–27.

  • Deudon, M., Cournut, P., Lacoste, A. et al. (2018). Learning heuristics for the tsp by policy gradient. In International conference on the integration of constraint programming, artificial intelligence, and operations research (pp. 170–181).

  • Eggleston, H. S., Buendia, L., Miwa, K. et al. (2006). 2006 IPCC guidelines for national greenhouse gas inventories.

  • Facts, E. (2005). Average carbon dioxide emissions resulting from gasoline and diesel fuel, United States Environmental Protection Agency, Seattle, Wash., USA

  • Galindres-Guancha, L. F., Toro-Ocampo, E. M., & Rendón, R. A. (2018). Multi-objective MDVRP solution considering route balance and cost using the ILS metaheuristic. International Journal of Industrial Engineering Computations, 9(1), 33–46.

  • Gillett, B. E., & Johnson, J. G. (1976). Multi-terminal vehicle-dispatch algorithm. Omega, 4(6), 711–718.

  • Haarnoja, T., Zhou, A. et al. (2018). Soft actor-critic algorithms and applications. arXiv preprint arXiv:1812.05905.

  • Huang, L. et al. (2019). Attention on attention for image captioning. In Proceedings of the IEEE/CVF international conference on computer vision.

  • Huang, Z., Liang, D., Xu, P. et al. (2020). Improve transformer models with better relative position embeddings. arXiv preprint arXiv:2009.13658.

  • Kalaivaani, P., Sathishkumar, V. E., Hatamleh, W. A., et al. (2021). Advanced lightweight feature interaction in deep neural networks for improving the prediction in click through rate. Annals of Operations Research, 11, 1–15.

  • Kool, W., Hoof, H. V., & Welling, M. (2019). Attention, learn to solve routing problems! In: 7th International conference on learning representations.

  • Kruk, S. (2018). Practical python AI projects: Mathematical models of optimization problems with Google OR-tools, Apress.

  • Kuo, Y., & Wang, C. (2011). A variable neighborhood search for the multi-depot vehicle routing problem with loading cost. Expert Systems with Applications, 39(8), 6949–6954.

  • Kurbiel, T., & Khaleghian, S. (2017). Training of deep neural networks based on distance measures using RMSProp. arXiv preprint arXiv:1708.01911.

  • Li, J., Wang, R., Li, T., et al. (2018). Benefit analysis of shared depot resources for multi-depot vehicle routing problem with fuel consumption. Transportation Research Part D: Transport and Environment, 59, 417–432.

  • Ma, Q., Ge, S., He, D. et al. (2019). Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning. arXiv preprint arXiv:1911.04936.

  • Marrekchi, E., Besbes, W., Dhouib, D., et al. (2021). A review of recent advances in the operations research literature on the green routing problem and its variants. Annals of Operations Research, 304, 529–574.

  • Mizutani, E., & Dreyfus, S. (2017). Totally model-free actor-critic recurrent neural-network reinforcement learning in non-Markovian domains. Annals of Operations Research, 258, 107–131.

  • Nazari, M., Oroojlooy, A., Takáč, M. et al. (2018). Reinforcement learning for solving the vehicle routing problem. In: Advances in neural information processing systems.

  • Penna, P. H. V., Subramanian, A., Ochi, L. S., et al. (2019). A hybrid heuristic for a broad class of vehicle routing problems with heterogeneous fleet. Annals of Operations Research, 273, 5–74.

  • Potvin, J. Y. (1996). Genetic algorithms for the traveling salesman problem. Annals of Operations Research, 63, 337–370.

  • Powell, W. B. (2016). Perspectives of approximate dynamic programming. Annals of Operations Research, 241, 319–356.

  • Roy, J., Pamučar, D., & Kar, S. (2020). Evaluation and selection of third party logistics provider under sustainability perspectives: An interval valued fuzzy-rough approach. Annals of Operations Research, 293, 669–714.

  • Sahin, B., Yilmaz, H., Ust, Y., et al. (2009). An approach for analysing transportation costs and a case study. European Journal of Operational Research, 193(1), 1–11.

  • Salhi, S., Imran, A., & Wassan, N. A. (2014). The multi-depot vehicle routing problem with heterogeneous vehicle fleet: Formulation and a variable neighborhood search implementation. Computers & Operations Research, 52, 315–325.

  • Sayli, M., & Yılmaz, E. (2017). Anti-periodic solutions for state-dependent impulsive recurrent neural networks with time-varying and continuously distributed delays. Annals of Operations Research, 258, 159–185.

  • Sbihi, A., & Eglese, R. W. (2010). Combinatorial optimization and green logistics. Annals of Operations Research, 175(1), 159–175.

  • Vaswani, A., Shazeer, N., Parmar, N., et al. (2017). Attention is all you need. In: Advances in neural information processing systems (pp. 5998–6008).

  • Vinyals, O., Fortunato, M., & Jaitly, N. (2015). Pointer networks. arXiv preprint arXiv:1506.03134.

  • Ward, R., Wu, X., & Bottou, L. (2018). Adagrad stepsizes: Sharp convergence over nonconvex landscapes, from any initialization. arXiv preprint arXiv:1806.01811

  • Xiao, Y., Zhao, Q., et al. (2012). Development of a fuel consumption optimization model for the capacitated vehicle routing problem. Computers & Operations Research, 39(7), 1419–1431.

  • Yang, H. (2021). Extended attention mechanism for TSP problem. In 2021 International joint conference on neural networks (IJCNN).

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under grant number 71971041; in part by the Outstanding Young Scientific and Technological Talents Foundation of Sichuan Province under grant number 2020JDJQ0035; and in part by the Major Program of National Social Science Foundation of China under grant number 20&ZD084.

Author information

Corresponding author

Correspondence to Yunqiang Yin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zou, Y., Wu, H., Yin, Y. et al. An improved transformer model with multi-head attention and attention to attention for low-carbon multi-depot vehicle routing problem. Ann Oper Res 339, 517–536 (2024). https://doi.org/10.1007/s10479-022-04788-z

  • DOI: https://doi.org/10.1007/s10479-022-04788-z
