EFECTIW-ROTER: Deep Reinforcement Learning Approach for Solving Heterogeneous Fleet and Demand Vehicle Routing Problem With Time-Window Constraints

Authors:

Arash Mozhdehi,

Mahdi Mohammadizadeh,

Yunli Wang,

Sun Sun,

Xin Wang

SIGSPATIAL '24: Proceedings of the 32nd ACM International Conference on Advances in Geographic Information Systems

Pages 17 - 28

https://doi.org/10.1145/3678717.3691208

Published: 22 November 2024

Abstract

The heterogeneous fleet and demand vehicle routing problem with time-window constraints (HFDVRPTW) is an important optimization problem in real-world logistics operations. In this paper, we propose a deep reinforcement learning (DRL)-based method, termed spatial Edge-Feature EnhanCed mulTIgraph fusion encoder With spectral-based embedding and hieRarchical decOder with learnable TEmpoRal positional embedding (EFECTIW-ROTER, pronounced "Effective Router"), to tackle this complex and practical problem. EFECTIW-ROTER uses two sparse graphs to represent node connectivity, where nodes correspond to customers and the depot. The sparsity arises from two factors that determine which nodes are reachable from one another: the time-window constraints, and each customer's demand relative to the list of vehicle attributes acceptable for serving it within a heterogeneous fleet. Leveraging two graph Transformer models, EFECTIW-ROTER's encoding module captures the interactions between the nodes based on these factors. One model encodes customers' heterogeneous demand with spatial edge features based on travel time between the nodes, while the second employs temporal positional embeddings to capture temporal relationships based on time-window ordering. A fusion model integrates the node interactions from the two graphs. Additionally, a spectral-attention-based pooling provides the state representation for the DRL-based method. EFECTIW-ROTER features a hierarchical attention decoder operating in two stages: heterogeneous vehicle selection and node selection. Enhanced with positional embeddings, the decoder makes routing decisions that account for the ordering of the time windows. Experimental results using real-world traffic data from two major Canadian cities show that EFECTIW-ROTER outperforms current state-of-the-art DRL-based and heuristic methods. It reduces travel times while also requiring less computation time than conventional heuristics. Additional experiments demonstrate that it generalizes to larger instances.
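
To make the idea of two sparse graphs concrete, the following is a minimal sketch of how such connectivity could be derived from time windows and acceptable vehicle attributes. The abstract does not give the exact reachability rules, so everything here is an assumption made for illustration: the `Node` fields, the function names `time_window_edges` and `fleet_edges`, the rule "leave i as early as possible and still arrive before j's window closes", and the rule "i and j share at least one acceptable vehicle type". It is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Node:
    """Hypothetical node record (depot or customer); field names are assumptions."""
    earliest: float                 # time-window opening
    latest: float                   # time-window closing
    service: float                  # service duration at the node
    vehicle_types: FrozenSet[str]   # vehicle attributes acceptable for this node


def time_window_edges(nodes: List[Node], travel_time: List[List[float]]) -> List[List[bool]]:
    """Assumed rule: keep edge i -> j if a vehicle that starts service at i as early
    as possible can still reach j before j's time window closes."""
    n = len(nodes)
    return [
        [
            i != j
            and nodes[i].earliest + nodes[i].service + travel_time[i][j] <= nodes[j].latest
            for j in range(n)
        ]
        for i in range(n)
    ]


def fleet_edges(nodes: List[Node]) -> List[List[bool]]:
    """Assumed rule: keep edge i -> j if at least one vehicle type is acceptable
    to both nodes, so a single vehicle could serve them on the same route."""
    n = len(nodes)
    return [
        [i != j and bool(nodes[i].vehicle_types & nodes[j].vehicle_types) for j in range(n)]
        for i in range(n)
    ]


# Toy instance: node 0 is the depot, nodes 1 and 2 are customers.
nodes = [
    Node(earliest=0, latest=100, service=0, vehicle_types=frozenset({"van", "truck"})),
    Node(earliest=10, latest=20, service=5, vehicle_types=frozenset({"van"})),
    Node(earliest=40, latest=50, service=5, vehicle_types=frozenset({"truck"})),
]
travel_time = [
    [0, 5, 10],
    [5, 0, 10],
    [10, 10, 0],
]

tw = time_window_edges(nodes, travel_time)
fleet = fleet_edges(nodes)
```

In this toy instance the time-window graph keeps the edge from customer 1 to customer 2 but drops the reverse edge, because customer 1's window has closed by the time a vehicle could arrive from customer 2. The fleet graph drops both edges between the two customers, because they share no acceptable vehicle type. Under these assumed rules, masks of this kind are what would restrict which node pairs interact in each graph.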
