A multi-period TSP with stochastic regular and urgent demands

doi:10.1016/j.ejor.2006.12.040

European Journal of Operational Research

Volume 185, Issue 1, 16 February 2008, Pages 122-132

https://doi.org/10.1016/j.ejor.2006.12.040

Abstract

In this paper, we study the multi-period TSP with stochastic urgent and regular demands. Urgent demands have to be satisfied immediately, while regular demands can be satisfied either immediately or the day after. Demands appear stochastically at nodes. The objective is to minimize the long-run average delivery cost, given the probabilities governing the demands at nodes.

The problem is cast as a Markov Process with Costs and, at least in principle, can be solved using an approach originally proposed by Howard [R.A. Howard, Dynamic Programming and Markov Processes, MIT, Cambridge, USA, 1960] for Markov Processes with Rewards. However, the number of states of the Markov Process grows exponentially with the number of nodes in the network, which limits the size of the instances that are computationally tractable. We suggest a second Markov approach that considers a system (Aggregate Model) whose number of states grows only polynomially with the number of nodes in the network. We discuss the important relations between the optimal solutions of the original and the aggregate models. Finally, we also propose a hybrid procedure which combines both models. The viability of the proposed methodology is shown by applying the procedure to a numerical example.

Introduction

The Travelling Salesman Problem (TSP) consists of determining a minimum distance tour that visits each client exactly once. It is one of the most widely studied combinatorial optimization problems because, although easy to define, it is difficult to solve. It is the foundation of many logistics and distribution problems. Moreover, many other permutation problems can also be reduced to it, such as those arising in job sequencing, wallpaper cutting and hole punching, to mention just a few among many fields of application. Many variations of the problem have been considered. In this paper, we study the multi-period TSP with stochastic urgent and regular demands (MTSP-DSs), which is a new problem to the best of our knowledge. As we describe in greater detail in Section 2, it can be defined as follows.

Given a network, demands for service arise stochastically at the nodes. Demand at each node on a given day may be either urgent or regular, according to known probabilities. Nodes with urgent demand must be served the same day, while those with regular demand must be served either the same day or the following day. The problem faced each day is (1) to decide which regular nodes will be served on the same day and (2) to solve a TSP for serving the selected nodes (including of course all urgent nodes). The objective is to minimize the average daily cost of serving all demands.
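The two daily decisions can be illustrated with a minimal Python sketch. It is not the formulation used in the paper: the function names, the brute-force tour search and the convention that the depot is node 0 are assumptions made only for illustration, and the enumeration is practical only for a handful of nodes.

```python
from itertools import combinations, permutations


def tour_cost(nodes, dist, depot=0):
    """Brute-force length of the shortest depot-to-depot tour through `nodes`."""
    if not nodes:
        return 0.0
    best = float("inf")
    for order in permutations(nodes):
        path = (depot, *order, depot)
        length = sum(dist[a][b] for a, b in zip(path, path[1:]))
        best = min(best, length)
    return best


def daily_decisions(urgent, regular, dist):
    """Yield (served_today, postponed, cost_today) for every subset of regular nodes.

    Urgent nodes are always served; each regular node is either served today
    or postponed to the following day.
    """
    regular = sorted(regular)
    for k in range(len(regular) + 1):
        for chosen in combinations(regular, k):
            served = sorted(set(urgent) | set(chosen))
            postponed = [r for r in regular if r not in chosen]
            yield served, postponed, tour_cost(served, dist)


# Toy instance (assumed data): depot 0 and nodes 1..3 placed on a line.
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
for served, postponed, cost in daily_decisions(urgent=[2], regular=[1, 3], dist=dist):
    print(served, postponed, cost)
```

Comparing today's cost alone is not enough to choose among these decisions, since postponed nodes must be served the following day; this is what motivates the long-run average cost objective.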

A similar version of the described problem has recently been tackled by Angelelli et al. [1], [2]. In their work, no information on future demands is available. They present simple on-line algorithms and study their competitive ratio. The competitive ratio measures the quality of an on-line algorithm as the ratio between the cost of the solution computed by the algorithm and the cost of the optimal solution obtained knowing the evolution of demands over time.
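In the usual textbook form, and with notation that is ours rather than that of [1], [2], the competitive ratio of an on-line algorithm $A$ for a minimization problem can be written as follows, where $C_A(I)$ is the cost incurred by $A$ on instance $I$ and $C^{*}(I)$ is the optimal cost when the whole evolution of demands in $I$ is known in advance:

```latex
\[
  \rho(A) \;=\; \sup_{I} \frac{C_A(I)}{C^{*}(I)} .
\]
```

A ratio close to 1 means that the lack of information about the future costs the algorithm little, even in the worst case.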

In our version of the problem, we suppose that the demands’ probability distributions are known. Therefore, we deal with a multi-period stochastic TSP with the additional feature of two types of demands. The two types of demands somewhat resemble the two-period TSP, which was first proposed by Butler et al. [3]. It originated from dairy farm milk collection in the county of Dublin, where some farms need pick-up every day and others every other day. Therefore, in the two-period TSP the problem is to identify two tours whose combined distance is minimized and such that:

1. each farm requiring daily pick-up is visited exactly once by each tour;
2. each farm requiring every other day pick-up is visited exactly once by only one of the tours.

More specifically, Butler et al. [3] dealt with the symmetric version of the problem and provided several integer programming formulations. They also gave a set of valid inequalities obtained as generalizations of the valid inequalities developed for the symmetric TSP.

The stochastic TSP is a TSP which explicitly takes uncertainty into account. Common examples of uncertainties are demands, travel times, link or route failures and so on. In this paper, we focus on the case where the set of customers to be visited is not known with certainty. Each customer, j, has a probability, p_j, of being present. Stochastic TSPs differ from their deterministic counterpart in several fundamental respects. The concept of a solution is different and several fundamental properties of the deterministic problem no longer hold in the stochastic case. The way of modelling optimisation problems and, more specifically, stochastic optimisation problems, is not unique and depends on the scope, data and algorithms that are available. Several approaches and methodologies have been developed to handle stochastic TSPs. For a survey we refer the reader to Gendreau et al. [7] or to Powell et al. [14].

A possible approach to deal with stochastic TSPs is to use two-stage stochastic programs, where a planned or a priori solution is determined at the first stage. Then, at the second stage, once the realisations of the random variables are disclosed, a recourse or corrective action is applied to the first stage decision. A different approach to stochastic TSPs is the one proposed by Jaillet [10]. His work addressed a TSP in which the number of points to be visited is a random variable. On any given instance, only a subset S of the n given points must be visited. The aim is to find an a priori “best” tour through the n points. On any specific realization of the given instance, the subset S of present points is visited in the same order as the points appear in the a priori tour, thereby skipping those points which are not present in that problem instance. Even though our problem resembles, in some aspects, the problem studied by Jaillet, our approach cannot be cast in the framework of a priori optimisation (which might be considered as a particular case of the two-stage problem), since the visiting order is not fixed a priori and must be computed for each possible realization in each epoch while minimizing the long-run average cost.
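The a priori strategy described above can be made concrete with a short Python sketch. It only illustrates the skipping rule and is not Jaillet's algorithm for finding a good a priori tour; the function names, the depot at node 0 and the assumption that points are present independently with probabilities `p[j]` are ours.

```python
from itertools import product


def realized_tour(a_priori, present):
    """Visit the present points in a priori order, skipping absent ones."""
    return [j for j in a_priori if j in present]


def tour_length(order, dist, depot=0):
    """Length of the depot-to-depot tour that visits `order` in sequence."""
    path = [depot, *order, depot]
    return sum(dist[a][b] for a, b in zip(path, path[1:]))


def expected_length(a_priori, p, dist, depot=0):
    """Exact expected length by enumerating all 2^n presence patterns.

    Assumes independent presence with probability p[j]; small n only.
    """
    total = 0.0
    for pattern in product([False, True], repeat=len(a_priori)):
        prob = 1.0
        present = set()
        for j, here in zip(a_priori, pattern):
            prob *= p[j] if here else 1.0 - p[j]
            if here:
                present.add(j)
        total += prob * tour_length(realized_tour(a_priori, present), dist, depot)
    return total


# Toy instance (assumed data): depot 0 and nodes 1..3 placed on a line.
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
print(realized_tour([3, 1, 2], present={1, 2}))
print(expected_length([1, 2, 3], {1: 0.5, 2: 0.5, 3: 0.5}, dist))
```

In contrast with this fixed-order rule, the approach studied here recomputes the visiting order for every realization.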

Herein, we provide a general framework for such a class of problems. In particular, we formalize the problem as a Markov decision process (MDP). A Markov decision process is a discrete-stage sequential decision making problem in a stochastic environment. The key feature of such a process is that the transition probability from the current state to that at the next decision epoch depends only on the current state and the current action, and not on earlier states and actions of the process. Note that the definition of “state” is fundamental in determining whether or not a stochastic process is Markov. For a comprehensive review of Markov decision processes, see for instance the survey by White and White [16] or the textbook by Puterman [15].

Markov decision processes are not new for modelling routing problems. Dror et al. [6] showed how to model a stochastic VRP as a Markov decision process. They also warned about the large number of states the model could involve and the need for some form of relaxation to solve practical instances. However, the authors gave only a description of this approach, with the purpose of providing new insight into the problem. Minkoff [11] used a Markov decision process to solve the vehicle dispatching decision problem and proposed a heuristic approach, based on the approximation of the expected future costs, for solving large instances of the problem.

In this paper, we begin by casting the MTSP-DSs as a Markov decision process, from here on named the Exact Model, to exploit the good theoretical properties of MDPs. Since in real applications the number of states may be prohibitively large, we also propose an aggregate MDP, herein referred to as the Aggregate Model, which approximates the exact one but contains a manageable number of states. As we will see, the number of states grows exponentially with n (the number of nodes in the network) in the Exact Model and only polynomially in the Aggregate Model. For this reason, the Exact Model is impractical for real applications.

The issue of formalizing MDPs with a smaller number of states is not new in the literature. A natural way to construct an approximate model is to let the new state and decision spaces be subsets of the original state and decision spaces, and then to define the new reward and transition structure using the reward and transition structure of the original MDP (see [17]). The optimal policy of the Aggregate Model suggests a suboptimal policy for the Exact Model. Bounds indicating the quality of the suboptimal design are presented in [18], [19]. However, these results, as well as those presented by Porteus [13], are valid for discounted MDPs and rely on the theory of monotone contraction operators (see [4]). Therefore, none of these results can be applied to our model, since it minimizes the average undiscounted cost. Analogously, we cannot adopt the bounds proposed by Odoni [12], since they require a number of computations at least proportional to the square of the number of states in the exact MDP, which is unfortunately exponential in the number of nodes of the network. Although the mentioned results cannot be applied to our specific problem, we will show the important relations between the optimal solutions of the Exact and Aggregate models.

Finally, we also propose a hybrid procedure, where the first stage decision is optimally taken considering the information about the current state and approximating future costs using those obtained from the Aggregate Model.
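The idea of combining an exact first-stage decision with approximate future costs can be sketched as a one-step lookahead. The Python fragment below is only a schematic illustration under stated assumptions: `hybrid_decision`, the tuple layout of a decision and the callable `future_cost` are hypothetical names, and the penalty used in the example is a made-up placeholder, not the output of the Aggregate Model.

```python
def hybrid_decision(decisions, future_cost):
    """Return the decision minimizing today's cost plus an approximate future cost.

    decisions   -- iterable of (served, postponed, cost_today) tuples
    future_cost -- hypothetical callable: postponed nodes -> approximate future cost
    """
    return min(decisions, key=lambda d: d[2] + future_cost(d[1]))


# Toy data (assumed): node 2 is urgent, nodes 1 and 3 are regular.
decisions = [
    ([2], [1, 3], 4.0),
    ([1, 2], [3], 4.0),
    ([2, 3], [1], 6.0),
    ([1, 2, 3], [], 6.0),
]
# Placeholder penalty of 1.5 per postponed node; NOT the Aggregate Model's output.
best = hybrid_decision(decisions, lambda postponed: 1.5 * len(postponed))
print(best)
```

In an actual implementation, `future_cost` would be derived from the solution of the Aggregate Model rather than from a fixed per-node penalty.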

The MTSP-DSs is obviously NP-hard, since it requires the solution of (many) TSPs. Moreover, even if an oracle provided the solution of any TSP in constant time, finding an optimal policy would still require associating an optimal decision with each possible state.

The paper is organized as follows. In Section 2, we describe the real application of the MTSP-DSs and we also provide an optimization model. The exact procedure to solve the problem is described in Section 3. The Aggregate Model and the hybrid procedure are respectively presented in Section 4 and Section 6. In Section 5, we show a numerical example of the proposed procedures. Final considerations and conclusions are presented in the last section.

Section snippets

Problem definition

The MTSP-DSs is motivated by a real-world application concerning blood delivery (see [5], [8]).

The Austrian Red Cross (ARC), a non-profit organization, is in charge of delivering blood to hospitals on their request. In current operations, the ARC is obliged to fulfil any order within the following day. This policy leads to high delivery costs and many extra working hours for its drivers. To reduce costs through higher flexibility the ARC is interested in changing policy, in particular is

Exact Markov model

The problem described above can be cast as a Markov Process with Costs and, at least in principle, can be solved using an approach originally proposed by Howard [9] for Markov Processes with Rewards. This approach is an iterative procedure and each iteration consists of two phases. In the first phase a linear system of N equations in N variables is solved, where N is the number of different states. In the second phase, for each single state, the “best” decision for that state is found using
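A two-phase iteration of this kind can be sketched in plain Python for a small average-cost process. This is a generic policy-iteration sketch, not the implementation used in the paper: it assumes that every action is available in every state, that every stationary policy induces a single recurrent class, and that the first phase is written as N linear equations in the gain `g` and the relative values `v` (with the last relative value fixed to 0). All names are ours.

```python
def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting; returns x with A x = b."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def evaluate(P, c):
    """Phase 1: solve g + v[i] = c[i] + sum_j P[i][j] * v[j], with v[n-1] = 0."""
    n = len(c)
    A = [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        A[i][n - 1] = 1.0  # the last column now carries the gain g
    x = solve(A, c)
    return x[n - 1], x[: n - 1] + [0.0]


def policy_iteration(P, C, max_iter=100):
    """P[a][i][j]: transition probabilities, C[a][i]: one-step costs under action a."""
    n, actions = len(C[0]), range(len(C))
    policy = [0] * n
    g, v = None, None
    for _ in range(max_iter):
        g, v = evaluate([P[policy[i]][i] for i in range(n)],
                        [C[policy[i]][i] for i in range(n)])

        def q(a, i):
            return C[a][i] + sum(P[a][i][j] * v[j] for j in range(n))

        # Phase 2: for each state, switch only to a strictly better action.
        new_policy = list(policy)
        for i in range(n):
            best = min(actions, key=lambda a: q(a, i))
            if q(best, i) < q(policy[i], i) - 1e-9:
                new_policy[i] = best
        if new_policy == policy:
            break
        policy = new_policy
    return policy, g, v


# Toy two-state, two-action process (assumed data).
P = [[[0.9, 0.1], [0.1, 0.9]], [[0.5, 0.5], [0.5, 0.5]]]
C = [[1.0, 3.0], [2.0, 2.0]]
print(policy_iteration(P, C))
```

On this toy process the sketch stops after two evaluations with policy [0, 1] and an average cost of 7/6 per period. With N states, phase 1 alone requires solving an N by N system, which is why the size of the state space matters so much.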

Aggregate Markov model

The number of states N in the Exact Model grows exponentially with the number of nodes n: $N = 3^{n}$. It is therefore impractical to apply the Exact Model (EM) to any real-size problem. The basic idea in all aggregation techniques is to approximate the original MDP with an MDP having a smaller state (and/or action) space cardinality. Usually, the original MDP state space is partitioned into subsets and each state in the aggregate MDP is associated with one of the subsets. Here, we propose an

Numerical example

As an example, we consider the network depicted in Fig. 1, consisting of a central depot D and 5 possible clients, numbered 1, 2, …, 5. In this example, with $n = 5$, the number of states is $N = 3^{5} = 243$. State $(0, u, 0, r, u)$ means that there are no requests from nodes 1 and 3, there are urgent requests from nodes 2 and 5, and there is a regular request from node 4, while state $(r, r, r, r, r)$ corresponds to all five nodes having a regular request. The decisions available in each state depend on the particular
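The state encoding of this example can be reproduced with a few lines of Python. The sketch below is illustrative only: the symbols are stored as strings, the helper names are ours, and the count of decisions assumes that each regular request can independently be served today or postponed.

```python
from itertools import product

SYMBOLS = ("0", "u", "r")  # no request, urgent request, regular request


def all_states(n):
    """All 3^n request patterns over n nodes."""
    return list(product(SYMBOLS, repeat=n))


def describe(state):
    """Group node numbers (1-based) by the type of request they carry."""
    return {
        "none": [i for i, s in enumerate(state, start=1) if s == "0"],
        "urgent": [i for i, s in enumerate(state, start=1) if s == "u"],
        "regular": [i for i, s in enumerate(state, start=1) if s == "r"],
    }


def num_decisions(state):
    """Serve-today-or-postpone choices, assuming one binary choice per regular node."""
    return 2 ** state.count("r")


states = all_states(5)
print(len(states))
print(describe(("0", "u", "0", "r", "u")))
print(num_decisions(("r", "r", "r", "r", "r")))
```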

Hybrid approach for practical implementation

In the previous sections we presented both the exact and the aggregate models. As we have seen, the exact model is characterised by an exponential number of states which poses a limit on the dimensions of the computationally tractable instances. On the other hand, the aggregate model has a number of states which grows only quadratically with the instance dimension, but its solution is in general only suboptimal.
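The gap between exponential and quadratic growth can be made tangible with a small computation. The exact count $3^{n}$ is taken from the text; the quadratic count below is only a stand-in, obtained by assuming, purely for illustration, that an aggregate state records how many nodes are urgent and how many are regular. The actual definition of the aggregate state is not reproduced here and may differ.

```python
def exact_states(n):
    """Number of states of the exact model: each node is 0, u or r."""
    return 3 ** n


def count_based_states(n):
    """Hypothetical aggregation: pairs (urgent, regular) with urgent + regular <= n."""
    return (n + 1) * (n + 2) // 2


for n in (5, 10, 20, 50):
    print(n, exact_states(n), count_based_states(n))
```

Under this illustrative aggregation, five nodes give 21 aggregate states against 243 exact states, and the ratio widens rapidly as n grows.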

To better address real size instances, we propose a hybrid procedure which combines

Conclusion

In this paper, we study the multi-period TSP with stochastic urgent and regular demands, a specific version of the TSP. The problem is cast as a Markov Process with Costs to exploit the good theoretical properties of Markov Decision Processes. However, it has the drawback that the number of states grows exponentially with the size of the instance of the problem (number of nodes in the network). It is therefore impractical for any reasonable size problem. Hence, we propose an

References (19)

M. Gendreau et al., Stochastic vehicle routing, European Journal of Operational Research (1996)
C.C. White et al., Markov decision processes, European Journal of Operational Research (1989)
E. Angelelli, M.W.P. Savelsbergh, M.G. Speranza, A dynamic multi-period routing problem. Technical Report no. 251,...
E. Angelelli, M.W.P. Savelsbergh, M.G. Speranza, Competitive analysis of a dispatch policy for a dynamic multi-period...
M. Butler et al., The two-period travelling salesman problem applied to milk collection in Ireland, Computational Optimization and Applications (1997)
E.V. Denardo, Contraction mappings in the theory of dynamic programming, SIAM Review (1967)
K.F. Doerner, W.J. Gutjahr, R.F. Hartl, G. Lulli, A probabilistic two-day delivery vehicle routing problem, in:...
M. Dror et al., Vehicle routing with stochastic demands: Properties and solution framework, Transportation Science (1989)
V. Hemmelmayr et al., Delivery Strategies for Blood Products Supplies (2006)

There are more references available in the full text version of this article.
