Discrete Optimization
A multi-period TSP with stochastic regular and urgent demands

https://doi.org/10.1016/j.ejor.2006.12.040

Abstract

In this paper, we study the multi-period TSP problem with stochastic urgent and regular demands. Urgent demands have to be satisfied immediately while regular demands can be satisfied either immediately or the day after. Demands appear stochastically at nodes. The objective is to minimize the average long-run delivery costs, knowing the probabilities governing the demands at nodes.

The problem is cast as a Markov Process with Costs and, at least in principle, can be solved using an approach originally proposed by Howard [R.A. Howard, Dynamic Programming and Markov Processes, MIT, Cambridge, USA, 1960] for Markov Processes with Rewards. However, the number of states of the Markov Process grows exponentially with the number of nodes in the network, which limits the size of the instances that are computationally tractable. We suggest a second Markov approach that considers a system (the Aggregate Model) whose number of states grows only polynomially with the number of nodes in the network. The important relations between the optimal solutions of the original and the aggregate models are discussed. Finally, we also propose a hybrid procedure which combines both models. The viability of the proposed methodology is shown by applying the procedure to a numerical example.

Introduction

The Travelling Salesman Problem (TSP) consists of determining a minimum distance tour visiting each client exactly once. It is one of the most widely studied combinatorial optimization problems because, although easy to define, it is difficult to solve. It is the foundation of many logistics and distribution problems. Moreover, many different permutation problems may also be reduced to it, such as those arising in job sequencing, wallpaper cutting and hole punching, to mention just a few among many fields of application. Many variations of the problem have been considered. In this paper, we study the multi-period TSP with stochastic urgent and regular demands (MTSP-DSs), which is a new problem to the best of our knowledge. As we describe in greater detail in Section 2, it can be defined as follows.

Given a network, the demands for service stochastically arise from the nodes. Demand at each node in a given day may be either urgent or regular according to known probabilities. Nodes with urgent demand must be served the same day while those with regular demand must be served either the same day or the following day. The problem faced each day is (1) to decide which regular nodes will be served on the same day and (2) to solve a TSP for serving the selected nodes (including of course all urgent nodes). The objective is to minimize the average daily cost of serving all demands.
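The two daily decisions can be made concrete with a small brute-force sketch. It is only an illustration under stated assumptions, not the authors' procedure: the function names, the distance matrix `dist` and the depot index `0` are hypothetical, and the TSP is solved by enumerating permutations, which is viable only for a handful of nodes.

```python
from itertools import combinations, permutations

def tour_cost(dist, nodes, depot=0):
    """Cost of the cheapest depot-to-depot tour through `nodes` (brute force)."""
    if not nodes:
        return 0.0
    best = float("inf")
    for order in permutations(nodes):
        path = (depot, *order, depot)
        cost = sum(dist[a][b] for a, b in zip(path, path[1:]))
        best = min(best, cost)
    return best

def daily_options(dist, urgent, regular, depot=0):
    """List every choice of regular nodes to serve today.

    Each entry is (served_regular, postponed_regular, cost_of_todays_tour).
    Urgent nodes are always included in today's tour.
    """
    options = []
    for k in range(len(regular) + 1):
        for chosen in combinations(sorted(regular), k):
            today = sorted(set(urgent) | set(chosen))
            postponed = tuple(sorted(set(regular) - set(chosen)))
            options.append((chosen, postponed, tour_cost(dist, today, depot)))
    return options

# Hypothetical toy instance: depot 0 and nodes 1, 2, 3 placed on a line.
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
options = daily_options(dist, urgent={1}, regular={3})
```

Picking the cheapest option for today alone would always postpone every regular node. Postponed nodes must be served the following day, so the choice has to be judged by its effect on the long-run average cost, which is what the Markov models in the paper address.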

A similar version of the described problem has recently been tackled by Angelelli et al. [1], [2]. In their work, no information on future demands is available. They present simple on-line algorithms and study their competitive ratios. The competitive ratio measures the quality of an on-line algorithm as the ratio between the cost of the solution computed by the algorithm and the cost of the optimal solution obtained knowing the evolution of demands over time.
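For reference, the usual worst-case form of this measure for a minimization problem can be written as below. The notation is introduced here for illustration and is not taken from [1], [2]: A is the on-line algorithm, \sigma a sequence of demands, A(\sigma) the cost incurred by the algorithm and OPT(\sigma) the cost of the optimal solution computed with full knowledge of \sigma.

```latex
\[
  \rho(A) \;=\; \sup_{\sigma} \frac{A(\sigma)}{\mathrm{OPT}(\sigma)}
\]
```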

In our version of the problem, we suppose that the demands’ probability distributions are known. Therefore, we deal with a multi-period stochastic TSP with the additional feature of two types of demand. The two types of demand somewhat resemble the two-period TSP, which was first proposed by Butler et al. [3]. It originated from dairy farm milk collection in the county of Dublin, where some farms need daily pick-up and others require pick-up every other day. Therefore, in the two-period TSP the problem is to identify two tours whose combined distance is minimized and such that:

  • 1. each farm requiring daily pick-up is visited exactly once by each tour;

  • 2. each farm requiring every other day pick-up is visited exactly once by only one of the tours.

More specifically, Butler et al. [3] dealt with the symmetric version of the problem and provided several integer programming formulations. They also gave a set of valid inequalities obtained as generalizations of the valid inequalities developed for the symmetric TSP.

The stochastic TSP is a TSP which explicitly takes uncertainty into account. Common examples of uncertainties are demands, travel times, link or route failures and so on. In this paper, we focus on the case where the set of customers to be visited is not known with certainty. Each customer j has a probability p_j of being present. Stochastic TSPs differ from their deterministic counterpart in several fundamental respects. The concept of a solution is different and several fundamental properties of the deterministic problem no longer hold in the stochastic case. The way of modelling optimisation problems and, more specifically, stochastic optimisation problems, is not unique and depends on the scope, data and algorithms that are available. Several approaches and methodologies have been developed to handle stochastic TSPs. For a survey we refer the reader to Gendreau et al. [7] or to Powell et al. [14].

A possible approach to stochastic TSPs is to use two-stage stochastic programs, where a planned or a priori solution is determined at the first stage. Then, at the second stage, once the realisations of the random variables are disclosed, a recourse or corrective action is applied to the first-stage decision. A different approach is the one proposed by Jaillet [10]. His work addressed a TSP in which the number of points to be visited is a random variable. On any given instance only a subset S of n given points must be visited. The aim is to find an a priori “best” tour through the n points. On any specific realization of the given instance, the subset S of present points is visited in the same order as the points appear in the a priori tour, skipping those points which are not present in that instance. Even though our problem resembles, in some aspects, the problem studied by Jaillet, it cannot be cast in the framework of a priori optimisation (which might be considered a particular case of the two-stage problem), since the visiting order is not fixed a priori and must be computed for each possible realization in each epoch while minimizing the long-run average cost.
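To illustrate the a priori idea, the sketch below computes the expected length of one fixed visiting order when absent points are skipped. It is a minimal illustration, not Jaillet's algorithm: it enumerates all 2^n presence patterns, assumes points are present independently, and adds a depot at which every tour starts and ends. The function name and the toy data are hypothetical.

```python
from itertools import product
from math import dist as euclid

def expected_a_priori_length(points, depot, tour, presence_prob):
    """Expected length of a fixed visiting order when absent points are skipped.

    points: node -> (x, y); tour: the a priori visiting order;
    presence_prob: node -> probability that the node must be visited.
    Assumes independent presences and a depot-to-depot route.
    """
    expected = 0.0
    for present in product([False, True], repeat=len(tour)):
        # Probability of this particular presence pattern.
        prob = 1.0
        for node, is_present in zip(tour, present):
            p = presence_prob[node]
            prob *= p if is_present else 1.0 - p
        # Visit the present points in a priori order, skipping the others.
        visited = [points[n] for n, is_present in zip(tour, present) if is_present]
        path = [depot, *visited, depot]
        length = sum(euclid(a, b) for a, b in zip(path, path[1:]))
        expected += prob * length
    return expected

# Hypothetical toy instance: two points on a line, each present with probability 0.5.
points = {1: (1.0, 0.0), 2: (2.0, 0.0)}
value = expected_a_priori_length(points, (0.0, 0.0), [1, 2], {1: 0.5, 2: 0.5})
```

In the MTSP-DSs, by contrast, the visiting order is recomputed for each realization, so no single a priori order is evaluated in this way.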

Herein, we provide a general framework for such a class of problems. In particular, we formalize the problem as a Markov decision process (MDP). A Markov decision process is a discrete-stage sequential decision-making problem in a stochastic environment. The key feature of such a process is that the transition probability from the current state to that at the next decision epoch depends only on the current state and not on earlier states and actions of the process. Note that the definition of “state” is fundamental in determining whether or not a stochastic process is Markov. For a comprehensive review of Markov stochastic processes, see for instance the survey by White and White [16] or the textbook by Puterman [15].

Markov decision processes are not new for modelling routing problems. Dror et al. [6] showed how to model a stochastic VRP as a Markov decision process. They also warned about the large number of states the model could involve and the need for some form of relaxation to solve practical instances. However, the authors gave only a description of this approach, with the purpose of providing new insight into the problem. Minkoff [11] used a Markov decision process to solve the vehicle dispatching decision problem and proposed a heuristic approach, based on the approximation of the expected future costs, for solving large instances of the problem.

In this paper, we begin by casting the MTSP-DSs as a Markov decision process, from here on named the Exact Model, to exploit the good theoretical properties of MDPs. Since in real applications the number of different states may be prohibitively large, we also propose an aggregate MDP that approximates the exact one but contains a manageable number of states, herein referred to as the Aggregate Model. As we will see, the number of states grows exponentially with n (the number of nodes in the network) in the Exact Model and only polynomially in the Aggregate Model. For this reason the Exact Model cannot be implemented in real applications.

The issue of formalizing MDPs with a smaller number of states is not new in the literature. A natural way to construct an approximate model is to let the new state and decision spaces be subsets of the original state and decision spaces, and then define the new reward and transition structure using the reward and transition structure of the original MDP (see [17]). The optimal policy of the Aggregate Model suggests a suboptimal policy for the Exact Model. Bounds indicating the quality of the suboptimal design are presented in [18], [19]. However, these results, as well as those presented by Porteus [13], are valid for discounted MDPs and rely on the theory of monotone contraction operators (see [4]). Therefore, none of these results can be applied to our model, since it minimizes the average undiscounted cost. Analogously, we cannot adopt the bounds proposed by Odoni [12], since they require a number of computations at least proportional to the square of the number of states in the exact MDP, which is unfortunately exponential in the number of nodes of the network. Although the mentioned results cannot be applied to our specific problem, we will show the important relations between the optimal solutions of the Exact and Aggregate models.

Finally, we also propose a hybrid procedure, where the first stage decision is optimally taken considering the information about the current state and approximating future costs using those obtained from the Aggregate Model.

The MTSP-DSs is obviously NP-hard since it requires the solution of (many) TSPs. Moreover, even if an oracle provided the solution of any TSP in constant time, finding an optimal policy would still require associating an optimal decision with each possible state.

The paper is organized as follows. In Section 2, we describe the real application of the MTSP-DSs and we also provide an optimization model. The exact procedure to solve the problem is described in Section 3. The Aggregate Model and the hybrid procedure are respectively presented in Section 4 and Section 6. In Section 5, we show a numerical example of the proposed procedures. Final considerations and conclusions are presented in the last section.

Section snippets

Problem definition

The MTSP-DSs is motivated by a real-world application concerning blood delivery (see [5], [8]).

The Austrian Red Cross (ARC), a non-profit organization, is in charge of delivering blood to hospitals on their request. In current operations, the ARC is obliged to fulfil any order within the following day. This policy leads to high delivery costs and many extra working hours for its drivers. To reduce costs through higher flexibility the ARC is interested in changing policy, in particular is

Exact Markov model

The problem described above can be cast as a Markov Process with Costs and, at least in principle, can be solved using an approach originally proposed by Howard [9] for Markov Processes with Rewards. This approach is an iterative procedure and each iteration consists of two phases. In the first phase a linear system of N equations in N variables is solved, where N is the number of different states. In the second phase, for each single state, the “best” decision for that state is found using
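The two-phase iteration can be sketched on a toy problem as follows. This is a generic average-cost policy iteration in the spirit of Howard's method, written under assumptions that are not from the paper: the process is unichain under every policy, transition probabilities are stored as `P[a, i, j]` and one-step costs as `c[i, a]`, and all names and numbers are illustrative.

```python
import numpy as np

def evaluate_policy(P, c, policy):
    """Phase 1: solve g + v[i] = c[i, a_i] + sum_j P[a_i, i, j] * v[j], with v[-1] = 0."""
    n = len(policy)
    P_d = np.array([P[policy[i], i] for i in range(n)])
    c_d = np.array([c[i, policy[i]] for i in range(n)])
    A = np.eye(n) - P_d
    # v[-1] is pinned to 0, so its column is reused for the unknown average cost g.
    A[:, -1] = 1.0
    x = np.linalg.solve(A, c_d)
    return x[-1], np.append(x[:-1], 0.0)

def policy_iteration(P, c, max_iter=100):
    """Alternate value determination and state-by-state improvement."""
    n_states = P.shape[1]
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iter):
        g, v = evaluate_policy(P, c, policy)
        # Phase 2: test quantity q[i, a] = c[i, a] + sum_j P[a, i, j] * v[j].
        q = c + np.einsum("aij,j->ia", P, v)
        new_policy = policy.copy()
        for i in range(n_states):
            best = int(np.argmin(q[i]))
            # Change action only on strict improvement, to avoid cycling on ties.
            if q[i, best] < q[i, policy[i]] - 1e-12:
                new_policy[i] = best
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, g, v

# Hypothetical toy data: 2 states, 2 actions.
P = np.array([[[0.5, 0.5], [0.5, 0.5]],   # action 0
              [[0.0, 1.0], [0.0, 1.0]]])  # action 1
c = np.array([[2.0, 0.0],                 # costs in state 0 for actions 0, 1
              [4.0, 3.0]])                # costs in state 1 for actions 0, 1
policy, g, v = policy_iteration(P, c)
```

On this toy instance the iteration stops at the policy that uses action 1 in state 0 and action 0 in state 1, with average cost 8/3 per period. In the Exact Model the linear system would have one equation per state, which is why the state count matters so much.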

Aggregate Markov model

The number of states N in the Exact Model grows exponentially with the number of nodes n, N = 3^n. It is therefore impractical for any real size problem to apply the Exact Model (EM). The basic idea in all aggregation techniques is to approximate the original MDP with an MDP having a smaller state (and/or action) space cardinality. Usually, the original MDP state space is partitioned into subsets and each state in the aggregate MDP is associated with one of the subsets. Here, we propose an

Numerical example

As an example, we consider the network depicted in Fig. 1, consisting of a central depot D and 5 possible clients, numbered 1, 2, …, 5. In this example, with n = 5, the number of states is N = 3^5 = 243. State (0,u,0,r,u) means that there are no requests from nodes 1 and 3, there are urgent requests from nodes 2 and 5 and there is a regular request from node 4, while state (r,r,r,r,r) corresponds to all five nodes having a regular request. The decisions available in each state depend on the particular
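The state space of this example is small enough to enumerate directly. The following sketch, with illustrative names that are not from the paper, lists all states for n = 5 and decodes one state vector into the nodes with no, regular and urgent requests.

```python
from itertools import product

# One symbol per node: "0" = no request, "r" = regular, "u" = urgent.
SYMBOLS = ("0", "r", "u")

def enumerate_states(n):
    """All 3^n request patterns for n nodes."""
    return list(product(SYMBOLS, repeat=n))

def describe(state):
    """Group node numbers (1-based) by the type of request they carry."""
    groups = {"0": [], "r": [], "u": []}
    for node, symbol in enumerate(state, start=1):
        groups[symbol].append(node)
    return groups

states = enumerate_states(5)
example = describe(("0", "u", "0", "r", "u"))
```

Here `len(states)` is 243 and `example` maps "0" to nodes 1 and 3, "r" to node 4 and "u" to nodes 2 and 5. The same enumeration for n = 20 would already produce about 3.5 billion states, which is the growth that motivates the Aggregate Model.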

Hybrid approach for practical implementation

In the previous sections we presented both the exact and the aggregate models. As we have seen, the exact model is characterised by an exponential number of states which poses a limit on the dimensions of the computationally tractable instances. On the other hand, the aggregate model has a number of states which grows only quadratically with the instance dimension, but its solution is in general only suboptimal.

To better address real size instances, we propose a hybrid procedure which combines

Conclusion

In this paper, we study the multi-period TSP problem with stochastic urgent and regular demands, a specific version of the TSP problem. The problem is cast as a Markov Process with Costs to exploit the good theoretical properties of Markov Decision Processes. However, it has the drawback that the number of states grows exponentially with the size of the instance of the problem (number of nodes in the network). It is therefore impractical for any reasonable size problem. Hence, we propose an

References (19)

  • M. Gendreau et al., Stochastic vehicle routing, European Journal of Operational Research (1996)
  • C.C. White et al., Markov decision processes, European Journal of Operational Research (1989)
  • E. Angelelli, M.W.P. Savelsbergh, M.G. Speranza, A dynamic multi-period routing problem. Technical Report no. 251,...
  • E. Angelelli, M.W.P. Savelsbergh, M.G. Speranza, Competitive analysis of a dispatch policy for a dynamic multi-period...
  • M. Butler et al., The two-period travelling salesman problem applied to milk collection in Ireland, Computational Optimization and Applications (1997)
  • E.V. Denardo, Contraction mappings in the theory of dynamic programming, SIAM Review (1967)
  • K.F. Doerner, W.J. Gutjahr, R.F. Hartl, G. Lulli, A probabilistic two-day delivery vehicle routing problem, in:...
  • M. Dror et al., Vehicle routing with stochastic demands: Properties and solution framework, Transportation Science (1989)
  • V. Hemmelmayr et al., Delivery Strategies for Blood Products Supplies (2006)
There are more references available in the full text version of this article.

Cited by (10)

  • Dynamic vehicle routing with random requests: A literature review

    2023, International Journal of Production Economics
    Citation Excerpt :

    Furthermore, some authors consider different priority classes of requests. In Andreatta and Lulli (2008), a blood center delivers blood to hospitals in response to two types of requests: urgent requests that must be served on the same day and regular requests that should be served within two days. Laganà et al. (2021) classify parcel delivery requests into three dynamic priority classes: urgent, prominent, and unimportant.

  • A multi-objective dynamic vehicle routing problem with fuzzy time windows: Model, solution and application

    2014, Applied Soft Computing Journal
    Citation Excerpt :

    This section shows the implementation process of the proposed model and its application in distribution of the needed blood bags of hospitals and clinics from one or more central distribution centers. The similar case was studied before by [45,46] with same basic definitions and now it is tried to develop it by the proposed conception of fuzzy time windows and dynamic requests in a defined structure. In the mentioned distribution plan, some hospitals and clinics that have the daily demands are exist and they are considered as determined requests.

  • Heuristic algorithms for the 2-period balanced Travelling Salesman Problem in Euclidean graphs

    2011, European Journal of Operational Research
    Citation Excerpt :

    In 2002 Paletta [23] presents a new heuristic algorithm for the PTSP, improved in Bertazzi et al. [8] in 2004. Other works are involved with PVRP ([14,7]), with asymmetric PTSP [24], with dynamic versions of multiperiod TSP ([2–4]) and multiperiod VRP ([1,16]). Periodic arc routing problems are also considered in literature: see for example [15].

  • Modeling of periodic location routing problem with time window and satisfaction dependent demands

    2016, IEEE International Conference on Industrial Engineering and Engineering Management