Elsevier

Knowledge-Based Systems

Volume 212, 5 January 2021, 106638
A sampling method based on distributed learning automata for solving stochastic shortest path problem

https://doi.org/10.1016/j.knosys.2020.106638

Abstract

This paper studies an iterative stochastic algorithm for solving the stochastic shortest path problem. The algorithm uses distributed learning automata and tries to find the shortest path by taking a sufficient number of samples from the edges of the graph. The edges to be sampled are determined dynamically as the algorithm proceeds: at each iteration, the distributed learning automata determine which edges are sampled. This sampling method reduces the number of samples taken from edges that may not lie along the shortest path, and thus reduces the number of edges to be sampled. We also propose a new method for analyzing the algorithm. Unlike previous analyses, which were based on martingale theory, the proposed method is based on sampling. Unlike previous proofs, which were given for a specific input graph, the proof in this paper is general and is not restricted to a particular input graph. We also show that as the number of samples taken from the edges increases, the probability of finding the shortest path also increases. Experimental results obtained from some benchmark stochastic graphs and some random graphs confirm the theoretical results.

Introduction

Many problems can be modeled as finding the shortest path in a weighted graph. In many applications, such as travel-time estimation in intelligent transportation systems, the lengths of edges are random variables, and such graphs are called stochastic graphs. A stochastic graph $G$ can be defined by a triple $G=(V,E,F)$ comprising a set $V$ of nodes, a set $E$ of edges, which are ordered pairs of nodes, and a set $F$ of probability distributions associated with the edges of the graph, which describe the statistics of the edge lengths. In this graph, the length of edge $(i,j)$ is sampled from the distribution defined by $f_{ij}$. Node $j$ is a successor of node $i$, and node $i$ is a predecessor of node $j$. We also assume that the length of edge $(i,j)$ is a positive-valued random variable with $f_{ij}$ as its probability density function. It is also assumed that these probability density functions are unknown to the algorithm, which only has access to samples taken from the corresponding distributions. For example, in a transportation network, the intersections are the nodes of the graph, and each edge represents the street/road between the two corresponding nodes. In this network, measures such as the travel time and the cost of traveling play the role of the edge length. The travel time is a random variable with an unknown distribution, and the travel time on each street/road depends on the traffic situation on that edge. Each time a street/road is traversed, we observe a sample travel time drawn from the corresponding distribution. In intelligent transportation systems, this problem arises in vehicle route guidance systems (RGS). The goal in solving the RGS problem is to design an algorithm for finding the optimal route (path) from the origin to the destination. For most RGS, the optimal route between an origin and a destination is defined as the one with minimum expected travel time [1].

A sequence $v_{i_1}, v_{i_2}, \ldots, v_{i_{n_i}}$ of distinct nodes of the graph with the property that there is an edge between every pair of consecutive nodes, $(v_{i_j}, v_{i_{j+1}}) \in E$ (for $1 \le j < n_i$), is called a simple path, where $n_i$ is the number of nodes in the path. Since each edge length is a random variable, the goal is to find its expected value. Let $C_{uv}$ denote a sample length of edge $(u,v)$ taken from the probability density function $f_{uv}$, and let $\bar{C}_{uv}$ denote the expected length of edge $(u,v)$. We denote a simple path from $v_s$ to $v_d$ by $\pi$ and its expected length by $\bar{L}_{\pi}$, which equals $\sum_{j=1}^{n_i-1} \bar{C}_{i_j i_{j+1}}$. We also assume that the graph has $t$ distinct simple paths $\Pi=\{\pi_1,\pi_2,\ldots,\pi_t\}$ from $v_s$ to $v_d$. A simple path with the minimum expected length from $v_s$ to $v_d$ is called the shortest path and is denoted by $\pi^*$, i.e., the shortest path $\pi^*$ has expected length $\bar{L}_{\pi^*}=\min_{\pi\in\Pi}\{\bar{L}_{\pi}\}$. This means that path $\pi^*$ has the smallest expected length among all paths from the source node $v_s$ to the destination node $v_d$. The problem of finding the shortest path in a stochastic graph is called the stochastic shortest path problem (SSPP). A simple algorithm for solving this problem is to take a sufficient number of samples of the length $C_{uv}$ of each edge $(u,v)$ and then estimate the expected length $\bar{C}_{uv}$ of that edge. Finally, by using an algorithm for solving shortest path problems, such as Dijkstra or Floyd–Warshall, we can find the shortest path from the source node $v_s$ to the destination node $v_d$. This approach has two main problems: first, many samples are needed from each edge to estimate its expected length with high confidence; second, many of these samples are spent on edges that do not belong to the shortest path. The goal of the approach proposed in this paper is to take more samples from edges belonging to the shortest path and fewer samples from edges that do not belong to it.
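The simple baseline described above (sample every edge, average, then run a deterministic shortest path algorithm) can be sketched as follows. This is a minimal illustration, not the algorithm studied in the paper. The names `estimate_edge_means`, `dijkstra`, and `samplers`, the four-node graph, and the uniform edge-length distributions are all hypothetical choices made for this example.

```python
import heapq
import random
from collections import defaultdict


def estimate_edge_means(samplers, n_samples, rng):
    """Estimate the expected length of every edge by plain averaging.

    samplers maps an edge (u, v) to a callable that draws one positive
    sample length; the underlying distribution is treated as unknown.
    """
    return {
        edge: sum(draw(rng) for _ in range(n_samples)) / n_samples
        for edge, draw in samplers.items()
    }


def dijkstra(mean_len, source, target):
    """Shortest path on the graph weighted by the estimated mean lengths."""
    adj = defaultdict(list)
    for (u, v), w in mean_len.items():
        adj[u].append((v, w))
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == target:
            break
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]


# Hypothetical stochastic graph: two routes from "s" to "d".
samplers = {
    ("s", "a"): lambda rng: rng.uniform(1.0, 3.0),  # expected length 2
    ("a", "d"): lambda rng: rng.uniform(1.0, 3.0),  # expected length 2
    ("s", "b"): lambda rng: rng.uniform(4.0, 6.0),  # expected length 5
    ("b", "d"): lambda rng: rng.uniform(4.0, 6.0),  # expected length 5
}
rng = random.Random(0)
n_samples = 200
means = estimate_edge_means(samplers, n_samples, rng)
path, length = dijkstra(means, "s", "d")
total_samples = n_samples * len(samplers)
```

Note that half of the `total_samples` draws in this toy graph are spent on the edges through `b`, which are not on the shortest path; this is the second drawback named above.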

Several algorithms have been proposed in the literature for solving the SSPP. These algorithms can be classified into three groups based on the time at which the values of the edge lengths are learned [2]: (1) the lengths of edges are learned before traversing a path [3]; (2) the length of an edge is never learned or becomes known after traversing the shortest path [4]; (3) the lengths of edges are learned while traversing different paths in the graph.

Let $\pi$ be a path, and let $P_{\pi}$ denote the probability that this path is the shortest. One approach for finding $P_{\pi}$ is to use distributed learning automata (DLA), in which a network of learning automata cooperates to solve the stochastic shortest path problem. Several DLA-based algorithms have been proposed in the literature for solving the SSPP. These algorithms fall into the third group, in which the lengths of edges are learned while traversing different paths in the graph. Beigy and Meybodi proposed some DLA-based algorithms for solving the SSPP [5], [6]. The convergence of these algorithms was shown by using the Martingale theorem. Meybodi and Meybodi proposed an algorithm based on an extended version of DLA for solving the SSPP, and its convergence was also shown by using the Martingale theorem [7]. Vahidipour et al. proposed an algorithm for solving the SSPP by using a combination of LAs and Petri nets [8]. This algorithm explores different paths in parallel by sending different tokens. The convergence of this algorithm was shown by using a continuous-time Markov chain. The combination of LA and Petri nets was also generalized for solving different graph problems [9].

In this paper, an iterative DLA-based algorithm for solving the SSPP, which falls into the third group, is studied. This algorithm tries to find the shortest path $\pi^*$ from the set of paths $\Pi=\{\pi_1,\ldots,\pi_t\}$ by taking a sufficient number of samples from the lengths of their edges. The algorithm dynamically determines the next edge to be sampled. This reduces the number of unnecessary samples taken from edges that are not along the shortest path, and hence reduces the overall number of samples. The algorithm studied in this paper is obtained by modifying the algorithm given in [6], namely by changing the updating rule of the learning rates; a different method is used for the proof of convergence. The convergence of the algorithms proposed in [5], [6] was shown by using Martingale theory: an outline of the proof was given, and convergence was then proved only for a five-node graph. When the graph becomes larger, the derivation of the equations using the methods given in [5], [6] becomes very complicated or even intractable. Hence, these convergence proofs do not apply to general graphs and must be redone for every input graph. In this paper, a new method of proof based on sampling, which is different from [5], [6], is used to prove that if each LA of the DLA uses the linear reward-inaction ($L_{R-I}$) learning algorithm, then the algorithm finds the shortest path with high probability. The $L_{R-I}$ algorithm only updates its state when the environment gives a reward to the selected action. The proof shows that if the parameters of the algorithm are chosen properly, it finds the path with the minimum expected length with probability as close to unity as desired. Hence, in contrast to the previously reported convergence methods for DLA-based algorithms, the convergence proof given for this algorithm is based on sampling and can be used for an arbitrary graph.
We also evaluate the modified method using some benchmark stochastic graphs, some randomly generated graphs, and some Kronecker graphs. The experimental results show that the modified algorithm outperforms the algorithms reported in [5], [6]. Specifically, the algorithm given in this paper needs a smaller number of iterations and a smaller number of samples from the edges of the graph than the algorithm given in [5], [7] to find the shortest path at approximately the same convergence rate. Compared with the algorithm given in [6], however, the modified algorithm increases the rate of convergence at the cost of a larger number of iterations needed for finding the shortest path and a larger number of samples taken from the edges of the graph.
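The reward-inaction behavior mentioned above can be illustrated with the standard linear reward-inaction update for a single learning automaton. The sketch below assumes a fixed learning rate `a`; the paper's modified rule for updating the learning rates is not reproduced here, and the function names `select_action` and `lri_update` are hypothetical.

```python
import random


def select_action(p, rng):
    """Draw an action index according to the action probability vector p."""
    return rng.choices(range(len(p)), weights=p, k=1)[0]


def lri_update(p, chosen, rewarded, a):
    """One linear reward-inaction step with fixed learning rate a in (0, 1).

    On reward, the chosen action's probability moves toward 1 and all
    other probabilities shrink by the factor (1 - a). On penalty, the
    vector is returned unchanged (the "inaction" part).
    """
    if not rewarded:
        return list(p)
    return [
        pj + a * (1.0 - pj) if j == chosen else (1.0 - a) * pj
        for j, pj in enumerate(p)
    ]


p = [0.25, 0.25, 0.5]
rng = random.Random(1)
action = select_action(p, rng)

p_penalty = lri_update(p, chosen=0, rewarded=False, a=0.1)  # unchanged
p_reward = lri_update(p, chosen=0, rewarded=True, a=0.1)    # mass shifts to action 0
```

The update keeps the vector normalized, because the probability added to the chosen action equals the total removed from the others.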

The rest of this paper is organized as follows: The related work is given in Section 2. LA and DLA are briefly described in Section 3. The modified algorithm for solving the stochastic shortest path problem is given in Section 4. In this section, we also study the convergence of the proposed algorithm and introduce a new method for proving the convergence of the algorithm. In Section 5, the experimental results are reported. In Section 6, we study the computational complexity of the modified algorithm and compare it with the computational cost of three other related algorithms. Finally, Section 7 concludes the paper.

Related work

In this section, we give a brief review of the related work on algorithms for solving the stochastic shortest path problem. These algorithms can be categorized into two main groups: the first group aims to find an a priori solution that minimizes the expected length, and the second group computes an on-line solution. Some problems for general stochastic graphs have been analyzed in [3], [10]. The unique arcs concept was studied in [11], and the uniformly directed cuts were used for the analysis of

Distributed learning automata

A learning automaton (LA) is an interconnection of a stochastic automaton and a random environment, and learning in an LA means finding the best action from its action set. At stage $k$, the stochastic automaton selects its action from the finite set of $r$ actions, denoted by $\underline{\alpha}=\{\alpha_1,\ldots,\alpha_r\}$, using the action probability vector $\underline{p}(k)=(p_1(k),\ldots,p_r(k))$, where $p_i(k)$ is the probability of selecting action $\alpha_i$ at stage $k$. Let this action be denoted by $\alpha(k)=\alpha_i$, which is applied to the random environment. The random

Algorithm for finding the stochastic shortest path

In this section, we propose a DLA-based algorithm for finding a path with the minimum expected length (the shortest path) in a stochastic graph. This algorithm is a modified version of the algorithm given in [6] in which the updating rule for the learning rate has been changed. In this algorithm, the stochastic graph is the random environment for the DLA. We build a DLA based on the given graph. Each node of the DLA corresponds to a node in the graph. The number of actions for every LA in the

Experiments

In this section, we evaluate the performance of the proposed algorithm using different stochastic graphs. We first describe the graphs used for evaluating the algorithm, then give the performance measures used for the evaluation, and finally report the experimental results.

Computational complexity of the algorithms

In this section, we study the computational complexity of the following four algorithms: the proposed algorithm and the three algorithms given in [5], [6], [7]. The computation times of these four algorithms are of order O(rkt), where r is the number of actions of an LA, k is the time needed to select a path and update the action probability vectors for that path, and t is the number of iterations that the algorithm needs to find the shortest path. Hence, the computation time of the algorithms

Conclusions

In this paper, we studied a DLA-based algorithm for finding the shortest path in stochastic graphs. The method given in this paper can be used to choose a path with minimal expected length from the source node to the destination node. A new method of proof is proposed that can be used for an arbitrary input graph. It is shown that when all LAs of the DLA use an $L_{R-I}$ algorithm, the modified algorithm finds the shortest path with probability as close to

CRediT authorship contribution statement

Hamid Beigy: Conceptualization, Methodology, Software, Proof, Writing. Mohammad Reza Meybodi: Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to thank anonymous reviewers for their time, valuable comments, constructive criticism, and suggestions, which greatly improved the paper. The authors also would like to thank Mr. Mahmoud Karimian for his help.

References (46)

  • Polychronopoulos G.H. et al. Stochastic shortest path problems with recourse. Networks (1996)
  • Frank H. Shortest path in probabilistic graphs. Oper. Res. (1969)
  • Mirchandani P.B. et al. Shortest distance and reliability of probabilistic networks: A case with temporary preferences. Comput. Oper. Res. (1985)
  • Beigy H. et al. Utilizing distributed learning automata to solve stochastic shortest path problems. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. (2006)
  • Beigy H. et al. An iterative stochastic algorithm based on distributed learning automata for finding the stochastic shortest path in stochastic graphs. J. Supercomput. (2020)
  • Meybodi M.R.M. et al. Extended distributed learning automata: An automata-based framework for solving stochastic graph optimization problems. Appl. Intell. (2014)
  • Vahidipour S.M. et al. Finding the shortest path in stochastic graphs using learning automata and adaptive stochastic Petri nets. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. (2017)
  • Pritsker A.A.B. Application of multi-channel queuing results to the analysis of conveyor systems. J. Ind. Eng. (1966)
  • Martin J.J. Distribution of the time through a directed acyclic network. Oper. Res. (1965)
  • Sigal C.E. et al. The use of cutsets in Monte-Carlo analysis of stochastic networks. Math. Comput. Simulation (1979)
  • Mirchandani P.B. Shortest distance and reliability of probabilistic networks. Comput. Oper. Res. (1976)
  • Sigal C.E. et al. The stochastic shortest route problem. Oper. Res. (1980)
  • Alexopoulos C. State space partitioning methods for stochastic shortest path problems. Networks (1997)