A queueing network-based distributed Laplacian solver for directed graphs
Introduction
Solving a Laplacian system of equations is a core routine in a number of fields, including network analysis, computer vision, operations research, and machine learning. Following Spielman and Teng's [1] pioneering work proposing a quasi-linear time algorithm for solving undirected Laplacians, a powerful collection of algorithmic tools has been developed for the undirected case. Similar developments, however, have remained elusive for directed graphs. In a recent line of work, Cohen et al. [2], [3] extensively studied directed Laplacians and gave a new class of nearly-linear time algorithms for such asymmetric linear systems. However, all these solvers are centralized. Recently, Tang and Mei [4] studied a consensus-based distributed algorithm for directed Laplacians, which they show converges to a solution at a geometric rate. However, they do not characterize this rate in terms of the underlying graph parameters. Thus, to the best of our knowledge, there is currently no distributed algorithm for directed graphs whose convergence rate is characterized in terms of the underlying graph parameters. In this work, we endeavor to fill that gap by describing a simple distributed algorithm to solve a large and useful class of Laplacian systems that includes the electrical flow computation, which in turn is used as a primitive for computing max flows [5], random spanning trees [6], graph sparsification [7], and current flow closeness centrality [8]. Our method is also a completely new approach to solving Laplacian systems, based on the theory of queueing networks, multidimensional Markov chains, and random walks.
In particular, we first formulate a stochastic problem that captures the properties of Laplacian systems in which only one element of the right-hand-side vector b is negative, where the Laplacian is L = D - A, with D the diagonal out-degree matrix and A the adjacency matrix of the directed graph. We call such systems “one-sink” Laplacian systems since the stochastic process can be viewed as a network in which some nodes generate data and a single sink collects all the packets that it receives. We call our stochastic process the “Data Collection Process” and show that, at stationarity, this process is equivalent to the one-sink Laplacian system. The main challenge in handling directed graphs is the computation of the stationary distribution of the corresponding Markov chain. Our proposed solver works by quickly approximating the stationary distribution of a multidimensional Markov chain induced by the queueing network of the Data Collection Process. We show that when this multidimensional chain is ergodic, the vector whose vth coordinate is proportional to the probability at stationarity of the queue at v being non-empty is a solution to the one-sink Laplacian system. We estimate this solution in a natural way by statistically determining the proportion of time slots for which the queues are empty. A fast-mixing property of the multidimensional chain allows us to achieve the running time result.
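The occupancy-based estimate described above can be illustrated with a small simulation. The following is a minimal sketch and not the authors' algorithm: it assumes that each source generates a packet in a slot with a fixed Bernoulli rate, that every non-empty queue forwards exactly one packet per slot to an out-neighbor chosen with probability proportional to the edge weight, and that the sink absorbs packets. The function name `simulate_data_collection` and its parameters are hypothetical. Under these assumptions, flow balance at each non-sink node v suggests that the non-empty fraction eta_v satisfies eta_v = beta_v + sum over u of eta_u * P[u][v], where beta_v is the generation rate and P the weighted transition probabilities.

```python
import random


def simulate_data_collection(out_edges, rates, sink, slots, seed=0):
    """Estimate, for each non-sink node, the fraction of slots its queue is non-empty.

    out_edges: dict node -> list of (neighbor, weight) pairs (assumed format).
    rates:     dict source node -> per-slot packet generation probability (assumed).
    """
    rng = random.Random(seed)
    nodes = list(out_edges)
    queue = {v: 0 for v in nodes}
    busy = {v: 0 for v in nodes}
    for _ in range(slots):
        # Queues that are non-empty at the start of the slot will transmit.
        senders = [v for v in nodes if v != sink and queue[v] > 0]
        for v in senders:
            busy[v] += 1
        # Each sender forwards one packet to a weight-proportional out-neighbor.
        for v in senders:
            targets, weights = zip(*out_edges[v])
            u = rng.choices(targets, weights=weights)[0]
            queue[v] -= 1
            if u != sink:  # the sink absorbs packets, so nothing is enqueued there
                queue[u] += 1
        # Sources generate new packets independently in each slot.
        for v, beta in rates.items():
            if rng.random() < beta:
                queue[v] += 1
    return {v: busy[v] / slots for v in nodes if v != sink}


# Toy graph: 0 -> 1, 1 -> 2 (sink) or back to 0 with equal weight, 2 -> 0.
graph = {0: [(1, 1.0)], 1: [(2, 1.0), (0, 1.0)], 2: [(0, 1.0)]}
eta = simulate_data_collection(graph, rates={0: 0.1}, sink=2, slots=200_000, seed=1)
print(eta)
```

For this toy graph the balance equations read eta_0 = 0.1 + 0.5 * eta_1 and eta_1 = eta_0, so both estimates should settle near 0.2 as the number of slots grows.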
Furthermore, we consider a synchronous model of communication in which, in each round, every node can in parallel send a message of bounded size to one of its neighbors and perform local computation on messages received from its neighbors in previous rounds. Under this model, we bound the number of distributed rounds our solver takes to solve one-sink Laplacian systems. However, a node can receive as many messages in a round as it has in-neighbors, so simulating each round requires, in the worst case, time proportional to the maximum in-degree of the graph. In Table 1, we present the running time of our solver for various graphs, where we add a factor of the maximum in-degree to the distributed round complexity to account for multiple receptions by a node in each round. This does not affect constant-degree graphs, while giving an extra factor for graphs whose in-degree grows with the number of vertices.
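The in-degree overhead discussed above can be made concrete with a short sketch. It assumes the graph is given as a list of directed (u, v) pairs and that handling one received message costs one sequential step; the helper names `max_in_degree` and `sequential_cost` are hypothetical and only illustrate the accounting, not the paper's analysis.

```python
from collections import Counter


def max_in_degree(edges):
    """Return the largest number of incoming edges at any node."""
    in_deg = Counter(v for _, v in edges)
    return max(in_deg.values(), default=0)


def sequential_cost(rounds, edges):
    """Worst-case sequential steps to simulate `rounds` rounds (assumed unit cost per message)."""
    return rounds * max_in_degree(edges)


cycle = [(0, 1), (1, 2), (2, 0)]         # every node has in-degree 1
star = [(1, 0), (2, 0), (3, 0), (0, 1)]  # node 0 has in-degree 3
print(sequential_cost(10, cycle), sequential_cost(10, star))
```

On the directed cycle the overhead factor is 1, so the cost equals the round count, whereas the star-like graph triples it because its hub can receive three messages in a single round.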
An equivalent stochastic process: data collection
In this section, we define a stochastic process on a graph whose steady-state behavior provides the solution to a one-sink Laplacian system. The process is formally defined as follows.
Definition 1 (Data Collection Process). On a strongly connected directed network with a weight function on its directed edges, we identify a distinguished sink node that passively collects data, and a set of data sources. Each node in has a queue in which it can store data
Our model
We assume a synchronous model of communication, which we call GP-CONGEST based on its relation to the standard CONGEST model [12] and gossip algorithms [13]. In this model, all nodes wake up simultaneously and time is divided into rounds. In each round, each node in the network can send a message of bounded size to one of its neighbors and perform local computation based on the information obtained from the messages received from its neighbors in previous rounds. However, note that by
Distributed solver algorithm
We now present the main distributed algorithm. Our main solver algorithm, DRW-LSolve, is directed by a single node, the sink, which we call the controller node. The algorithm takes four inputs, including the weight function w, ϵ and κ. The weight function w specifies the Laplacian matrix L. The error of the solution to the Laplacian system is controlled by the parameter ϵ. The fourth parameter κ is a user-defined granularity factor which expresses the fact that our algorithm is not able to accurately approximate those coordinates of
Our results
In this section, we present our main results, which extend the results we present for the undirected case in [14] to the directed case. In general, for directed graphs, natural extensions to solving linear systems in graph Laplacians involve computing the stationary distribution of a random walk. Unlike for undirected graphs, where the stationary distribution is proportional to vertex degree, this step is non-trivial for directed graphs. Moreover, for undirected graphs the Laplacian
Conclusion
Although our main result is a distributed algorithm for solving one-sink Laplacian systems on strongly connected graphs, our solver implicitly approximates the stationary distribution of the corresponding Markov chain. In itself, this is a key contribution because the computation of the stationary distribution is one of the main discrepancies between the directed and undirected settings. It is also one of the prime steps for the PageRank algorithm and is required for the normalization
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (15)
- et al., Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems
- et al., Faster algorithms for computing the stationary distribution, simulating random walks, and more
- et al., Solving directed Laplacian systems in nearly-linear time through sparse LU factorizations
- et al., Distributed algorithms for solving a linear equation under a directed graph
- et al., Electrical flows, Laplacian systems, and faster approximation of maximum flow in undirected graphs
- et al., Fast generation of random spanning trees and the effective resistance metric
- et al., Graph sparsification by effective resistances, SIAM J. Comput. (2011)