A queueing network-based distributed Laplacian solver for directed graphs

https://doi.org/10.1016/j.ipl.2020.106040

Highlights

  • First distributed Laplacian solver analyzed in terms of the underlying directed graph.

  • New approach to solving Laplacians using the theory of queueing networks and Markov chains.

  • Works for systems $x^T L = b^T$ where only one coordinate of $b$ is negative, like electrical flow.

  • Uses equivalence of Laplacian system to steady-state equation of a stochastic process.

  • Running time of $\tilde{O}(n)$ for random digraphs.

Abstract

We present a distributed algorithm for solving Laplacian systems on strongly connected directed graphs, the first that can be analyzed in terms of the underlying graph's parameters. Our distributed solver works for a large and important class of Laplacian systems that we call “one-sink” Laplacian systems, which includes the important electrical flow computation problem. Specifically, given $D_{out}$ as the diagonal out-degree matrix and $A$ as the adjacency matrix of the directed graph, with $L = D_{out} - A$, our solver can produce solutions for systems of the form $x^T L = b^T$, where exactly one of the coordinates of $b$ is negative. Our solver takes $\tilde{O}(t_{hit} d_{max}^{in})$ time (where $\tilde{O}$ hides $\operatorname{polylog} n$ factors) to produce an approximate solution. Here $t_{hit}$ is the worst-case hitting time of the random walk on the graph, which is $\Theta(n)$ for a large set of important graphs, and $d_{max}^{in}$ is the maximum in-degree of the graph.

Introduction

Solving a Laplacian system of equations forms a core routine for a number of fields, including network analysis, computer vision, operations research, and machine learning. Following Spielman and Teng's [1] pioneering work proposing a quasi-linear time algorithm for solving undirected Laplacians, a powerful collection of algorithmic tools has been developed for the undirected case. On the other hand, similar developments have remained elusive for directed graphs. In a recent line of work, Cohen et al. [2], [3] extensively studied directed Laplacians and gave a new class of nearly-linear time algorithms for such asymmetric linear systems. However, all these solvers are centralized. Recently, Tang and Mei [4] studied a consensus-based distributed algorithm for directed Laplacians, which they show converges to a solution at a geometric rate. However, they do not characterize this rate in terms of the underlying graph parameters. Thus, to the best of our knowledge, there is currently no distributed solver for directed graphs whose running time is characterized in terms of the underlying graph's parameters. In this work, we endeavor to fill that gap by describing a simple distributed algorithm to solve a large and useful class of Laplacian systems that includes the electrical flow computation, which in turn is used as a primitive for computing max flows [5], random spanning trees [6], graph sparsification [7], and current flow closeness centrality [8]. Moreover, our method is a completely new approach to solving Laplacian systems, based on the theory of queueing networks, multidimensional Markov chains, and random walks.

In particular, we first formulate a stochastic problem that captures the properties of Laplacian systems of the form $x^T L = b^T$ with the constraint that only one element of $b$ is negative, where $L = D_{out} - A$ such that $D_{out}$ is the diagonal out-degree matrix and $A$ is the adjacency matrix of the directed graph. We call such systems “one-sink” Laplacian systems since the stochastic process can be viewed as a network in which some nodes are generating data, and there is a single sink that collects all the packets that it receives. We call our stochastic process the “Data Collection Process” and show that this process is equivalent to the one-sink Laplacian system at stationarity. The main challenge in handling directed graphs is the computation of the stationary distribution of the corresponding Markov chain. Our proposed solver works by quickly approximating the stationary distribution of a multidimensional Markov chain induced by the queueing network of the Data Collection Process. We show that when this multidimensional chain is ergodic, the vector whose $v$th coordinate is proportional to the probability at stationarity of the queue at $v$ being non-empty is a solution to $x^T L = b^T$. We estimate this solution in a natural way by statistically determining the proportion of time slots for which the queues are empty. A fast-mixing property of the multidimensional chain allows us to achieve the running time result.
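The link between queue occupancy and the linear system can be illustrated with a small centralized simulation. The following is only a sketch under stated assumptions, not the paper's DRW-LSolve algorithm: the slot rules (each non-empty queue forwards one packet per slot to an out-neighbor chosen proportionally to edge weight, and each source v creates a packet with probability beta * b[v]), the rate parameter `beta`, the function names, and the toy graph are all hypothetical choices made for illustration. Under these assumptions, balancing arrival and departure rates at each queue suggests that the occupancy probability divided by beta times the out-degree gives the solution with value zero at the sink. Note that the rows of L sum to zero, so b must sum to zero as well. For this toy graph that solution is (0.8, 0.4, 0.6, 0), which can be checked by hand.

```python
import random


def simulate_occupancy(out_edges, sink, b, beta, slots, seed=0):
    """Return, per node, the fraction of slots in which its queue was non-empty.

    Assumed process (an illustration, not the paper's definition): in every
    slot each non-sink node whose queue is non-empty forwards one packet to an
    out-neighbor picked with probability proportional to edge weight, and then
    each source v creates a packet with probability beta * b[v]. The sink
    absorbs every packet it receives.
    """
    rng = random.Random(seed)
    n = len(out_edges)
    queue = [0] * n
    busy = [0] * n
    for _ in range(slots):
        # Occupancy is sampled at the start of the slot, before any movement.
        senders = [v for v in range(n) if v != sink and queue[v] > 0]
        for v in senders:
            busy[v] += 1
            targets, weights = out_edges[v]
            u = rng.choices(targets, weights=weights)[0]
            queue[v] -= 1
            if u != sink:
                queue[u] += 1
        # Packet generation at the sources (entries of b that are positive).
        for v in range(n):
            if v != sink and rng.random() < beta * b[v]:
                queue[v] += 1
    return [count / slots for count in busy]


def estimate_solution(out_edges, sink, b, beta, slots, seed=0):
    """Rescale occupancy estimates into a candidate x with x^T L = b^T."""
    eta = simulate_occupancy(out_edges, sink, b, beta, slots, seed)
    d_out = [sum(weights) for _, weights in out_edges]
    return [eta[v] / (beta * d_out[v]) for v in range(len(out_edges))]


def residual(out_edges, x, b):
    """Compute x^T L - b^T for L = D_out - A, one entry per column."""
    r = [x[v] * sum(out_edges[v][1]) - b[v] for v in range(len(x))]
    for u, (targets, weights) in enumerate(out_edges):
        for v, w in zip(targets, weights):
            r[v] -= x[u] * w
    return r


# Hypothetical strongly connected digraph on 4 nodes; node 3 is the sink.
# Entry v holds (out-neighbors of v, corresponding edge weights).
GRAPH = [([1, 2], [1.0, 1.0]), ([2, 3], [1.0, 1.0]), ([3, 0], [1.0, 1.0]), ([0], [1.0])]
B = [1.0, 0.0, 0.0, -1.0]  # exactly one negative coordinate, entries sum to zero

x_hat = estimate_solution(GRAPH, sink=3, b=B, beta=0.3, slots=100_000)
print("estimate:", [round(v, 3) for v in x_hat])
print("residual:", [round(v, 3) for v in residual(GRAPH, x_hat, B)])
```

The value of `beta` has to be small enough that every queue is empty a positive fraction of the time, otherwise the queues grow without bound and the occupancy estimate carries no information. Longer runs reduce the statistical error of the estimate.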

Furthermore, we consider a synchronous model of communication in which, in each round, every node can in parallel send a message of size $O(\log n)$ to one of its neighbors and perform local computation on messages received from its neighbors in previous rounds. Under this model, our solver takes $\tilde{O}(t_{hit})$ distributed rounds to solve one-sink Laplacian systems. However, since under this model each node can receive up to $d_{max}^{in}$ messages from its neighbors in a round, where $d_{max}^{in}$ is the maximum in-degree of the graph, simulating each round would require $d_{max}^{in}$ time in the worst case. In Table 1, we present the running time of our solver for various graphs, where we multiply the distributed round complexity by a factor of $d_{max}^{in}$ to account for multiple receptions by a node in each round. This does not affect constant-degree graphs, while adding an extra $O(n)$ factor for graphs whose in-degree grows with the number of vertices.
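The accounting in this paragraph can be written out as a short worked bound. The symbol $T_{total}$ is introduced here only for illustration; $\tilde{O}$ hides polylogarithmic factors in $n$, $t_{hit}$ is the worst-case hitting time, and $d_{max}^{in}$ is the maximum in-degree, as in the abstract.

```latex
\[
T_{total}
  \;=\; \underbrace{\tilde{O}(t_{hit})}_{\text{distributed rounds}}
  \times \underbrace{d_{max}^{in}}_{\text{worst-case time per round}}
  \;=\; \tilde{O}\!\left(t_{hit}\, d_{max}^{in}\right),
\qquad
\begin{cases}
  d_{max}^{in} = O(1) & \Rightarrow\; T_{total} = \tilde{O}(t_{hit}),\\[2pt]
  d_{max}^{in} \le n - 1 & \Rightarrow\; T_{total} = \tilde{O}(n\, t_{hit}) \text{ at worst.}
\end{cases}
\]
```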

Section snippets

An equivalent stochastic process: data collection

In this section, we define a stochastic process on a graph that has the property that at stationarity its steady-state behavior provides the solution to a one-sink Laplacian system. The process is formally defined as follows.

Definition 1 Data Collection Process

On a strongly connected directed network $G = (V, E, w)$ with a weight function $w: E \to \mathbb{R}^+$ on the directed edges, we identify a distinguished sink node $u_s$ that passively collects data, and a set of data sources $V_s \subseteq V \setminus \{u_s\}$. Each node in $V \setminus \{u_s\}$ has a queue in which it can store data

Our model

We assume a synchronous model of communication, which we call GP-CONGEST based on its relation to the standard CONGEST model [12] and gossip algorithms [13]. In this model, all nodes wake up simultaneously and time is divided into rounds. In each round, each node in the network can send a message of size $O(\log n)$ to one of its neighbors and perform local computation based on the information obtained from the messages received from its neighbors in previous rounds. However, note that by

Distributed solver algorithm

We now present the main distributed algorithm. Our main solver algorithm, DRW-LSolve, is directed by a single node, the sink $u_s$, which we call the controller node. The algorithm takes as input $w$, $b$, $\epsilon$ and $\kappa$. The weight function $w$ specifies the Laplacian matrix $L$. The error of the solution to $x^T L = b^T$ is controlled by the parameter $\epsilon$. The fourth parameter $\kappa$ is a user-defined granularity factor which expresses the fact that our algorithm is not able to accurately approximate those coordinates of

Our results

In this section, we present our main results, which extend the results we present for the undirected case in [14] to the directed case. In general, for directed graphs, natural extensions to solving linear systems in graph Laplacians involve computation of the stationary distribution of a random walk. Unlike undirected graphs for which the stationary distribution is proportional to the vertex's degree, this step is non-trivial for directed graphs. Moreover, for undirected graphs the Laplacian

Conclusion

Although our main result is a distributed algorithm for solving one-sink Laplacian systems on strongly connected graphs, our solver implicitly approximates the stationary distribution of the corresponding Markov chain. In itself, this is a key contribution because the computation of the stationary distribution is one of the main discrepancies between the directed and undirected settings. It is also one of the prime steps for the PageRank algorithm and is required for the normalization

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (15)

  • D.A. Spielman et al.

    Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems

  • M.B. Cohen et al.

    Faster algorithms for computing the stationary distribution, simulating random walks, and more

  • M.B. Cohen et al.

    Solving directed Laplacian systems in nearly-linear time through sparse LU factorizations

  • Y. Tang et al.

    Distributed algorithms for solving a linear equation under a directed graph

  • P. Christiano et al.

    Electrical flows, Laplacian systems, and faster approximation of maximum flow in undirected graphs

  • A. Mądry et al.

    Fast generation of random spanning trees and the effective resistance metric

  • D.A. Spielman et al.

    Graph sparsification by effective resistances

    SIAM J. Comput.

    (2011)
There are more references available in the full text version of this article.