Brief announcement: better speedups for parallel max-flow

Authors:

George Constantin Caragea,

Uzi Vishkin

SPAA '11: Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures

Pages 131 - 134

https://doi.org/10.1145/1989493.1989511

Published: 04 June 2011

Abstract

We present a parallel solution to the Maximum-Flow (Max-Flow) problem, suitable for a modern many-core architecture. We show that by starting from a PRAM algorithm, following an established "programmer's workflow", and targeting XMT, a PRAM-inspired many-core architecture, we achieve significantly higher speedups than previous approaches. Comparison with the fastest known serial max-flow implementation on a modern CPU demonstrates, for the first time, the potential for orders-of-magnitude performance improvement for Max-Flow. The PRAM Max-Flow algorithm is also much easier to program on XMT than on other parallel platforms, contributing a powerful example toward dual validation of both PRAM algorithmics and XMT.
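
For readers unfamiliar with the Max-Flow problem itself, the following is a minimal serial sketch of a generic FIFO push-relabel routine. It is an illustration only, under stated assumptions: it is not the authors' PRAM algorithm or their XMT code, and the function name `max_flow`, the dense capacity-matrix input, and the small example graph are hypothetical choices made here.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Generic serial FIFO push-relabel (illustrative sketch only).

    capacity: n x n matrix of non-negative integers, no self-loops (assumed).
    Returns the value of a maximum flow from source to sink.
    """
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]  # antisymmetric: flow[v][u] == -flow[u][v]
    height = [0] * n
    excess = [0] * n
    height[source] = n
    active = deque()

    # Initial preflow: saturate every edge leaving the source.
    for v in range(n):
        c = capacity[source][v]
        if v != source and c > 0:
            flow[source][v] = c
            flow[v][source] = -c
            excess[v] += c
            excess[source] -= c
            if v != sink:
                active.append(v)

    while active:
        u = active.popleft()
        # Discharge u: push along admissible edges, relabel when none exist.
        while excess[u] > 0:
            pushed = False
            for v in range(n):
                residual = capacity[u][v] - flow[u][v]
                if residual > 0 and height[u] == height[v] + 1:
                    delta = min(excess[u], residual)
                    flow[u][v] += delta
                    flow[v][u] -= delta
                    excess[u] -= delta
                    newly_active = excess[v] == 0 and v not in (source, sink)
                    excess[v] += delta
                    if newly_active:
                        active.append(v)
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:
                # Relabel: lift u just above its lowest residual neighbour.
                height[u] = 1 + min(
                    height[v]
                    for v in range(n)
                    if capacity[u][v] - flow[u][v] > 0
                )
    return excess[sink]


# Hypothetical 6-vertex example: vertex 0 is the source, vertex 5 the sink.
example = [
    [0, 16, 13, 0, 0, 0],
    [0, 0, 0, 12, 0, 0],
    [0, 4, 0, 0, 14, 0],
    [0, 0, 9, 0, 0, 20],
    [0, 0, 0, 7, 0, 4],
    [0, 0, 0, 0, 0, 0],
]
print(max_flow(example, 0, 5))
```

The push and relabel steps act only on a vertex and its neighbours, which is one reason this family of algorithms is a common starting point for parallel implementations; the sketch above makes no attempt at parallelism.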

Index Terms

Computer systems organization → Architectures → Parallel architectures


Information

Published In

SPAA '11: Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures

June 2011

404 pages

ISBN:9781450307437

DOI:10.1145/1989493

Co-chairs: Friedhelm Meyer auf der Heide (University of Paderborn, Germany), Rajmohan Rajaraman (Northeastern University, USA)

Copyright © 2011 Authors.

Sponsors

In-Cooperation

EATCS: European Association for Theoretical Computer Science

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

SPAA '11

SPAA '11: 23rd ACM Symposium on Parallelism in Algorithms and Architectures

June 4 - 6, 2011

San Jose, California, USA

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%
