Performing tasks on synchronous restartable message-passing processors

Chlebus, Bogdan S.; De Prisco, Roberto; Shvartsman, Alex A.

doi:10.1007/PL00008926

Original articles
Published: January 2001

Volume 14, pages 49–64, (2001)

Distributed Computing

Bogdan S. Chlebus¹,
Roberto De Prisco² &
Alex A. Shvartsman²

Summary.

This work considers the problem of performing $t$ tasks in a distributed system of $p$ fault-prone processors. This problem, called do-all herein, was introduced by Dwork, Halpern and Waarts. The solutions presented here are for the model of computation that abstracts a synchronous message-passing distributed system with processor stop-failures and restarts. We present two new algorithms based on a new aggressive coordination paradigm by which multiple coordinators may be active as the result of failures. The first algorithm is tolerant of $f < p$ stop-failures and does not allow restarts. Its available processor steps (work) complexity is $S = \mathcal{O}((t + p\log p/\log\log p) \cdot \log f)$ and its message complexity is $M = \mathcal{O}(t + p\log p/\log\log p + fp)$. Unlike prior solutions, our algorithm uses redundant broadcasts when encountering failures and, for $p = t$ and large $f$, it achieves better work complexity. This algorithm is used as the basis for another algorithm that tolerates stop-failures and restarts. This new algorithm is the first solution for the do-all problem that efficiently deals with processor restarts. Its available processor steps (work) complexity is $S = \mathcal{O}((t + p\log p + f) \cdot \min\{\log p, \log f\})$, and its message complexity is $M = \mathcal{O}(t + p\log p + fp)$, where $f$ is the total number of failures.
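To get a feel for how the four bounds above scale, the following minimal Python sketch evaluates the expressions inside the $\mathcal{O}(\cdot)$ terms. It is only an illustration, not part of the article: the function names are hypothetical, constant factors are dropped, logarithms are taken base 2, and the inputs are assumed to satisfy $p > 2$ and $f \ge 2$ so that every logarithm is positive.

```python
import math


def _check(p: int, f: int) -> None:
    # Assumption of this sketch: keep log log p and log f strictly positive.
    if p <= 2 or f < 2:
        raise ValueError("this sketch assumes p > 2 and f >= 2")


def work_no_restarts(t: int, p: int, f: int) -> float:
    """(t + p log p / log log p) * log f, constants dropped; requires f < p."""
    _check(p, f)
    if f >= p:
        raise ValueError("the first algorithm tolerates only f < p failures")
    return (t + p * math.log2(p) / math.log2(math.log2(p))) * math.log2(f)


def messages_no_restarts(t: int, p: int, f: int) -> float:
    """t + p log p / log log p + f p, constants dropped; requires f < p."""
    _check(p, f)
    if f >= p:
        raise ValueError("the first algorithm tolerates only f < p failures")
    return t + p * math.log2(p) / math.log2(math.log2(p)) + f * p


def work_with_restarts(t: int, p: int, f: int) -> float:
    """(t + p log p + f) * min(log p, log f), constants dropped."""
    _check(p, f)
    return (t + p * math.log2(p) + f) * min(math.log2(p), math.log2(f))


def messages_with_restarts(t: int, p: int, f: int) -> float:
    """t + p log p + f p, constants dropped."""
    _check(p, f)
    return t + p * math.log2(p) + f * p


if __name__ == "__main__":
    t, p, f = 16, 16, 4
    print(work_no_restarts(t, p, f), messages_no_restarts(t, p, f))
    print(work_with_restarts(t, p, f), messages_with_restarts(t, p, f))
```

Because the constants hidden by the asymptotic notation are ignored, the returned numbers are useful only for comparing growth trends across different values of $t$, $p$ and $f$, not as predictions of actual step or message counts.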

Author information

Authors and Affiliations

Instytut Informatyki, Uniwersytet Warszawski, ul. Banacha 2, 02-097 Warszawa, Poland (e-mail: chlebus@mimuw.edu.pl)
Bogdan S. Chlebus
Laboratory for Computer Science, Massachusetts Institute of Technology, 545 Technology Square, NE43-316 Cambridge, MA 02139, USA (e-mail: robdep@theory.lcs.mit.edu, alex@theory.lcs.mit.edu)
Roberto De Prisco & Alex A. Shvartsman

Additional information

Received: October 1998 / Accepted: September 2000

About this article

Cite this article

Chlebus, B., De Prisco, R. & Shvartsman, A. Performing tasks on synchronous restartable message-passing processors. Distrib Comput 14, 49–64 (2001). https://doi.org/10.1007/PL00008926

Issue Date: January 2001
DOI: https://doi.org/10.1007/PL00008926

Key words: Fault-tolerance – Distributed systems – Load balancing – Processor restarts – Work
