Abstract
We motivate and propose a new way of thinking about failure detectors that allows us to define what it means to solve a distributed task wait-free using a failure detector. In our model, the system is composed of computation processes, which obtain inputs and are supposed to produce outputs, and synchronization processes, which are subject to failures and can query a failure detector. Provided that the correct (never failing) synchronization processes take sufficiently many steps, they give the computation processes enough advice to solve the given task wait-free: every computation process outputs in a finite number of its own steps, regardless of the behavior of other computation processes. Every task can thus be characterized by the weakest failure detector that allows for solving it, and we show that every such failure detector captures a form of set agreement. We then obtain a complete classification of tasks, including ones that have so far evaded a comprehensible characterization, such as renaming and weak symmetry breaking.
Notes
Informally, \(\mathcal {D}\) is the weakest failure detector to solve a task \(T\) if it (1) solves \(T\) and (2) can be deduced from any failure detector that solves \(T\).
Note that all tasks can be solved \(1\)-concurrently.
For some values of \(j\) and \(k\), however, the question of the maximal tolerated concurrency of \((j,j+k-1)\)-renaming is still open [11].
In other words, point contention [6] in the run with respect to \(C\)-processes does not exceed \(k\).
A trivial failure detector always outputs \(\bot \).
Recall that, informally, in a solution of a colorless task, a process is free to adopt the input or the output value of any other participating process.
The procedure is similar to the corridor-based depth-first search simulation of [26].
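The informal definition of a weakest failure detector given in the first note can be stated slightly more formally. The following is only a sketch in the usual style of weakest-failure-detector results, not necessarily the exact formulation used in the article. The relation \(\preceq\) and the predicate \(\mathrm{solves}\) are notation assumed here for illustration: \(\mathcal {D} \preceq \mathcal {D}'\) is read as "\(\mathcal {D}\) can be deduced from \(\mathcal {D}'\)", meaning that some reduction algorithm emulates the output of \(\mathcal {D}\) using \(\mathcal {D}'\).

\[
\mathcal {D} \text{ is the weakest failure detector to solve } T
\iff
\mathrm{solves}(\mathcal {D}, T) \;\wedge\; \forall \mathcal {D}' :\ \bigl(\mathrm{solves}(\mathcal {D}', T) \implies \mathcal {D} \preceq \mathcal {D}'\bigr)
\]

The first conjunct corresponds to condition (1) of the note and the second to condition (2).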
References
Afek, Y., Attiya, H., Dolev, D., Gafni, E., Merritt, M., Shavit, N.: Atomic snapshots of shared memory. J. ACM 40(4), 873–890 (1993)
Afek, Y., Kuznetsov, P., Nir, I.: Renaming and the weakest family of failure detectors. Distrib. Comput. 25(6), 411–425 (2012)
Aguilera, M.K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Partial synchrony based on set timeliness. Distrib. Comput. 25(3), 249–260 (2012)
Attiya, H., Bar-Noy, A., Dolev, D.: Sharing memory robustly in message passing systems. J. ACM 42(1), 124–142 (1995)
Attiya, H., Bar-Noy, A., Dolev, D., Peleg, D., Reischuk, R.: Renaming in an asynchronous environment. J. ACM 37(3), 524–548 (1990)
Attiya, H., Fouren, A.: Algorithms adapting to point contention. J. ACM 50(4), 444–468 (2003)
Attiya, H., Welch, J.: Distributed Computing. Fundamentals, Simulations, and Advanced Topics. Wiley, Hoboken (2004)
Borowsky, E., Gafni, E.: Generalized FLP impossibility result for \(t\)-resilient asynchronous computations. In: STOC, pp. 91–100. ACM Press (1993)
Borowsky, E., Gafni, E.: Immediate atomic snapshots and fast renaming. In: PODC, pp. 41–51. ACM Press (1993)
Borowsky, E., Gafni, E., Lynch, N.A., Rajsbaum, S.: The BG distributed simulation algorithm. Distrib. Comput. 14(3), 127–146 (2001)
Castañeda, A., Rajsbaum, S.: New combinatorial topology bounds for renaming: the lower bound. Distrib. Comput. 22(5–6), 287–301 (2010)
Chandra, T.D., Hadzilacos, V., Toueg, S.: The weakest failure detector for solving consensus. J. ACM 43(4), 685–722 (1996)
Chandra, T.D., Toueg, S.: Unreliable failure detectors for reliable distributed systems. J. ACM 43(2), 225–267 (1996)
Chaudhuri, S.: More choices allow more faults: set consensus problems in totally asynchronous systems. Inf. Comput. 105(1), 132–158 (1993)
Delporte-Gallet, C., Fauconnier, H., Gafni, E., Kuznetsov, P.: Wait-freedom with advice. In: PODC, pp. 105–114 (2012)
Delporte-Gallet, C., Fauconnier, H., Guerraoui, R.: Tight failure detection bounds on atomic object implementations. J. ACM 57(4), 22:1–22:32 (2010)
Delporte-Gallet, C., Fauconnier, H., Guerraoui, R., Hadzilacos, V., Kouznetsov, P., Toueg, S.: The weakest failure detectors to solve certain fundamental problems in distributed computing. In: PODC, pp. 338–346. ACM Press (2004)
Delporte-Gallet, C., Fauconnier, H., Guerraoui, R., Kouznetsov, P.: Mutual exclusion in asynchronous systems with failure detectors. J. Parallel Distrib. Comput. 65(4), 492–505 (2005)
Delporte-Gallet, C., Fauconnier, H., Guerraoui, R., Tielmann, A.: The disagreement power of an adversary. Distrib. Comput. 24(3–4), 137–147 (2011)
Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32(2), 374–382 (1985)
Gafni, E.: Round-by-round fault detectors: unifying synchrony and asynchrony (extended abstract). In: PODC, pp. 143–152 (1998)
Gafni, E.: The extended BG-simulation and the characterization of t-resiliency. In: STOC, pp. 85–92. ACM Press (2009)
Gafni, E., Guerraoui, R.: Simulating few by many: limited concurrency = set consensus. Technical Report. http://www.cs.ucla.edu/eli/eli/kconc.pdf (2009)
Gafni, E., Guerraoui, R.: Generalized universality. In: Proceedings of the 22nd international conference on concurrency theory, CONCUR’11, pp. 17–27. Springer, Berlin (2011)
Gafni, E., Kuznetsov, P.: Turning adversaries into friends: simplified, made constructive, and extended. In: OPODIS, pp. 380–394 (2010)
Gafni, E., Kuznetsov, P.: On set consensus numbers. Distrib. Comput. 24(3–4), 149–163 (2011)
Gafni, E., Kuznetsov, P.: Relating L-resilience and wait-freedom via hitting sets. In: ICDCN, pp. 191–202. Full version: http://arxiv.org/abs/1004.4701 (2011)
Gafni, E., Rajsbaum, S.: Distributed programming with tasks. In: OPODIS, pp. 205–218 (2010)
Gray, J.: Notes on data base operating systems. In: Bayer, R., Graham, R.M., Seegmueller, G. (eds.) Operating Systems: An Advanced Course. Lecture Notes in Computer Science 60, pp. 393–481. Springer, Berlin (1978)
Guerraoui, R., Kuznetsov, P.: Failure detectors as type boosters. Distrib. Comput. 20(5), 343–358 (2008)
Herlihy, M.: Wait-free synchronization. ACM Trans. Progr. Lang. Syst. 13(1), 123–149 (1991)
Herlihy, M., Shavit, N.: The topological structure of asynchronous computability. J. ACM 46(6), 858–923 (1999)
Jayanti, P., Toueg, S.: Every problem has a weakest failure detector. In: PODC, pp. 75–84 (2008)
Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21(7), 558–565 (1978)
Lamport, L.: The part-time parliament. ACM Trans. Comput. Syst. 16(2), 133–169 (1998)
Lo, W.-K., Hadzilacos, V.: Using failure detectors to solve consensus in asynchronous shared memory systems. WDAG, LNCS 857, 280–295 (1994)
Loui, M., Abu-Amara, H.: Memory requirements for agreement among unreliable asynchronous processes. Adv. Comput. Res. 4, 163–183 (1987)
Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann, Burlington (1996)
Raynal, M.: \(K\)-anti-Omega, August 2007. Rump session at PODC (2007)
Saks, M., Zaharoglou, F.: Wait-free k-set agreement is impossible: the topology of public knowledge. SIAM J. Comput. 29, 1449–1483 (2000)
Zieliński, P.: Anti-\(\Omega\): the weakest failure detector for set agreement. Distrib. Comput. 22(5–6), 335–348 (2010)
Acknowledgments
The work of Carole Delporte-Gallet and Hugues Fauconnier is supported by the ANR SIMI2 DISPLEXITY.
Cite this article
Delporte-Gallet, C., Fauconnier, H., Gafni, E. et al. Wait-freedom with advice. Distrib. Comput. 28, 3–19 (2015). https://doi.org/10.1007/s00446-014-0231-6
DOI: https://doi.org/10.1007/s00446-014-0231-6