Distributed function evaluation in the presence of transmission faults

  • Conference paper
Algorithms (SIGAL 1990)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 450)

Abstract

We consider the problems of computing functions and of reaching an agreement in a distributed synchronous network of processors in the presence of dynamic transmission faults. We characterize the maximum number of transmission faults per clock cycle that can be tolerated for the computation of arbitrary or specific functions, with several types of faults. The n processors communicate by sending messages through dedicated communication links. Each processor has a one-way link to each other processor. In each clock cycle, each processor may send one message. The message is received in the same clock cycle by all other processors apart from those to which it travels on faulty communication links. Each link may be faulty at some points in time, and operate correctly at others. In a transmission, a faulty link can either omit a message (a message is sent, but none arrives), corrupt a message (a message arrives that is different from the message that was sent), or add a message (a message arrives, but none was sent). Messages are words over a finite alphabet, varying from single bits to strings of arbitrary length.
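The communication model above can be made concrete with a small simulation of one clock cycle. This is only an illustrative sketch under stated assumptions: the function name `run_cycle`, the dictionary-based representation of links, and the fault labels `"omit"`, `"corrupt"` and `"add"` are hypothetical choices made here, not notation from the paper.

```python
from typing import Dict, Optional, Tuple

Link = Tuple[int, int]               # (sender, receiver): a one-way link
Fault = Tuple[str, Optional[str]]    # (kind, payload); names are illustrative


def run_cycle(n: int,
              sent: Dict[int, Optional[str]],
              faults: Dict[Link, Fault]) -> Dict[int, Dict[int, Optional[str]]]:
    """Deliver one synchronous clock cycle in a complete network.

    sent[i] is the message processor i transmits (None means i stays silent).
    faults marks the links that misbehave in this cycle only.
    Returns received[j][i]: what j gets from i (None means nothing arrived).
    """
    received: Dict[int, Dict[int, Optional[str]]] = {j: {} for j in range(n)}
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            msg = sent.get(i)
            kind, payload = faults.get((i, j), ("none", None))
            if kind == "omit" and msg is not None:
                msg = None                       # sent, but nothing arrives
            elif kind == "corrupt" and msg is not None:
                if payload is None or payload == msg:
                    raise ValueError("a corruption must deliver a different message")
                msg = payload                    # a different message arrives
            elif kind == "add" and msg is None:
                msg = payload                    # arrives although none was sent
            received[j][i] = msg
    return received


inbox = run_cycle(
    3,
    sent={0: "a", 1: "b", 2: None},
    faults={(0, 1): ("omit", None), (1, 2): ("corrupt", "x"), (2, 0): ("add", "z")},
)
```

Because the fault map is passed afresh for each cycle, a link that is faulty in one call can behave correctly in the next, which mirrors the dynamic nature of the faults described above.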

We propose a number of techniques for distributed function evaluation in the presence of transmission faults, based on broadcasting either enough of the function's arguments or the result value. For different types of allowable faults (omissions, corruptions, additions), we derive upper bounds on the number of tolerable faults. In most cases, these bounds are tight: a single additional fault already makes strong majority (the weakest meaningful form of agreement) unachievable. We show that if, out of the n(n − 1) messages received by the n processors per clock cycle,

  • at most n−2 are omissions, an arbitrary function can be computed in a constant number of cycles (in contrast, with at least n−1 omissions strong majority is impossible);

  • at most n−2 are omissions or corruptions, an arbitrary function can be computed;

  • at most n(n−1), i.e. all, messages are corruptions, an arbitrary function can be computed;

  • at most [n/2]−1 are arbitrary faults, or are corruptions and processors always transmit, strong majority can be reached in a constant number of cycles (in contrast, with at least [n/2] corruptions, where processors always transmit, strong majority is impossible);

  • at most [n/4]−1 are arbitrary faults, or are corruptions and processors always transmit, unanimity can be reached in a constant number of cycles.
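One simple counting observation is consistent with the gap between n−2 and n−1 omissions in the first item: each processor has n−1 incoming links, so n−2 omissions cannot silence all of them, whereas n−1 omissions can. The sketch below checks this exhaustively for small n. It is not the paper's proof technique, the function name `worst_case_inbox` is hypothetical, and it assumes every processor transmits in the cycle.

```python
from itertools import combinations


def worst_case_inbox(n: int, k: int) -> int:
    """Fewest messages any single processor receives in one cycle,
    minimized over every placement of k omissions on the n(n-1) links.
    Assumes all n processors transmit in that cycle."""
    links = [(s, r) for s in range(n) for r in range(n) if s != r]
    worst = n - 1
    for omitted in combinations(links, k):
        lost = [0] * n
        for _, receiver in omitted:
            lost[receiver] += 1          # one incoming message dropped
        worst = min(worst, n - 1 - max(lost))
    return worst


n = 4
with_n_minus_2 = worst_case_inbox(n, n - 2)   # every processor still hears someone
with_n_minus_1 = worst_case_inbox(n, n - 1)   # some processor can be cut off entirely
```

The search is exponential in k and is meant only for tiny networks; for general n the same values follow directly from the fact that a processor has exactly n−1 incoming links.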

For specific functions, we show how the number of cycles needed for the computation can be reduced significantly, as compared to the evaluation of an arbitrary function. Altogether, we draw quite an extensive map of possible and impossible computations in the presence of transmission faults.

Editor information

Tetsuo Asano, Toshihide Ibaraki, Hiroshi Imai, Takao Nishizeki

Copyright information

© 1990 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Santoro, N., Widmayer, P. (1990). Distributed function evaluation in the presence of transmission faults. In: Asano, T., Ibaraki, T., Imai, H., Nishizeki, T. (eds) Algorithms. SIGAL 1990. Lecture Notes in Computer Science, vol 450. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-52921-7_85

  • DOI: https://doi.org/10.1007/3-540-52921-7_85

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-52921-7

  • Online ISBN: 978-3-540-47177-6

  • eBook Packages: Springer Book Archive
