Evolutionary Design of Fault Tolerant Collective Communications

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5216)

Abstract

Collective communications (CC) in interconnection networks that may contain faulty links have been scheduled using evolutionary techniques. Inter-node communication patterns scheduled in the minimum number of time slots have been obtained. The results show that evolutionary techniques often produce optimal CC schedules, that is, schedules that reach the theoretical bounds on the number of steps. Analysis of fault tolerance by the same techniques revealed graceful degradation of CC performance for a single link or node fault. Once the faulty region is located, CC can be re-scheduled during a recovery period.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jaros, J. (2008). Evolutionary Design of Fault Tolerant Collective Communications. In: Hornby, G.S., Sekanina, L., Haddow, P.C. (eds) Evolvable Systems: From Biology to Hardware. ICES 2008. Lecture Notes in Computer Science, vol 5216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85857-7_23

  • DOI: https://doi.org/10.1007/978-3-540-85857-7_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85856-0

  • Online ISBN: 978-3-540-85857-7

  • eBook Packages: Computer Science, Computer Science (R0)
