
Performance results for a reliable low-latency cluster communication protocol

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1586)

Abstract

Existing low-latency protocols make unrealistically strong assumptions about reliability. This allows them to achieve impressive performance, but also prevents this performance being exploited by applications, which must then deal with reliability issues in the application code. We present results from a new protocol that provides error recovery, and whose performance is close to that of existing low-latency protocols. We achieve a CPU overhead of 1.5 μs for packet download and 3.6 μs for upload. Our results show that (a) executing a protocol in the kernel is not incompatible with high performance, and (b) complete control over the protocol stack enables (1) simple forms of flow control to be adopted, (2) proper bracketing of the unreliable portions of the interconnect thus minimising buffers held up for possible recovery, and (3) the sharing of buffer pools. The result is a protocol which performs well in the context of parallel computation and the loose coupling of processes in the workstations of a cluster.
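The abstract mentions simple flow control and buffers that are held until recovery is no longer possible. The sketch below illustrates that general idea only; it is not the protocol described in the paper. The class name `RetransmitWindow`, its methods, and the use of cumulative acknowledgements are all assumptions made for illustration.

```python
from collections import OrderedDict


class RetransmitWindow:
    """Hypothetical sender-side window that holds packets until acknowledged."""

    def __init__(self, window_size):
        self.window_size = window_size  # max packets in flight (simple flow control)
        self.next_seq = 0               # sequence number assigned to the next send
        self.unacked = OrderedDict()    # seq -> payload, kept for possible recovery

    def can_send(self):
        # The sender may transmit only while the window has a free slot.
        return len(self.unacked) < self.window_size

    def send(self, payload):
        if not self.can_send():
            raise BufferError("window full; wait for an acknowledgement")
        seq = self.next_seq
        self.unacked[seq] = payload     # buffer stays held until it is acknowledged
        self.next_seq += 1
        return seq

    def ack(self, seq):
        """Cumulative ack (an assumption): release every buffer with number <= seq."""
        released = [s for s in self.unacked if s <= seq]
        for s in released:
            del self.unacked[s]
        return len(released)

    def pending(self):
        """Packets that would be retransmitted after a timeout, oldest first."""
        return list(self.unacked.items())
```

In this sketch, the number of buffers tied up for recovery is bounded by `window_size`, so a smaller window trades throughput for fewer held buffers.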




Editor information

José Rolim, Frank Mueller, Albert Y. Zomaya, Fikret Ercal, Stephan Olariu, Binoy Ravindran, Jan Gustafsson, Hiroaki Takada, Ron Olsson, Laxmikant V. Kale, Pete Beckman, Matthew Haines, Hossam ElGindy, Denis Caromel, Serge Chaumette, Geoffrey Fox, Yi Pan, Keqin Li, Tao Yang, G. Chiola, G. Conte, L. V. Mancini, Domenique Méry, Beverly Sanders, Devesh Bhatt, Viktor Prasanna


Copyright information

© 1999 Springer-Verlag

About this paper

Cite this paper

Donaldson, S.R., Hill, J.M.D., Skillicorn, D.B. (1999). Performance results for a reliable low-latency cluster communication protocol. In: Rolim, J., et al. Parallel and Distributed Processing. IPPS 1999. Lecture Notes in Computer Science, vol 1586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0097996


  • DOI: https://doi.org/10.1007/BFb0097996


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65831-3

  • Online ISBN: 978-3-540-48932-0

  • eBook Packages: Springer Book Archive
