Abstract
Existing low-latency protocols make unrealistically strong assumptions about reliability. This allows them to achieve impressive performance, but it also prevents applications from exploiting that performance, because they must then deal with reliability issues in application code. We present results from a new protocol that provides error recovery and whose performance is close to that of existing low-latency protocols. We achieve a CPU overhead of 1.5 μs for packet download and 3.6 μs for upload. Our results show that (a) executing a protocol in the kernel is not incompatible with high performance, and (b) complete control over the protocol stack enables (1) simple forms of flow control to be adopted, (2) proper bracketing of the unreliable portions of the interconnect, thus minimising buffers held up for possible recovery, and (3) the sharing of buffer pools. The result is a protocol that performs well in the context of parallel computation and the loose coupling of processes in the workstations of a cluster.
Copyright information
© 1999 Springer-Verlag
About this paper
Cite this paper
Donaldson, S.R., Hill, J.M.D., Skillicorn, D.B. (1999). Performance results for a reliable low-latency cluster communication protocol. In: Rolim, J., et al. Parallel and Distributed Processing. IPPS 1999. Lecture Notes in Computer Science, vol 1586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0097996
DOI: https://doi.org/10.1007/BFb0097996
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65831-3
Online ISBN: 978-3-540-48932-0
eBook Packages: Springer Book Archive