Efficient communication mechanisms for cluster based parallel computing

Davis, Al; Swanson, Mark; Parker, Mike

doi:10.1007/3-540-62573-9_1

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1199)

Included in the following conference series:

International Workshop on Communication, Architecture, and Applications for Network-Based Parallel Computing

Abstract

The key to crafting an effective scalable parallel computing system lies in minimizing the delays imposed by the system. Of particular importance are communication delays, since parallel algorithms must communicate frequently. The communication delay is a system-imposed latency. The existence of relatively inexpensive high performance workstations and emerging high performance interconnect options provides compelling economic motivation to investigate NOW/COW (network/cluster of workstations) architectures. However, these commercial components have been designed for generality. Cluster nodes are connected by longer physical wire paths than found in special-purpose supercomputer systems. Both effects tend to impose intractable latencies on communication. Even larger system-imposed delays result from the overhead of sending and receiving messages. This overhead can come in several forms, including CPU occupancy by protocol and device code as well as interference with CPU access to various levels of the memory hierarchy. Access contention becomes even more onerous when the nodes in the system are themselves symmetric multiprocessors. Additional delays are incurred if the communication mechanism requires processes to run concurrently in order to communicate with acceptable efficiency. This paper presents the approach taken by the Utah Avalanche project, which spans user level code, operating system support, and network interface hardware. The result minimizes the constraining effects of latency, overhead, and loosely coupled scheduling that are common characteristics of NOW-based architectures.

This work was supported by SPAWAR contract #N00039-94-C-0018 and ARPA order #B990.


Author information

Authors and Affiliations

Department of Computer Science, University of Utah, Salt Lake City, UT 84112, USA
Al Davis, Mark Swanson & Mike Parker

Editor information

Dhabaleswar K. Panda, Craig B. Stunkel

About this paper

Cite this paper

Davis, A., Swanson, M., Parker, M. (1997). Efficient communication mechanisms for cluster based parallel computing. In: Panda, D.K., Stunkel, C.B. (eds) Communication and Architectural Support for Network-Based Parallel Computing. CANPC 1997. Lecture Notes in Computer Science, vol 1199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62573-9_1

DOI: https://doi.org/10.1007/3-540-62573-9_1
Published: 03 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62573-5
Online ISBN: 978-3-540-68085-7
eBook Packages: Springer Book Archive
