An Empirical Evaluation of the Convex SPP-1000 Hierarchical Shared Memory System

International Journal of Parallel Programming

Abstract

Cache coherency in a scalable parallel computer architecture requires mechanisms beyond the conventional common-bus-based snooping approaches, which are limited to about 16 processors. The new Convex SPP-1000 achieves cache coherency across 128 processors through a two-level shared memory NUMA structure employing directory-based and SCI protocol mechanisms. While hardware support for managing a common global name space minimizes overhead costs and simplifies programming, latency considerations for remote accesses may still dominate and can, under unfavorable conditions, constrain scalability. This paper provides the first published evaluation of the SPP-1000 hierarchical cache coherency mechanisms from the perspective of measured latency and its impact on basic global flow control mechanisms, scaling of a parallel science code, and sensitivity of cache miss rates to system scale. It is shown that global remote access latency is only a factor of seven greater than the local cache miss penalty, and that scaling of a challenging scientific application is not severely degraded by the hierarchical structure for achieving consistency across the system processor caches.
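To give a feel for what a factor-of-seven latency ratio could mean in practice, the following is a minimal sketch of a weighted-average miss penalty. It is not the paper's measurement method. The only figure taken from the abstract is the ratio of seven between global remote access latency and the local cache miss penalty; the function name `mean_miss_penalty`, the `remote_fraction` parameter, and the sample fractions are hypothetical, and the model deliberately ignores contention and any intermediate level of the hierarchy.

```python
def mean_miss_penalty(remote_fraction, local_penalty=1.0, remote_ratio=7.0):
    """Average miss penalty, in the same units as local_penalty.

    remote_fraction: assumed share of cache misses served by global
                     remote memory (hypothetical input, 0.0 to 1.0).
    local_penalty:   cost of a local cache miss (normalized to 1.0 here).
    remote_ratio:    remote-to-local latency ratio; 7.0 is the figure
                     quoted in the abstract.
    """
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    remote_penalty = remote_ratio * local_penalty
    # Weighted average of the two miss costs.
    return (1.0 - remote_fraction) * local_penalty + remote_fraction * remote_penalty


if __name__ == "__main__":
    # Illustrative remote-miss fractions, not measured values.
    for fraction in (0.0, 0.1, 0.5, 1.0):
        penalty = mean_miss_penalty(fraction)
        print(f"remote fraction {fraction:.1f}: mean penalty {penalty:.2f}x local")
```

Under these assumptions, a workload that sends one miss in ten to global remote memory would see an average miss penalty of about 1.6 times the local one, which is consistent with the abstract's claim that scaling need not be severely degraded when most misses stay local.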

Author information

Corresponding author

Correspondence to Thomas Sterling.

Cite this article

Sterling, T., Savarese, D., Merkey, P. et al. An Empirical Evaluation of the Convex SPP-1000 Hierarchical Shared Memory System. Int J Parallel Prog 24, 377–396 (1996). https://doi.org/10.1007/BF03356755

  • DOI: https://doi.org/10.1007/BF03356755
