Abstract
Cache-only memory architecture (COMA), despite its additional memory overhead, can incur longer inter-node and intra-node communication latency than cache-coherent nonuniform memory access (CC-NUMA). Some studies on COMA suggest that the inclusion property enforced between the processor cache and its local memory is one of the major causes of this less-than-desirable performance, because it creates extra accesses to the slow local memory. We consider the time at which a data address is bound to the local memory to be an important factor in COMA's long latency. This paper examines the inclusion property in COMA and introduces a variant of COMA, dubbed Dynamic Memory Architecture (DYMA), in which the local memory serves as a backing store for blocks discarded from the processor cache. By delaying the binding time in this way, the long latency due to the inclusion property can be avoided. The paper evaluates the potential performance of DYMA against COMA and CC-NUMA.
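The difference in binding time described in the abstract can be illustrated with a toy single-node model. This is only a sketch under stated assumptions, not the protocol evaluated in the paper: the class `Node`, the helper `run`, the LRU replacement policy, the unbounded local memory, and the counters are all hypothetical names and simplifications introduced here. With `inclusive=True`, a block fetched from a remote node is bound to local memory before it reaches the processor cache (COMA-like inclusion); with `inclusive=False`, the block goes straight to the cache and is bound to local memory only when the cache discards it (DYMA-like delayed binding).

```python
from collections import OrderedDict


class Node:
    """Toy model of one node: a small LRU processor cache backed by a local memory.

    Assumptions (not from the paper): LRU replacement, unbounded local memory,
    no coherence traffic, and one counter per kind of local-memory binding.
    """

    def __init__(self, cache_blocks, inclusive):
        self.cache = OrderedDict()    # block address -> True, oldest entry first
        self.local_memory = set()     # block addresses bound to local memory
        self.cache_blocks = cache_blocks
        self.inclusive = inclusive    # True: COMA-like, False: DYMA-like
        self.remote_fetches = 0       # blocks brought in from another node
        self.binds_on_miss = 0        # bindings done while the processor waits
        self.binds_on_evict = 0       # bindings done when the cache discards a block

    def access(self, addr):
        """Reference one block and report where it was found."""
        if addr in self.cache:
            self.cache.move_to_end(addr)   # refresh LRU position
            return "cache hit"
        if addr in self.local_memory:
            result = "local memory hit"
        else:
            result = "remote fetch"
            self.remote_fetches += 1
            if self.inclusive:
                # Inclusion: the block must be in local memory before the cache.
                self.local_memory.add(addr)
                self.binds_on_miss += 1
        self._fill_cache(addr)
        return result

    def _fill_cache(self, addr):
        """Place a block in the cache, discarding the LRU block if the cache is full."""
        if len(self.cache) >= self.cache_blocks:
            victim, _ = self.cache.popitem(last=False)
            if not self.inclusive and victim not in self.local_memory:
                # Delayed binding: local memory acts as a store for discarded blocks.
                self.local_memory.add(victim)
                self.binds_on_evict += 1
        self.cache[addr] = True


def run(trace, inclusive, cache_blocks=2):
    """Replay a list of block addresses on a fresh node and return the node."""
    node = Node(cache_blocks, inclusive)
    for addr in trace:
        node.access(addr)
    return node
```

For the trace `[0, 1, 2, 0]` with a two-block cache, both variants fetch three blocks remotely and find block 0 in local memory on the last access. The inclusive variant binds all three blocks on the miss path, while the delayed variant binds none on the miss path and two at eviction time. The model counts events only; it says nothing about the actual latencies or the coherence actions a real COMA or DYMA system would need.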
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kong, J., Lee, G. (1996). Relaxing the inclusion property in cache only memory architecture. In: Bougé, L., Fraigniaud, P., Mignotte, A., Robert, Y. (eds) Euro-Par'96 Parallel Processing. Euro-Par 1996. Lecture Notes in Computer Science, vol 1124. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0024733
DOI: https://doi.org/10.1007/BFb0024733
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61627-6
Online ISBN: 978-3-540-70636-6
eBook Packages: Springer Book Archive