Abstract
In recent years, as performance improvements in single processors appear to be reaching their limit, more and more researchers have turned to Chip Multiprocessors (CMPs). The growth of multithreaded programs has brought a dramatic increase in inter-core communication. Unfortunately, traditional architectures fail to meet this challenge, because they handle such communication in the last-level on-chip cache or even in main memory. This paper proposes a novel approach, called Collective Cache, that differentiates accesses to shared and private data and handles data communication in the first-level cache. In the proposed cache architecture, shared data found in the last-level cache are moved into the Collective Cache, an L1 cache structure shared by all cores. We show that the proposed mechanism can greatly improve inter-processor communication, increase the utilization efficiency of the L1 cache, and simplify the data consistency protocol. Extensive analysis of this approach with Simics shows that it can reduce the L1 cache miss rate by 3.36%.
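The idea of separating shared from private data at the L1 level can be illustrated with a toy model. The sketch below is not the paper's implementation: the abstract does not say how shared data are detected or how the caches are organized, so the sharing test (a block counts as shared once a second core misses on it), the fully associative LRU caches, and all names (`LRUCache`, `ToyCMP`, `sharers`) are assumptions made purely for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fully associative LRU cache that tracks block addresses only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def lookup(self, addr):
        # A hit refreshes the block's LRU position.
        if addr in self.blocks:
            self.blocks.move_to_end(addr)
            return True
        return False

    def insert(self, addr):
        self.blocks[addr] = True
        self.blocks.move_to_end(addr)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block

    def evict(self, addr):
        self.blocks.pop(addr, None)


class ToyCMP:
    """Per-core private L1 caches plus one L1-level cache shared by all cores."""

    def __init__(self, num_cores, private_blocks, collective_blocks):
        self.private = [LRUCache(private_blocks) for _ in range(num_cores)]
        self.collective = LRUCache(collective_blocks)
        # Assumed bookkeeping at the last-level cache: which cores missed on a block.
        self.sharers = {}
        self.l1_hits = 0
        self.l1_misses = 0

    def access(self, core, addr):
        # Both L1-level structures are probed; either one can satisfy the access.
        if self.collective.lookup(addr) or self.private[core].lookup(addr):
            self.l1_hits += 1
            return "hit"
        self.l1_misses += 1
        seen = self.sharers.setdefault(addr, set())
        seen.add(core)
        if len(seen) > 1:
            # Assumed policy: a block requested by a second core is treated as
            # shared, removed from private L1s, and placed in the collective cache.
            for c in seen:
                self.private[c].evict(addr)
            self.collective.insert(addr)
            return "miss->collective"
        self.private[core].insert(addr)
        return "miss->private"
```

In this model, once a block has been classified as shared, every core finds it in the single collective structure, so there is only one L1 copy to keep consistent. That is one way to read the abstract's claim of a simpler consistency protocol, under the assumptions stated above.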
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiang, G., Fen, D., Tong, L., Xiang, L., Wang, C., Chen, T. (2009). L1 Collective Cache: Managing Shared Data for Chip Multiprocessors. In: Dou, Y., Gruber, R., Joller, J.M. (eds) Advanced Parallel Processing Technologies. APPT 2009. Lecture Notes in Computer Science, vol 5737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03644-6_10
DOI: https://doi.org/10.1007/978-3-642-03644-6_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03643-9
Online ISBN: 978-3-642-03644-6
eBook Packages: Computer Science (R0)