ABSTRACT
As we embrace the era of chip multi-processors (CMPs), we face two major architectural challenges: (i) QoS or performance management of disparate applications running on CPU cores that contend for shared cache/memory resources, and (ii) global/local power management techniques to stay within the overall platform constraints. The problem is exacerbated as the number of cores sharing the resources in a chip increases. In the past, researchers have proposed independent solutions for these two problems. In this paper, we show that rate-based techniques employed for power management can be adapted to address cache/memory QoS issues. The basic approach is to throttle down the processing rate of a core if it is running a low-priority task and its execution is interfering with the performance of a high-priority task through platform resource contention (i.e., cache or memory contention). We evaluate two rate-throttling mechanisms (clock modulation and frequency scaling) for managing the interference between applications running on a CMP platform and for delivering QoS/performance management. We show that clock modulation is much more applicable to cache/memory QoS than frequency scaling, and that resource monitoring combined with rate control provides effective power-performance management in CMP platforms.
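The basic approach described above can be illustrated with a minimal sketch of one control step. It is not the paper's implementation: the `Core` class, the `adjust_throttle` function, the eight evenly spaced duty-cycle levels, and the use of a high-priority miss rate compared against a target as the interference signal are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical duty-cycle levels for clock modulation: 12.5% steps up to 100%.
DUTY_LEVELS = [i / 8 for i in range(1, 9)]


@dataclass
class Core:
    name: str
    high_priority: bool
    duty_index: int = len(DUTY_LEVELS) - 1  # start unthrottled (100%)

    @property
    def duty_cycle(self) -> float:
        return DUTY_LEVELS[self.duty_index]


def adjust_throttle(cores, hp_miss_rate, target_miss_rate):
    """Run one control step over all cores.

    Assumption: interference is detected when the monitored miss rate of
    the high-priority task exceeds a target. Low-priority cores are then
    stepped down one duty-cycle level; otherwise they are stepped back up.
    High-priority cores are never throttled.
    """
    interfering = hp_miss_rate > target_miss_rate
    for core in cores:
        if core.high_priority:
            continue
        if interfering and core.duty_index > 0:
            core.duty_index -= 1  # throttle down one level
        elif not interfering and core.duty_index < len(DUTY_LEVELS) - 1:
            core.duty_index += 1  # relax throttling one level
    return cores


if __name__ == "__main__":
    cores = [Core("core0", high_priority=True), Core("core1", high_priority=False)]
    adjust_throttle(cores, hp_miss_rate=0.30, target_miss_rate=0.10)
    for core in cores:
        print(core.name, core.duty_cycle)
```

Stepping one level at a time keeps the sketch simple; a real controller would need to choose its monitoring interval and step size based on how quickly contention changes.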