Abstract
Single-chip multiprocessors (CMPs) are becoming an attractive architecture for improving the throughput of program execution. In a CMP, multiple processor cores share hardware resources such as cache memory and the memory bus. Contention for these shared resources therefore significantly degrades the performance of each thread and also harms fairness between threads.
In this paper, we propose a dynamic voltage and frequency scaling (DVFS) algorithm for improving the total instruction throughput, fairness, and energy efficiency of CMPs. The proposed technique periodically observes the utilization ratio of shared resources and controls the frequency and voltage of each processor core individually to balance that ratio between threads. Our evaluation shows that the technique greatly improves fairness between threads. Moreover, total instruction throughput increases in many cases while energy consumption is reduced.
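The control loop described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the operating points, the `tolerance` threshold, the one-step-per-interval policy, and the names `Core` and `rebalance` are all assumptions made for illustration. The sketch assumes that a core using more than an equal share of the shared resource is slowed down and a core using less is sped up.

```python
from dataclasses import dataclass

# Hypothetical operating points as (frequency in MHz, voltage in V).
# Illustrative values only; they are not taken from the paper.
OPERATING_POINTS = [(600, 0.90), (800, 1.00), (1000, 1.10), (1200, 1.20)]


@dataclass
class Core:
    level: int  # index into OPERATING_POINTS


def rebalance(cores, utilization, tolerance=0.05):
    """Run one control interval and return each core's new (MHz, V) pair.

    utilization[i] is the fraction of the shared resource that core i
    used during the last interval. The equal-share target and the
    tolerance band are assumptions of this sketch.
    """
    fair_share = sum(utilization) / len(utilization)
    top = len(OPERATING_POINTS) - 1
    for core, used in zip(cores, utilization):
        if used > fair_share + tolerance and core.level > 0:
            core.level -= 1  # over-using the shared resource: slow down
        elif used < fair_share - tolerance and core.level < top:
            core.level += 1  # under-using the shared resource: speed up
    return [OPERATING_POINTS[core.level] for core in cores]


cores = [Core(level=2), Core(level=2)]
settings = rebalance(cores, [0.7, 0.3])
```

Frequency and voltage move together in this sketch because each operating point pairs them, so lowering a core's level reduces both its pressure on the shared resource and its power draw.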
Index Terms
- Improving fairness, throughput and energy-efficiency on a chip multiprocessor through DVFS