DOI: 10.1145/1006209.1006238
Article

The energy efficiency of CMP vs. SMT for multimedia workloads

Published: 26 June 2004

Abstract

This paper compares the energy efficiency of chip multiprocessing (CMP) and simultaneous multithreading (SMT) on modern out-of-order processors for the increasingly important multimedia applications. Since performance is an important metric for real-time multimedia applications, we compare configurations at equal performance. We perform this comparison for a large number of performance points derived using different processor architectures and frequencies/voltages.

We find that for the design space explored, for each workload, at each performance point, CMP is more energy efficient than SMT. The difference is small for two-thread systems, but large (18% to 44%) for four-thread systems. We also find that the best SMT and the best CMP configuration for a given performance target have different architecture and frequency/voltage. Therefore, their relative energy efficiency depends on a subtle interplay between various factors such as capacitance, voltage, IPC, frequency, and the level of clock gating, as well as workload features. We perform a detailed analysis considering these factors and develop a mathematical model to explain these results.

Although CMP shows a clear energy advantage for four-thread (and higher) workloads, it comes at the cost of increased silicon area. We therefore investigate a hybrid solution where a CMP is built out of SMT cores, and find it to be an effective compromise. Finally, we find that we can reduce energy further for CMP with a straightforward application of previously proposed techniques of adaptive architectures and dynamic voltage/frequency scaling.
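
The interplay of capacitance, voltage, IPC, frequency, and clock gating named in the abstract can be illustrated with the textbook CMOS dynamic-power relation P = a * C * V^2 * f. The sketch below is not the paper's mathematical model, which this page does not reproduce. It is a minimal illustration, and every name and number in it (`Config`, `smt`, `cmp_`, the capacitances, voltages, and IPC values) is a hypothetical assumption chosen only so that the two design points reach equal throughput.

```python
import math
from dataclasses import dataclass


@dataclass
class Config:
    """One hypothetical design point; all values are illustrative assumptions."""
    cores: int            # number of identical cores on the chip
    ipc_per_core: float   # instructions per cycle sustained by each core
    frequency_hz: float   # clock frequency in Hz
    voltage_v: float      # supply voltage in volts
    capacitance_f: float  # switched capacitance per core in farads
    activity: float       # fraction of capacitance switched per cycle
                          # (more aggressive clock gating lowers this)

    def throughput(self) -> float:
        """Chip-wide instructions per second: cores * IPC * f."""
        return self.cores * self.ipc_per_core * self.frequency_hz

    def dynamic_power(self) -> float:
        """Chip-wide dynamic power in watts: cores * a * C * V^2 * f."""
        per_core = (self.activity * self.capacitance_f
                    * self.voltage_v ** 2 * self.frequency_hz)
        return self.cores * per_core

    def energy_per_instruction(self) -> float:
        """Joules per instruction: power divided by throughput."""
        return self.dynamic_power() / self.throughput()


# Assumed SMT point: one wide core, higher frequency and voltage.
smt = Config(cores=1, ipc_per_core=4.0, frequency_hz=2.0e9,
             voltage_v=1.2, capacitance_f=4.0e-9, activity=0.5)

# Assumed CMP point: four narrower cores, lower frequency and voltage.
cmp_ = Config(cores=4, ipc_per_core=2.0, frequency_hz=1.0e9,
              voltage_v=0.9, capacitance_f=1.5e-9, activity=0.5)

# Compare only at equal performance, as the abstract describes.
assert math.isclose(smt.throughput(), cmp_.throughput())

for name, cfg in (("SMT", smt), ("CMP", cmp_)):
    print(f"{name}: {cfg.dynamic_power():.3f} W, "
          f"{cfg.energy_per_instruction() * 1e9:.3f} nJ/instruction")
```

Because energy per instruction reduces to a * C * V^2 / IPC in this simplified relation, total switched capacitance and the voltage each design needs to reach the target both matter. Which design comes out ahead here is determined entirely by the assumed inputs and says nothing about the paper's measured results.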


    Published In

    ICS '04: Proceedings of the 18th annual international conference on Supercomputing
    June 2004
    360 pages
    ISBN: 1581138393
    DOI: 10.1145/1006209

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. CMP
    2. SMT
    3. energy efficiency
    4. multimedia

    Qualifiers

    • Article

    Conference

    ICS04
    Acceptance Rates

    Overall Acceptance Rate 629 of 2,180 submissions, 29%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)4
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 05 Mar 2025

    Cited By

    • (2015) Soft-error mitigation by means of decoupled transactional memory threads. Distributed Computing 28:2 (75-90). DOI: 10.1007/s00446-014-0215-6. Online publication date: 1-Apr-2015
    • (2014) Optimal reliability-constrained overdrive frequency selection in multicore systems. Fifteenth International Symposium on Quality Electronic Design (300-308). DOI: 10.1109/ISQED.2014.6783340. Online publication date: Mar-2014
    • (2014) Green software development for multi-core architectures. 2014 IEEE Symposium on Computers and Communications (ISCC) (1-6). DOI: 10.1109/ISCC.2014.6912565. Online publication date: Jun-2014
    • (2014) An analytical study of resource division and its impact on power and performance of multi-core processors. The Journal of Supercomputing 68:3 (1265-1279). DOI: 10.1007/s11227-014-1086-0. Online publication date: 1-Jun-2014
    • (2013) Performance/reliability trade-off in superscalar processors for aggressive NBTI restoration of functional units. Proceedings of the 23rd ACM international conference on Great lakes symposium on VLSI (221-226). DOI: 10.1145/2483028.2483097. Online publication date: 2-May-2013
    • (2012) Looking back and looking forward. Communications of the ACM 55:7 (105-114). DOI: 10.1145/2209249.2209272. Online publication date: 1-Jul-2012
    • (2011) Looking back on the language and hardware revolutions. ACM SIGPLAN Notices 46:3 (319-332). DOI: 10.1145/1961296.1950402. Online publication date: 5-Mar-2011
    • (2011) Looking back on the language and hardware revolutions. ACM SIGARCH Computer Architecture News 39:1 (319-332). DOI: 10.1145/1961295.1950402. Online publication date: 5-Mar-2011
    • (2011) Looking back on the language and hardware revolutions. Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems (319-332). DOI: 10.1145/1950365.1950402. Online publication date: 5-Mar-2011
    • (2011) Characterizing Power and Temperature Behavior of POWER6-Based System. IEEE Journal on Emerging and Selected Topics in Circuits and Systems 1:3 (228-241). DOI: 10.1109/JETCAS.2011.2169630. Online publication date: Sep-2011
