DOI: 10.1145/1815961.1816011
research-article

Modeling critical sections in Amdahl's law and its implications for multicore design

Published: 19 June 2010

ABSTRACT

This paper presents a fundamental law for parallel performance: it shows that parallel performance is limited not only by sequential code (as suggested by Amdahl's law) but also, fundamentally, by synchronization through critical sections. Extending Amdahl's software model to include critical sections, we derive the surprising result that the impact of critical sections on parallel performance can be modeled as a completely sequential part and a completely parallel part. The sequential part is determined by the probability of entering a critical section and the contention probability (i.e., the probability that multiple threads want to enter the same critical section). This fundamental result reveals at least three important insights for multicore design. (i) Asymmetric multicore processors deliver smaller performance benefits relative to symmetric processors than suggested by Amdahl's law, and in some cases even worse performance. (ii) Amdahl's law suggests many tiny cores for optimum performance in asymmetric processors; however, we find that fewer but larger small cores can yield substantially better performance. (iii) Executing critical sections on the big core can yield substantial speedups; however, performance is sensitive to the accuracy of the critical section contention predictor.
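
To make the abstract's central idea concrete, the following is a minimal sketch rather than the paper's actual model. The function `amdahl_speedup` is the classic Amdahl's law. The function `speedup_with_critical_sections` is a hypothetical simplification that treats the contended share of critical-section time as extra sequential work; the parameter names `f_cs` and `p_ctn` and the product `f_cs * p_ctn` are assumptions made for illustration, not formulas taken from the paper.

```python
def amdahl_speedup(f_seq: float, n: int) -> float:
    """Classic Amdahl's law: speedup on n cores for sequential fraction f_seq."""
    return 1.0 / (f_seq + (1.0 - f_seq) / n)


def speedup_with_critical_sections(f_seq: float, f_cs: float,
                                   p_ctn: float, n: int) -> float:
    """Illustrative only (assumed model, not the paper's derivation).

    f_seq: fraction of execution time in sequential code
    f_cs:  fraction of execution time in critical sections (hypothetical)
    p_ctn: probability that a critical section is contended (hypothetical)
    """
    # Assumption: contended critical-section time behaves like sequential code,
    # while everything else scales perfectly with the core count.
    serialized = f_seq + f_cs * p_ctn
    return 1.0 / (serialized + (1.0 - serialized) / n)


if __name__ == "__main__":
    cores = 64
    print("Amdahl only:           ", amdahl_speedup(0.01, cores))
    print("With critical sections:",
          speedup_with_critical_sections(0.01, 0.2, 0.1, cores))
```

Under these assumptions, any nonzero contention lowers the achievable speedup below the Amdahl bound for the same sequential fraction, which is the qualitative point the abstract makes.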


          Published in

          ISCA '10: Proceedings of the 37th annual international symposium on Computer architecture
          June 2010
          520 pages
          ISBN: 9781450300537
          DOI: 10.1145/1815961

          ACM SIGARCH Computer Architecture News, Volume 38, Issue 3
          ISCA '10
          June 2010
          508 pages
          ISSN: 0163-5964
          DOI: 10.1145/1816038

          Copyright © 2010 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Acceptance Rates

          Overall Acceptance Rate: 543 of 3,203 submissions, 17%
