Reducing memory buffering overhead in software thread-level speculation

Authors:

Clark Verbrugge

CC '16: Proceedings of the 25th International Conference on Compiler Construction

Pages 12 - 22

https://doi.org/10.1145/2892208.2892227

Published: 17 March 2016

Abstract

Software-based automatic parallelization through Thread-Level Speculation (TLS) has significant practical potential, but also high overhead costs. Traditional "lazy" buffering mechanisms strongly isolate speculative threads but incur large memory overheads, while more recent "eager" mechanisms improve scalability but are more sensitive to data dependencies and have higher rollback costs. Here we describe an integrated system that incorporates the best of both designs, automatically selecting the more suitable buffering mechanism. Our approach builds on well-optimized designs for both techniques, and we also describe specific optimizations that improve both lazy and eager buffer management. We implement our design within MUTLS, a software-TLS system based on the LLVM compiler framework. Results show that, on 9 memory-intensive benchmarks, our system achieves a geometric mean of 75% of the performance of the OpenMP versions. Applying these optimizations is thus a useful part of the optimization stack needed for effective and practical software TLS.
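
The contrast between lazy and eager buffering drawn in the abstract can be illustrated with a small toy model. The sketch below is not MUTLS code and does not reflect its data structures: the class names `LazyBuffer` and `EagerBuffer`, the dictionary standing in for shared memory, and the `demo` helper are all hypothetical, and the model is single-threaded, so it shows only where writes land and what commit and rollback have to do.

```python
"""Toy model of lazy vs. eager speculative buffering (illustrative only)."""


class LazyBuffer:
    """Lazy versioning: speculative writes go to a private buffer."""

    def __init__(self, memory):
        self.memory = memory      # stand-in for shared memory: address -> value
        self.writes = {}          # private write buffer (extra memory per thread)

    def load(self, addr):
        # Reads check the private buffer first, then fall back to memory.
        if addr in self.writes:
            return self.writes[addr]
        return self.memory[addr]

    def store(self, addr, value):
        self.writes[addr] = value  # shared memory is untouched until commit

    def commit(self):
        self.memory.update(self.writes)  # commit copies every buffered write
        self.writes.clear()

    def rollback(self):
        self.writes.clear()        # rollback only discards the buffer


class EagerBuffer:
    """Eager versioning: writes go to memory in place; old values are logged."""

    def __init__(self, memory):
        self.memory = memory
        self.undo_log = []         # (address, old value) pairs in write order

    def load(self, addr):
        return self.memory[addr]   # reads go straight to memory

    def store(self, addr, value):
        self.undo_log.append((addr, self.memory[addr]))
        self.memory[addr] = value

    def commit(self):
        self.undo_log.clear()      # commit only drops the log

    def rollback(self):
        # Rollback restores old values in reverse write order.
        while self.undo_log:
            addr, old = self.undo_log.pop()
            self.memory[addr] = old


def demo():
    """Run the same speculative writes through both schemes."""
    results = {}
    for name, cls in (("lazy", LazyBuffer), ("eager", EagerBuffer)):
        memory = {0: 10, 1: 20}
        spec = cls(memory)
        spec.store(0, 11)
        spec.store(0, 12)
        visible_during = memory[0]   # value in shared memory mid-speculation
        spec.rollback()
        after_rollback = dict(memory)
        spec.store(1, 21)
        spec.commit()
        results[name] = (visible_during, after_rollback, dict(memory))
    return results
```

In this model the lazy scheme never exposes speculative values in shared memory and rolls back by discarding its buffer, at the price of a private copy of every written location and a copy at commit. The eager scheme writes in place, so commit is trivial, but uncommitted values are visible in shared memory and rollback must replay the undo log.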

Cited By

Bu D, Wang Y, Li L, Liu Z, Yu W, Musariri M (2018). Exploring Parallelism in MiBench with Loop and Procedure Level Speculation. 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), pages 141-146. Online publication date: Dec-2018.
https://doi.org/10.1109/BDCloud.2018.00033

Li Y, Zhao Y, Wu Q (2017). GbA: A graph-based thread partition approach in speculative multithreading. Concurrency and Computation: Practice and Experience, 29:21. Online publication date: 4-Oct-2017.
https://doi.org/10.1002/cpe.4294

Index Terms

Reducing memory buffering overhead in software thread-level speculation
1. Computing methodologies
  1. Concurrent computing methodologies
    1. Concurrent programming languages
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
    2. General programming languages
      1. Language types
        Concurrent programming languages

Published In

CC '16: Proceedings of the 25th International Conference on Compiler Construction

March 2016

270 pages

ISBN: 9781450342414

DOI: 10.1145/2892208

General Chair: Ayal Zaks (Intel, Israel / Technion, Israel)

Program Chair: Manuel Hermenegildo (IMDEA SW Institute, Spain / T.U. Madrid-UPM, Spain)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

IEEE-CS: Computer Society

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Natural Sciences and Engineering Research Council of Canada

Conference

CGO '16: 14th Annual IEEE/ACM International Symposium on Code Generation and Optimization

March 17 - 18, 2016

Barcelona, Spain
