DOI: 10.1145/2892208.2892227
research-article

Reducing memory buffering overhead in software thread-level speculation

Published: 17 March 2016 Publication History

Abstract

Software-based, automatic parallelization through Thread-Level Speculation (TLS) has significant practical potential, but also high overhead costs. Traditional "lazy" buffering mechanisms enable strong isolation of speculative threads, but imply large memory overheads, while more recent "eager" mechanisms improve scalability, but are more sensitive to data dependencies and have higher rollback costs. We here describe an integrated system that incorporates the best of both designs, automatically selecting the best buffering mechanism. Our approach builds on well-optimized designs for both techniques, and we describe specific optimizations that improve both lazy and eager buffer management as well. We implement our design within MUTLS, a software-TLS system based on the LLVM compiler framework. Results show that we can get 75% geometric mean performance of OpenMP versions on 9 memory intensive benchmarks. Application of these optimizations is thus a useful part of the optimization stack needed for effective and practical software TLS.
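The trade-off between the two buffering styles named in the abstract can be illustrated with a minimal sketch. This is not the MUTLS implementation: the class names `LazyBuffer` and `EagerBuffer`, the dictionary standing in for main memory, and the absence of any conflict detection are all assumptions made for illustration. It only shows why lazy buffering tends to cost extra memory and read indirection, while eager buffering tends to cost more on rollback.

```python
class LazyBuffer:
    """Lazy (write-buffer) versioning: speculative writes stay private until commit."""

    def __init__(self, memory):
        self.memory = memory      # shared "main memory", modeled as a dict (assumption)
        self.writes = {}          # address -> speculative value

    def load(self, addr):
        # Reads check the private buffer first, then fall back to main memory.
        return self.writes.get(addr, self.memory[addr])

    def store(self, addr, value):
        self.writes[addr] = value  # main memory is left untouched

    def commit(self):
        self.memory.update(self.writes)  # copy buffered values into main memory
        self.writes.clear()

    def rollback(self):
        self.writes.clear()        # cheap: discard the private buffer


class EagerBuffer:
    """Eager (in-place) versioning: writes go to memory, old values go to an undo log."""

    def __init__(self, memory):
        self.memory = memory
        self.undo_log = []         # (address, old value) pairs in write order

    def load(self, addr):
        return self.memory[addr]   # no buffer lookup on reads

    def store(self, addr, value):
        self.undo_log.append((addr, self.memory[addr]))
        self.memory[addr] = value  # new value is in main memory immediately

    def commit(self):
        self.undo_log.clear()      # cheap: memory already holds the new values

    def rollback(self):
        # Undo writes in reverse order so the oldest saved value is restored last.
        for addr, old in reversed(self.undo_log):
            self.memory[addr] = old
        self.undo_log.clear()
```

In this sketch the lazy variant isolates speculative state completely but pays on every load and at commit, whereas the eager variant makes loads and commits cheap but must walk its undo log when speculation fails.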

Cited By

  • (2018) Exploring Parallelism in MiBench with Loop and Procedure Level Speculation. 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), pages 141-146. DOI: 10.1109/BDCloud.2018.00033. Online publication date: Dec-2018
  • (2017) GbA: A graph‐based thread partition approach in speculative multithreading. Concurrency and Computation: Practice and Experience, 29:21. DOI: 10.1002/cpe.4294. Online publication date: 4-Oct-2017

Published In

CC '16: Proceedings of the 25th International Conference on Compiler Construction
March 2016
270 pages
ISBN:9781450342414
DOI:10.1145/2892208
  • General Chair: Ayal Zaks
  • Program Chair: Manuel Hermenegildo

In-Cooperation

  • IEEE-CS: Computer Society

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Software thread-level speculation
  2. automatic parallelization
  3. memory buffering
  4. optimization

Qualifiers

  • Research-article

Conference

CGO '16
