ABSTRACT
Transactional memory (TM) is a promising synchronization mechanism for the next generation of multicore processors. Best-effort Hardware Transactional Memory (HTM) designs, such as Sun's prototype Rock processor and AMD's proposed Advanced Synchronization Facility (ASF), can efficiently execute many transactions, but abort in some cases due to various limitations. Hybrid TM systems can use a compatible software TM (STM) in such cases.
We introduce a family of hybrid TMs built using the recent NOrec STM algorithm that, unlike existing hybrid approaches, provide both low overhead on hardware transactions and concurrent execution of hardware and software transactions. We evaluate implementations for Rock and ASF, exploring how the differing HTM designs affect optimization choices. Our investigation yields valuable input for designers of future best-effort HTMs.
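The fallback pattern the abstract describes, a best-effort hardware attempt that may abort followed by a software path, can be illustrated with a minimal sketch. It is not the paper's algorithm and not NOrec: `HardwareAbort`, `run_transaction`, and `max_hw_retries` are hypothetical names, the hardware attempt is simulated by an ordinary Python callable, and a single lock stands in for the software TM. A real hybrid TM must also coordinate concurrently running hardware and software transactions, which this sketch omits.

```python
import threading


class HardwareAbort(Exception):
    """Hypothetical stand-in for a best-effort HTM abort.

    Real HTMs report aborts through hardware status codes, not exceptions.
    """


# Assumption: one global lock plays the role of the software fallback path.
_fallback_lock = threading.Lock()


def run_transaction(hw_attempt, sw_fallback, max_hw_retries=3):
    """Try the (simulated) hardware path a few times, then fall back.

    Returns a (path, result) pair so the caller can see which path committed.
    """
    for _ in range(max_hw_retries):
        try:
            return ("hardware", hw_attempt())
        except HardwareAbort:
            # Best-effort hardware gives no progress guarantee; retry.
            continue
    # Hardware attempts exhausted: run the body on the software path.
    with _fallback_lock:
        return ("software", sw_fallback())
```

A caller passes the same transaction body twice, once written for the hardware path and once for the software path; the retry budget of three is arbitrary and would be tuned per platform.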
Hybrid NOrec: a case study in the effectiveness of best effort hardware transactional memory
ASPLOS '11