ABSTRACT
Hardware Transactional Memory (HTM) systems reflect choices from three key design dimensions: conflict detection, version management, and conflict resolution. Previously proposed HTMs represent three points in this design space: lazy conflict detection, lazy version management, committer wins (LL); eager conflict detection, lazy version management, requester wins (EL); and eager conflict detection, eager version management, requester stalls with conservative deadlock avoidance (EE). To isolate the effects of these high-level design decisions, we develop a common framework that abstracts away differences in cache write policies, interconnects, and ISA to compare these three design points. Not surprisingly, the relative performance of these systems depends on the workload. Under light transactional loads they perform similarly, but under heavy loads they differ by up to 80%. None of the systems performs best on all of our benchmarks. We identify seven performance pathologies (interactions between workload and system that degrade performance) as the root cause of many performance differences: FriendlyFire, StarvingWriter, SerializedCommit, FutileStall, StarvingElder, RestartConvoy, and DuelingUpgrades. We discuss when and on which systems these pathologies can occur and show that they actually manifest within TM workloads. The insight provided by these pathologies motivated four enhanced systems that often significantly reduce transactional memory overhead. Importantly, by avoiding transaction pathologies, each enhanced system performs well across our suite of benchmarks.
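The three design points named in the abstract can be written down as a small data structure. The following is a minimal illustrative sketch in Python, not code from the paper: the names `Timing`, `Resolution`, `HTMDesign`, `DESIGNS`, and `describe` are hypothetical and chosen only to mirror the three dimensions (conflict detection, version management, conflict resolution) and the LL, EL, and EE labels.

```python
from dataclasses import dataclass
from enum import Enum


class Timing(Enum):
    """When a mechanism acts: lazily or eagerly (hypothetical name)."""
    LAZY = "lazy"
    EAGER = "eager"


class Resolution(Enum):
    """Conflict resolution policies listed in the abstract (hypothetical name)."""
    COMMITTER_WINS = "committer wins"
    REQUESTER_WINS = "requester wins"
    REQUESTER_STALLS = "requester stalls"


@dataclass(frozen=True)
class HTMDesign:
    """One point in the three-dimensional HTM design space."""
    name: str
    conflict_detection: Timing
    version_management: Timing
    conflict_resolution: Resolution


# The three previously proposed design points, as described in the abstract.
LL = HTMDesign("LL", Timing.LAZY, Timing.LAZY, Resolution.COMMITTER_WINS)
EL = HTMDesign("EL", Timing.EAGER, Timing.LAZY, Resolution.REQUESTER_WINS)
EE = HTMDesign("EE", Timing.EAGER, Timing.EAGER, Resolution.REQUESTER_STALLS)

DESIGNS = {d.name: d for d in (LL, EL, EE)}


def describe(design: HTMDesign) -> str:
    """Render a design point in the wording used by the abstract."""
    return (
        f"{design.name}: {design.conflict_detection.value} conflict detection, "
        f"{design.version_management.value} version management, "
        f"{design.conflict_resolution.value}"
    )


if __name__ == "__main__":
    for design in DESIGNS.values():
        print(describe(design))
```

In this sketch the two-letter label is just the first letters of the conflict detection and version management choices; the conservative deadlock avoidance that accompanies EE is not modeled.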
Index Terms
- Performance pathologies in hardware transactional memory