DOI: 10.1145/1995896.1995904
research-article

Transactional conflict decoupling and value prediction

Published: 31 May 2011

Abstract

This paper explores data speculation for improving the performance of Hardware Transactional Memory (HTM). We attempt to reduce transactional conflicts by decoupling them from cache coherence conflicts; many HTMs do not distinguish between transactional conflicts and coherence conflicts, leading to false transactional conflicts. We also attempt to mitigate the effects of coherence conflicts by using value prediction in transactions. We show that coherence decoupling and value prediction in transactions complement each other, because they both speculate on data in ways that are infeasible in the absence of HTM support.
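The distinction between a coherence conflict and a transactional conflict can be illustrated with a small sketch. It is not the paper's mechanism, which the abstract does not detail: the `access_t` record, the `classify` function, and the 64-byte line size are all assumptions made for illustration. Two accesses that share a cache line but touch disjoint bytes are a conflict for the coherence protocol, yet not a real data conflict between the transactions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 64u /* assumed cache line size in bytes */

/* Hypothetical record of one transactional memory access. */
typedef struct {
    uintptr_t addr;  /* first byte touched */
    size_t size;     /* number of bytes touched, must be > 0 */
    bool is_write;
} access_t;

typedef enum { NO_CONFLICT, FALSE_CONFLICT, TRUE_CONFLICT } conflict_t;

/* True if the two accesses touch at least one common cache line. */
static bool same_line(access_t a, access_t b) {
    uintptr_t a_first = a.addr / LINE_SIZE;
    uintptr_t a_last = (a.addr + a.size - 1) / LINE_SIZE;
    uintptr_t b_first = b.addr / LINE_SIZE;
    uintptr_t b_last = (b.addr + b.size - 1) / LINE_SIZE;
    return a_first <= b_last && b_first <= a_last;
}

/* True if the two accesses touch at least one common byte. */
static bool bytes_overlap(access_t a, access_t b) {
    return a.addr < b.addr + b.size && b.addr < a.addr + a.size;
}

/* Classify a pair of accesses made by two different transactions. */
static conflict_t classify(access_t a, access_t b) {
    if (!a.is_write && !b.is_write) return NO_CONFLICT; /* read/read never conflicts */
    if (!same_line(a, b)) return NO_CONFLICT;           /* different lines: no coherence conflict */
    /* Same line: only overlapping bytes make it a true conflict. */
    return bytes_overlap(a, b) ? TRUE_CONFLICT : FALSE_CONFLICT;
}
```

Under these assumptions, a write to bytes `0x1000..0x1007` and a read of bytes `0x1020..0x1027` share a line and are classified as `FALSE_CONFLICT`, the case that an HTM tracking conflicts only at line granularity would treat as a reason to abort.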
As a demonstration of how data speculation can improve performance, we introduce DPTM, a best-effort HTM that mitigates the effects of false sharing at the cache line level. DPTM does not alter the underlying cache coherence protocol, and requires only minor, processor-local modifications.
We evaluate DPTM against a baseline best-effort HTM, and compare it with data restructuring by padding, the most commonly used method to avoid false sharing. Our experiments show that DPTM can dramatically improve performance in the presence of false sharing without degrading performance in its absence, and consistently performs better than restructuring by padding.
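For readers unfamiliar with restructuring by padding, the following is a minimal sketch of the general technique, not code from the paper. The type names, the helper `lines_spanned`, and the 64-byte line size are assumptions for illustration. Padding each per-thread element to a full cache line keeps neighbouring elements from sharing a line, at the cost of a larger memory footprint.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 64 /* assumed cache line size in bytes */

/* Packed layout: eight of these fit in one 64-byte line, so
 * threads updating neighbouring counters falsely share it. */
typedef struct {
    uint64_t count;
} counter_packed_t;

/* Padded layout: each counter occupies a whole line by itself. */
typedef struct {
    uint64_t count;
    char pad[LINE_SIZE - sizeof(uint64_t)];
} counter_padded_t;

_Static_assert(sizeof(counter_padded_t) == LINE_SIZE,
               "padded counter must fill exactly one cache line");

/* Number of cache lines covered by n consecutive elements,
 * assuming the array starts on a line boundary. */
static size_t lines_spanned(size_t elem_size, size_t n) {
    return (elem_size * n + LINE_SIZE - 1) / LINE_SIZE;
}
```

With these definitions, eight packed counters span one line while eight padded counters span eight, which is the trade-off padding makes: false sharing disappears, but the data occupies eight times as much cache.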

    Published In

    ICS '11: Proceedings of the international conference on Supercomputing
    May 2011
    398 pages
    ISBN:9781450301022
    DOI:10.1145/1995896
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. transactional memory
    2. value prediction

    Qualifiers

    • Research-article

    Conference

    ICS '11
    ICS '11: International Conference on Supercomputing
    May 31 - June 4, 2011
    Tucson, Arizona, USA

    Acceptance Rates

    Overall Acceptance Rate 629 of 2,180 submissions, 29%

    Cited By

    • (2016) PleaseTM: Enabling transaction conflict management in requester-wins hardware transactional memory. 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 285-296. DOI: 10.1109/HPCA.2016.7446072. Online publication date: Mar-2016
    • (2015) Hardware Approaches to Transactional Memory in Chip Multiprocessors. Handbook on Data Centers, pp. 805-835. DOI: 10.1007/978-1-4939-2092-1_27. Online publication date: 17-Mar-2015
    • (2013) Reducing False Transactional Conflicts with Speculative Sub-Blocking State -- An Empirical Study for ASF Transactional Memory System. Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, pp. 1879-1888. DOI: 10.1109/IPDPSW.2013.113. Online publication date: 20-May-2013
    • (2013) DDASTM: Ensuring Conflict Serializability Efficiently in Distributed STM. Grid and Pervasive Computing, pp. 326-335. DOI: 10.1007/978-3-642-38027-3_35. Online publication date: 2013
    • (2011) Hardware transactional memory for GPU architectures. Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 296-307. DOI: 10.1145/2155620.2155655. Online publication date: 3-Dec-2011
