SoftSig: software-exposed hardware signatures for code analysis and optimization

Published: 01 March 2008

Abstract

Many code analysis techniques for optimization, debugging, or parallelization need to perform runtime disambiguation of sets of addresses. Such operations can be supported efficiently and with low complexity with hardware signatures.
To enable flexible use of signatures, this paper proposes to expose a Signature Register File to the software through a rich ISA. The software has great flexibility to decide, for each signature, which addresses to collect and which addresses to disambiguate against. We call this architecture SoftSig. In addition, as an example of SoftSig use, we show how to detect redundant function calls efficiently and eliminate them dynamically. We call this algorithm MemoiSE. On average for five popular applications, MemoiSE reduces the number of dynamic instructions by 9.3%, thereby reducing the execution time of the applications by 9%.
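
The idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the SoftSig ISA or the MemoiSE algorithm as published: it assumes a signature behaves like a Bloom-filter-style hashed bit vector, and every name in it (`Signature`, `Memo`, `store`, `sum_range`) is hypothetical. A function's loads are collected into a signature, later stores are disambiguated against it, and a repeated call is skipped while no store may have touched its inputs.

```python
import hashlib


class Signature:
    """Software model of an address signature (assumed Bloom-filter style)."""

    def __init__(self, bits=1024, hashes=2):
        self.bits = bits
        self.hashes = hashes
        self.value = 0  # bit vector held in a Python int

    def _positions(self, addr):
        # Hypothetical hashing: derive `hashes` bit positions per address.
        for i in range(self.hashes):
            digest = hashlib.blake2b(f"{i}:{addr}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "little") % self.bits

    def insert(self, addr):
        for pos in self._positions(addr):
            self.value |= 1 << pos

    def may_contain(self, addr):
        # False: definitely never inserted. True: possibly inserted.
        return all((self.value >> pos) & 1 for pos in self._positions(addr))

    def clear(self):
        self.value = 0


class Memo:
    """Reuse the last result of `func` until a store may alias its loads."""

    def __init__(self, func):
        self.func = func
        self.read_set = Signature()  # addresses read by the last execution
        self.entry = None            # (args, result) while reuse is safe
        self.executions = 0

    def call(self, memory, *args):
        if self.entry is not None and self.entry[0] == args:
            return self.entry[1]     # redundant call: skip execution
        self.read_set.clear()

        def load(addr):
            self.read_set.insert(addr)  # collect the address
            return memory[addr]

        result = self.func(load, *args)
        self.executions += 1
        self.entry = (args, result)
        return result

    def observe_store(self, addr):
        # Disambiguate the store against the collected read set.
        if self.read_set.may_contain(addr):
            self.entry = None


def store(memory, memos, addr, value):
    """Write to memory and let every memo check the address."""
    memory[addr] = value
    for memo in memos:
        memo.observe_store(addr)


def sum_range(load, base, n):
    return sum(load(base + i) for i in range(n))


memory = {0x100 + i: i for i in range(4)}
memo = Memo(sum_range)
first = memo.call(memory, 0x100, 4)    # executes: 0 + 1 + 2 + 3 = 6
second = memo.call(memory, 0x100, 4)   # reused, no second execution
store(memory, [memo], 0x101, 10)       # aliases a load, invalidates the entry
third = memo.call(memory, 0x100, 4)    # re-executes: 0 + 10 + 2 + 3 = 15
```

Because such a signature can report false positives but never false negatives, a store to an unrelated address may occasionally force an unnecessary re-execution, but a store that really aliases an input is never missed. That conservative behavior is what makes skipping the call safe in this model.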

    Published In

    ASPLOS XIII: Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems
    March 2008
    352 pages
    ISBN: 9781595939586
    DOI: 10.1145/1346281
    • ACM SIGPLAN Notices, Volume 43, Issue 3
      ASPLOS '08
      March 2008
      339 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/1353536
    • ACM SIGARCH Computer Architecture News, Volume 36, Issue 1
      ASPLOS '08
      March 2008
      339 pages
      ISSN: 0163-5964
      DOI: 10.1145/1353534
    • ACM SIGOPS Operating Systems Review, Volume 42, Issue 2
      ASPLOS '08
      March 2008
      339 pages
      ISSN: 0163-5980
      DOI: 10.1145/1353535
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. memory disambiguation
    2. multi-core architectures
    3. runtime optimization

    Qualifiers

    • Research-article

    Conference

    ASPLOS08

    Acceptance Rates

    ASPLOS XIII paper acceptance rate: 31 of 127 submissions (24%)
    Overall acceptance rate: 535 of 2,713 submissions (20%)

    Cited By

    • (2019) Leveraging Caches to Accelerate Hash Tables and Memoization. Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 440-452. DOI: 10.1145/3352460.3358272. Online publication date: 12-Oct-2019
    • (2019) AxMemo. Proceedings of the 46th International Symposium on Computer Architecture, pp. 685-697. DOI: 10.1145/3307650.3322215. Online publication date: 22-Jun-2019
    • (2018) Leveraging Hardware Caches for Memoization. IEEE Computer Architecture Letters 17:1, pp. 59-63. DOI: 10.1109/LCA.2017.2762308. Online publication date: 1-Jan-2018
    • (2017) Compile-time function memoization. Proceedings of the 26th International Conference on Compiler Construction, pp. 45-54. DOI: 10.1145/3033019.3033024. Online publication date: 5-Feb-2017
    • (2016) A Hardware Approach to Detect, Expose and Tolerate High Level Data Races. 2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), pp. 159-167. DOI: 10.1109/PDP.2016.57. Online publication date: Feb-2016
    • (2015) Intercepting Functions for Memoization. ACM Transactions on Architecture and Code Optimization 12:2, pp. 18:1-18:23. DOI: 10.1145/2751559. Online publication date: 24-Jun-2015
    • (2013) DeAliaser. ACM SIGPLAN Notices 48:4, pp. 167-180. DOI: 10.1145/2499368.2451136. Online publication date: 16-Mar-2013
    • (2013) DeAliaser. ACM SIGARCH Computer Architecture News 41:1, pp. 167-180. DOI: 10.1145/2490301.2451136. Online publication date: 16-Mar-2013
    • (2013) Improving the energy efficiency of hardware-assisted watchpoint systems. Proceedings of the 50th Annual Design Automation Conference, pp. 1-6. DOI: 10.1145/2463209.2488800. Online publication date: 29-May-2013
    • (2013) DeAliaser. Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 167-180. DOI: 10.1145/2451116.2451136. Online publication date: 16-Mar-2013
