Register file prefetching

Authors:

Sudhanshu Shukla,

Sumeet Bandishte,

Sreenivas Subramoney

ISCA '22: Proceedings of the 49th Annual International Symposium on Computer Architecture

Pages 410 - 423

https://doi.org/10.1145/3470496.3527398

Published: 11 June 2022

Abstract

The memory wall continues to limit the performance of modern out-of-order (OOO) processors, despite the expensive provisioning of large multi-level caches and advancements in memory prefetching. In this paper, we put forth an important observation: the memory wall is not monolithic, but consists of many latency walls, one arising from the latency of each tier of the cache/memory hierarchy. Our results show that even though level-1 (L1) data cache latency is nearly 40X lower than main memory latency, mitigating it offers a performance opportunity very similar to that of mitigating the more widely studied main memory latency.

This motivates our proposal, Register File Prefetch (RFP), which intelligently utilizes the existing OOO scheduling pipeline and the available L1 data cache/Register File bandwidth to successfully prefetch 43.4% of load requests from the L1 cache to the Register File. Simulation results on 65 diverse workloads show that this translates to a 3.1% performance gain over a baseline with parameters similar to the Intel Tiger Lake processor, which further increases to 5.7% for a futuristic up-scaled core. We also contrast and differentiate register file prefetching from techniques like load value prediction and address prediction, which enhance performance by speculatively breaking data dependencies. Our analysis shows that RFP is synergistic with value prediction: the two features together deliver a 4.1% average performance improvement, significantly higher than the 2.2% performance gain obtained from value prediction alone.

Cited By

Saglam B, Ho N, Falquez C, Portero A, Schätzle F, Suarez E, Pleiter D. (2024). Data Prefetching on Processors with Heterogeneous Memory. Proceedings of the International Symposium on Memory Systems, pages 45-60. Online publication date: 30-Sep-2024.
https://dl.acm.org/doi/10.1145/3695794.3695800
Bera R, Ranganathan A, Rakshit J, Mahto S, Nori A, Gaur J, Olgun A, Kanellopoulos K, Sadrosadati M, Subramoney S, Mutlu O. (2024). Constable: Improving Performance and Power Efficiency by Safely Eliminating Load Instruction Execution. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pages 88-102. Online publication date: 29-Jun-2024.
https://doi.org/10.1109/ISCA59077.2024.00017
Lee Y, Lee J, Ro W. (2023). Performance Analysis of Criticality-Aware Out-of-Order Cores for Exploiting MLP. 2023 International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC), pages 1-4. Online publication date: 25-Jun-2023.
https://doi.org/10.1109/ITC-CSCC58803.2023.10212794

Index Terms

Register file prefetching
1. Computer systems organization
  1. Architectures
    1. Serial architectures

Published In

ISCA '22: Proceedings of the 49th Annual International Symposium on Computer Architecture

June 2022

1097 pages

ISBN:9781450386104

DOI:10.1145/3470496

General Chairs: Valentina Salapura (Google), Mohamed Zahran (New York University)

Program Chairs: Fred Chong (The University of Chicago), Lingjia Tang (The University of Michigan)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture

In-Cooperation

IEEE CS TCAA: IEEE CS technical committee on architectural acoustics

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ISCA '22

Sponsor:

SIGARCH

ISCA '22: The 49th Annual International Symposium on Computer Architecture

June 18 - 22, 2022

New York, New York

Acceptance Rates

ISCA '22 Paper Acceptance Rate 67 of 400 submissions, 17%;

Overall Acceptance Rate 543 of 3,203 submissions, 17%
