DOI: 10.1145/1356058.1356068
research-article

Accurate critical path prediction via random trace construction

Published: 06 April 2008

Abstract

We present a new approach to performing program analysis through profile-guided random generation of instruction traces. Using hardware support available in commercial processors, we profile the behavior of individual instructions. Then, in conjunction with the program binary, we use that information to fabricate short (1,000-instruction) traces by randomly evaluating branches in proportion to their profiled behavior. We demonstrate our technique in the context of critical path analysis, showing it can achieve the same accuracy as a hardware critical path predictor, but with lower hardware requirements. Key to achieving this accuracy is correctly identifying memory dependences in the fabricated trace; to do so, we use a form of abstract interpretation that identifies aliasing store-load pairs without explicitly profiling them. We also demonstrate that our approach is very tolerant of variations in the quality of the available profile information.
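
As an illustration of the trace fabrication step described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical toy control-flow graph in which each basic block carries a made-up `taken_prob` standing in for a profiled branch bias. The block names, instruction strings, and the `fabricate_trace` helper are all assumptions introduced for this example. The sketch walks the graph, resolving each branch at random in proportion to its bias, until the trace reaches 1,000 instructions or control leaves the toy program.

```python
import random

# Hypothetical toy program. Each basic block lists its instructions, its
# taken and fall-through successors, and a made-up taken probability that
# stands in for profiled branch behavior.
BLOCKS = {
    "entry": {"insns": ["ld r1, 0(r2)", "cmp r1, 0", "beq skip"],
              "taken": "skip", "fallthrough": "body", "taken_prob": 0.2},
    "body":  {"insns": ["add r3, r3, r1", "st r3, 8(r2)"],
              "taken": None, "fallthrough": "skip", "taken_prob": 0.0},
    "skip":  {"insns": ["add r2, r2, 16", "cmp r2, r4", "bne entry"],
              "taken": "entry", "fallthrough": None, "taken_prob": 0.95},
}


def fabricate_trace(blocks, start, max_insns=1000, rng=None):
    """Randomly walk the block graph, returning a list of (block, insn) pairs."""
    if rng is None:
        rng = random.Random()
    trace = []
    block = start
    while block is not None and len(trace) < max_insns:
        info = blocks[block]
        for insn in info["insns"]:
            if len(trace) >= max_insns:
                break
            trace.append((block, insn))
        # Resolve the block-ending branch in proportion to its (assumed) bias.
        if info["taken"] is not None and rng.random() < info["taken_prob"]:
            block = info["taken"]
        else:
            block = info["fallthrough"]  # None means control left the toy program
    return trace


if __name__ == "__main__":
    trace = fabricate_trace(BLOCKS, "entry", rng=random.Random(0))
    print(len(trace), "instructions fabricated")
    print(trace[:5])
```

Passing a seeded `random.Random` makes a fabricated trace reproducible, which is convenient when comparing analyses run over the same trace. The sketch covers only control flow; the memory-dependence identification that the abstract calls key to accuracy is not modeled here.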

Published In

CGO '08: Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
April 2008
235 pages
ISBN:9781595939784
DOI:10.1145/1356058

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. instruction criticality
  2. profiling
  3. trace fabrication

Qualifiers

  • Research-article

Conference

CGO '08

Acceptance Rates

CGO '08 Paper Acceptance Rate 21 of 66 submissions, 32%;
Overall Acceptance Rate 312 of 1,061 submissions, 29%

Cited By

  • (2022) Calipers. Proceedings of the 36th ACM International Conference on Supercomputing, pp. 1-14. DOI: 10.1145/3524059.3532390. Online publication date: 28-Jun-2022
  • (2017) Modeling and predicting execution time of scientific workflows in the Grid using radial basis function neural network. Cluster Computing 20:3, pp. 2805-2819. DOI: 10.1007/s10586-017-1018-x. Online publication date: 1-Sep-2017
  • (2015) A Novel Critical Path Based Routing Method Based on for NOC. Proceedings of the 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conf on Embedded Software and Systems, pp. 1546-1551. DOI: 10.1109/HPCC-CSS-ICESS.2015.159. Online publication date: 24-Aug-2015
  • (2014) Exploiting critical data regions to reduce data cache energy consumption. Proceedings of the 17th International Workshop on Software and Compilers for Embedded Systems, pp. 69-78. DOI: 10.1145/2609248.2609253. Online publication date: 10-Jun-2014
  • (2012) Combined profiling. Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, pp. 210-220. DOI: 10.1109/ISPASS.2012.6189227. Online publication date: 1-Apr-2012
  • (2010) Implicit hints: Embedding hint bits in programs without ISA changes. 2010 IEEE International Conference on Computer Design, pp. 364-369. DOI: 10.1109/ICCD.2010.5647699. Online publication date: Oct-2010