Abstract
This paper presents an adaptive technique for warming up caches in sampled microprocessor simulation. The simulator monitors the warm-up process of the caches and uses simple heuristics to decide when they are warmed up. This mechanism lets the warm-up length adapt to cache size and to the variability characteristics of the benchmark. Using only one-half or one-third of the average warm-up length of previous methods, the proposed Self-Monitored Adaptive (SMA) warm-up technique achieves CPI results very similar to theirs. On average, SMA exhibits only 0.2% warm-up error in CPI. When simulating small caches, the SMA technique can reduce the warm-up overhead by an order of magnitude compared to previous techniques. Finally, at the end of the cycle-accurate simulation SMA gives the user an indicator of warm-up error, which helps the user gauge the accuracy of the warm-up.
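The idea of a self-monitored warm-up can be illustrated with a minimal sketch. The abstract does not state the heuristics SMA actually uses, so everything below is an assumption made for illustration only: the toy `DirectMappedCache`, the `adaptive_warmup` function, the `window` and `threshold` parameters, and the stopping rule (stop once a full window of references fills almost no empty cache lines) are all hypothetical and are not the authors' method.

```python
import random


class DirectMappedCache:
    """Toy direct-mapped cache that reports fills into never-used lines."""

    def __init__(self, num_lines, line_bytes=64):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines  # None marks a line never filled

    def access(self, addr):
        """Update the cache; return True if the line was still empty."""
        block = addr // self.line_bytes
        index = block % self.num_lines
        tag = block // self.num_lines
        cold_fill = self.tags[index] is None
        self.tags[index] = tag
        return cold_fill


def adaptive_warmup(cache, addresses, window=1000, threshold=0.01):
    """Warm `cache` until one full window has a cold-fill rate <= threshold.

    Hypothetical heuristic, not taken from the paper.
    Returns the number of references consumed.
    """
    used = 0
    cold_in_window = 0
    for addr in addresses:
        cold_in_window += int(cache.access(addr))
        used += 1
        if used % window == 0:
            if cold_in_window / window <= threshold:
                return used  # cache judged warm; stop early
            cold_in_window = 0
    return used  # trace exhausted before the heuristic fired


def synthetic_trace(n, footprint_bytes, seed=0):
    """Uniformly random byte addresses over a fixed footprint (illustrative)."""
    rng = random.Random(seed)
    return [rng.randrange(footprint_bytes) for _ in range(n)]


if __name__ == "__main__":
    trace = synthetic_trace(100_000, 1 << 20)
    for lines in (64, 4096):
        n = adaptive_warmup(DirectMappedCache(lines), trace)
        print(f"{lines:5d} lines: warmed after {n} references")
```

Under these assumptions the 64-line cache is declared warm after far fewer references than the 4096-line cache on the same trace, which is the kind of cache-size adaptivity the abstract describes. A fixed-length warm-up would have to be sized for the larger cache and would waste most of its work on the smaller one.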
Luo, Y., John, L.K. & Eeckhout, L. SMA: A Self-Monitored Adaptive Cache Warm-Up Scheme for Microprocessor Simulation. Int J Parallel Prog 33, 561–581 (2005). https://doi.org/10.1007/s10766-005-7305-9