Abstract
This paper describes a methodology for efficiently characterizing a workload’s memory access properties under the least recently used (LRU) replacement policy. The resulting access reuse profile captures the workload’s working set sizes and can be used to characterize the locality of its data references and to predict its general caching behavior.
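As an illustration of what an LRU reuse profile contains, the following is a minimal sketch, not the efficient algorithm proposed in the paper. It assumes a trace given as a sequence of hashable addresses (or cache-line identifiers) and uses a naive list-based LRU stack whose cost grows with the number of distinct addresses. The function names `lru_reuse_profile` and `miss_rate` are hypothetical.

```python
from collections import Counter


def lru_reuse_profile(trace):
    """Histogram of LRU stack distances for a sequence of addresses.

    Distance d (1-based) means the address was the d-th most recently
    used entry when it was accessed again. First-time accesses (cold
    misses) are counted under the key None.
    Naive illustration only: each access scans the whole stack.
    """
    stack = []        # least recently used first, most recent last
    hist = Counter()
    for addr in trace:
        if addr in stack:
            idx = stack.index(addr)
            hist[len(stack) - idx] += 1   # depth from the top of the stack
            stack.pop(idx)
        else:
            hist[None] += 1               # never seen before
        stack.append(addr)                # addr becomes most recently used
    return hist


def miss_rate(hist, capacity):
    """Miss rate of a fully associative LRU cache holding `capacity` entries.

    An access hits exactly when its stack distance is <= capacity.
    """
    total = sum(hist.values())
    if total == 0:
        return 0.0
    hits = sum(n for d, n in hist.items() if d is not None and d <= capacity)
    return (total - hits) / total


if __name__ == "__main__":
    profile = lru_reuse_profile(["a", "b", "c", "a", "b", "c", "a"])
    for size in (1, 2, 3, 4):
        print(size, miss_rate(profile, size))
```

A single pass over the trace yields the miss rate for every cache size at once, which is why such a profile exposes working set sizes: the miss rate drops sharply once the capacity reaches the size of a working set.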
The approach discussed in this paper is flexible and can be used in conjunction with tracing or execution-driven techniques. Because of the efficiency of the proposed algorithm – processing over one million memory accesses per second – the LRU profiles can be collected for a large number of workloads and the resulting data can be used in early stages of computer system design.
We illustrate the method with data collected for the NAS Parallel Benchmarks. For selected benchmarks we compare the miss rate profiles for various workload sizes. We also compare the resulting LRU profiles with point predictions of miss rates generated with conventional cache simulations and observe a good match. In the concluding part of the paper we report the performance results for the proposed method.
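To make the notion of a point prediction concrete, here is a minimal sketch of a conventional simulation of one cache configuration. It assumes a fully associative LRU cache whose capacity is counted in entries, and it ignores line size and set associativity, which the simulations used in the paper may well model. The function name `simulate_lru_cache` is hypothetical.

```python
from collections import OrderedDict


def simulate_lru_cache(trace, capacity):
    """Miss rate of one fully associative LRU cache of `capacity` entries.

    Unlike a reuse profile, this gives a single point: one run of the
    trace per cache size of interest.
    """
    cache = OrderedDict()   # keys ordered from least to most recently used
    accesses = 0
    misses = 0
    for addr in trace:
        accesses += 1
        if addr in cache:
            cache.move_to_end(addr)          # refresh recency on a hit
        else:
            misses += 1
            cache[addr] = True
            if len(cache) > capacity:
                cache.popitem(last=False)    # evict least recently used
    return misses / accesses if accesses else 0.0


if __name__ == "__main__":
    trace = [1, 2, 3, 1, 2, 3, 1]
    for size in (2, 3):
        print(size, simulate_lru_cache(trace, size))
```

Each call answers the question for one capacity only, so covering a range of cache sizes requires replaying the trace once per size.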
This material is based upon work supported by DARPA under Contract No. NBCH3039002 and by the Australian Research Council Linkage Grant LP0347178.
© 2006 Springer-Verlag Berlin Heidelberg
Bonebakker, L., Over, A., Sharapov, I. (2006). Working Set Characterization of Applications with an Efficient LRU Algorithm. In: Horváth, A., Telek, M. (eds) Formal Methods and Stochastic Models for Performance Evaluation. EPEW 2006. Lecture Notes in Computer Science, vol 4054. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11777830_6
DOI: https://doi.org/10.1007/11777830_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35362-1
Online ISBN: 978-3-540-35365-2
eBook Packages: Computer Science, Computer Science (R0)