Abstract
The Linear-Scan algorithm (1970), applicable to priority replacement policies, computes stack distances and the number of misses incurred on a given address trace, for all cache sizes, in time O(V) per access. Here, V is the number of distinct (virtual) items referenced within the trace. While the time bound was subsequently lowered to O(log V) for the Least Recently Used (LRU) policy, no improvements have been reported for general priority policies. This work introduces the class of policies with nearly static priorities (NSP), which encompasses several known policies. The Min-Tree algorithm is proposed for NSP policies, whose performance is quite sensitive to the policy as well as to the address trace. Under suitable probabilistic assumptions, the expected time per access is \(O(\log^{2} V)\). Experimental evidence collected on a mix of 30 benchmarks shows that the Min-Tree algorithm can be significantly faster than Linear-Scan, for interesting policies such as OPT (or Belady), Least Frequently Used (LFU), and Most Recently Used (MRU). Min-Tree can be parallelized to run in time O(log V) using O(V/log V) processors, in the worst case. A more sophisticated Lazy Min-Tree algorithm is also developed with \(O(\sqrt{V}\log V)\) worst-case time per access. This bound applies, in particular, to the policies OPT, LFU, and Least Recently/Frequently Used (LRFU), for which the best previously known bound was O(V). Although random replacement is not an NSP policy, the framework developed in this work leads to a stack-distance algorithm with O(log V) expected time per access.
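To make the notion of stack distance concrete, the following is a minimal sketch of a linear-scan computation for the LRU policy only. It is not the Min-Tree or Lazy Min-Tree algorithm of this work, and it does not handle general priority policies. The function names `lru_stack_distances` and `misses_for_all_sizes` are hypothetical and chosen for illustration. Each access scans a stack of at most V items, which matches the O(V) per-access cost quoted above for Linear-Scan.

```python
from collections import Counter


def lru_stack_distances(trace):
    """Return the LRU stack distance of each access.

    A first reference to an item has no finite distance and is reported
    as None. Illustrative sketch for LRU only (assumption).
    """
    stack = []       # stack[0] is the most recently used item
    distances = []
    for item in trace:
        if item in stack:
            depth = stack.index(item) + 1   # 1-based position, found by a linear scan
            stack.pop(depth - 1)
        else:
            depth = None                    # cold miss
        stack.insert(0, item)               # the accessed item moves to the top
        distances.append(depth)
    return distances


def misses_for_all_sizes(distances, max_size):
    """Map each capacity c in 1..max_size to the LRU miss count on the trace.

    An access hits in a cache of capacity c exactly when its stack
    distance is at most c.
    """
    histogram = Counter(d for d in distances if d is not None)
    misses = {}
    hits = 0
    for c in range(1, max_size + 1):
        hits += histogram.get(c, 0)
        misses[c] = len(distances) - hits
    return misses


if __name__ == "__main__":
    trace = ["a", "b", "c", "a", "b", "b"]
    d = lru_stack_distances(trace)
    print(d)
    print(misses_for_all_sizes(d, 3))
```

A single pass over the trace yields the miss counts for every cache size at once, which is the property that makes stack-distance algorithms attractive compared with simulating each cache size separately.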
Additional information
The work of G. Bilardi was supported by the IBM Visiting Scientist Program, by MIUR-PRIN Project AlgoDEEP, by PAT-INFN Project AuroraScience, and by the University of Padova Projects STPD08JA32 and CPDA099949.
About this article
Cite this article
Bilardi, G., Ekanadham, K. & Pattnaik, P. Efficient Stack Distance Computation for a Class of Priority Replacement Policies. Int J Parallel Prog 41, 430–468 (2013). https://doi.org/10.1007/s10766-012-0200-2
DOI: https://doi.org/10.1007/s10766-012-0200-2