Abstract
As the amount of on-chip cache grows with Moore’s law, cache utilization becomes increasingly important: the number of processor cores multiplies and contention for memory bandwidth becomes more severe. Optimal cache management requires knowing the future access sequence and being able to communicate this information to hardware. The paper addresses the communication problem with two new optimal algorithms for Program-directed OPTimal cache management (P-OPT), in which a program designates certain accesses as bypasses and trespasses through an extended hardware interface to effect optimal cache utilization. The paper proves the optimality of the new methods, examines their theoretical properties, and shows the potential benefit using a simulation study and a simple test on a multi-core, multi-processor PC.
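To illustrate why optimal management depends on the future access sequence, the following minimal sketch compares LRU with Belady's classic OPT (MIN) replacement on a fully associative cache. It is not the paper's P-OPT algorithms, whose details are not given in this abstract; the function names `opt_misses` and `lru_misses` and the example trace are assumptions made only for illustration.

```python
from collections import OrderedDict

INF = float("inf")


def opt_misses(trace, capacity):
    """Count misses under Belady's OPT (MIN); assumes capacity >= 1."""
    # Backward pass: for each position, find the next position
    # at which the same block is accessed (INF if never again).
    next_use = [INF] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], INF)
        last_seen[trace[i]] = i

    cache = {}  # block -> position of its next use
    misses = 0
    for i, block in enumerate(trace):
        if block not in cache:
            misses += 1
            if len(cache) >= capacity:
                # Evict the block whose next use lies furthest in the future.
                victim = max(cache, key=cache.get)
                del cache[victim]
        cache[block] = next_use[i]
    return misses


def lru_misses(trace, capacity):
    """Count misses under LRU, which uses only past accesses."""
    cache = OrderedDict()  # ordered from least to most recently used
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[block] = True
    return misses


if __name__ == "__main__":
    trace = list("abcabc")  # hypothetical cyclic access pattern
    print("LRU misses:", lru_misses(trace, 2))
    print("OPT misses:", opt_misses(trace, 2))
```

On a cyclic trace larger than the cache, LRU evicts each block just before it is reused, while OPT keeps part of the working set resident. The backward pass that computes `next_use` is exactly the future knowledge that hardware lacks at run time, which is the communication problem the abstract refers to.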
The research was conducted while Xiaoming Gu and Chengliang Zhang were graduate students at the University of Rochester. It was supported by two IBM CAS fellowships and NSF grants CNS-0720796, CNS-0509270, and CCR-0238176.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gu, X., Bai, T., Gao, Y., Zhang, C., Archambault, R., Ding, C. (2008). P-OPT: Program-Directed Optimal Cache Management. In: Amaral, J.N. (eds) Languages and Compilers for Parallel Computing. LCPC 2008. Lecture Notes in Computer Science, vol 5335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89740-8_15
DOI: https://doi.org/10.1007/978-3-540-89740-8_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89739-2
Online ISBN: 978-3-540-89740-8
eBook Packages: Computer Science (R0)