Abstract
In this paper, we propose compiler-controlled customized placement policies for embedded processor data caches. Profile-driven customized placement improves the sharing of cache resources across memory lines, thereby reducing conflict misses, lowering the average memory access time (AMAT), and consequently reducing execution time. Alternatively, customized placement policies can be used to reduce the cache size and associativity for a fixed AMAT, with an attendant reduction in power and area. These advantages come at the cost of a small increase in the complexity of the address translation used to index the cache. The consequent increase in critical path length is offset by lowered miss rates. Simulation experiments with embedded benchmark kernels show that caches with customized placement provide miss rates comparable to those of traditional caches with larger sizes and higher associativities.
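To make the idea of customized placement concrete, the following is a minimal sketch and not the authors' mechanism. It models a direct-mapped cache whose set index is normally the memory-line number modulo the number of sets, and lets a hypothetical, profile-derived `placement` table override that index for selected lines. The cache geometry, the address trace, and the cycle counts passed to `amat` are all illustrative assumptions; the AMAT expression is the standard hit time plus miss rate times miss penalty.

```python
def simulate(trace, num_sets, line_size, placement=None):
    """Return the miss count of a direct-mapped cache over an address trace.

    `placement` is a hypothetical table mapping a memory-line number to a
    set index; lines not listed fall back to conventional modulo indexing.
    """
    placement = placement or {}
    # Each set stores the full line number as its tag, because a custom
    # index no longer implies the low-order bits of the line number.
    sets = [None] * num_sets
    misses = 0
    for addr in trace:
        line = addr // line_size
        index = placement.get(line, line % num_sets)
        if sets[index] != line:
            misses += 1
            sets[index] = line
    return misses


def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in the same unit as its arguments."""
    return hit_time + miss_rate * miss_penalty


NUM_SETS, LINE_SIZE = 4, 16            # assumed toy geometry (64-byte cache)
a, b = 0, NUM_SETS * LINE_SIZE         # one cache size apart: same set under modulo
trace = [a, b] * 8                     # alternating accesses cause conflict misses

baseline_misses = simulate(trace, NUM_SETS, LINE_SIZE)
# Assumed profile result: move the line holding `b` to the otherwise idle set 1.
custom_misses = simulate(trace, NUM_SETS, LINE_SIZE,
                         placement={b // LINE_SIZE: 1})

print(baseline_misses, custom_misses)                 # 16 2
print(amat(1, baseline_misses / len(trace), 20))      # 21.0
print(amat(1, custom_misses / len(trace), 20))        # 3.5
```

In this toy trace the two lines evict each other on every access under modulo indexing, while the remapped placement leaves only the two compulsory misses. The sketch ignores the extra address-translation delay that the abstract notes, which in a real design would slightly lengthen the hit path.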
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Ramaswamy, S., Yalamanchili, S. (2007). Customized Placement for High Performance Embedded Processor Caches. In: Lukowicz, P., Thiele, L., Tröster, G. (eds) Architecture of Computing Systems - ARCS 2007. ARCS 2007. Lecture Notes in Computer Science, vol 4415. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71270-1_6
DOI: https://doi.org/10.1007/978-3-540-71270-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71267-1
Online ISBN: 978-3-540-71270-1
eBook Packages: Computer Science, Computer Science (R0)