Abstract
In this paper, we study prefetching with flux caches for a media application. We analyze the MPEG4 encoder workload with a realistic data set in a scenario representative of the embedded systems domain. Our study shows that several well-known data prefetch mechanisms achieve only a small reduction in cache miss ratio when applied to the complete MPEG4 application. We then investigate the potential improvement when dedicated prefetching strategies are applied to the sum of absolute differences (SAD) kernels in MPEG4. We propose a flux cache mechanism that dynamically invokes cache designs with dedicated prefetching engines that can fully utilize the available memory bandwidth. We show that our proposal improves the cache miss ratios by a factor close to 3.
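For readers unfamiliar with the SAD kernel named above, the following is a minimal sketch of a block-wise sum of absolute differences as commonly used in motion estimation. It is not taken from the paper: the function name `sad_block`, the list-of-rows frame layout, and the default 16x16 block size are assumptions made for illustration only.

```python
from typing import List

# Hypothetical frame representation: a list of rows of 8-bit luma samples.
Frame = List[List[int]]


def sad_block(cur: Frame, ref: Frame,
              cx: int, cy: int, rx: int, ry: int,
              block: int = 16) -> int:
    """Sum of absolute differences between two block x block regions.

    (cx, cy) is the top-left corner of the block in the current frame,
    (rx, ry) the top-left corner of the candidate block in the reference
    frame. The 16x16 default is an assumption, not a value from the paper.
    """
    total = 0
    for row in range(block):
        cur_row = cur[cy + row]
        ref_row = ref[ry + row]
        for col in range(block):
            total += abs(cur_row[cx + col] - ref_row[rx + col])
    return total


# Usage: two flat 16x16 frames whose samples differ by 3 everywhere.
cur_frame = [[10] * 16 for _ in range(16)]
ref_frame = [[7] * 16 for _ in range(16)]
print(sad_block(cur_frame, ref_frame, 0, 0, 0, 0))  # 3 * 256 = 768
```

In a row-major frame buffer, each block row read by such a loop is contiguous while successive rows lie a full frame width apart. That regular, predictable access pattern is the kind of behavior a dedicated prefetching engine could plausibly target.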
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gaydadjiev, G.N., Vassiliadis, S. (2006). SAD Prefetching for MPEG4 Using Flux Caches. In: Vassiliadis, S., Wong, S., Hämäläinen, T.D. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2006. Lecture Notes in Computer Science, vol 4017. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11796435_26
DOI: https://doi.org/10.1007/11796435_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36410-8
Online ISBN: 978-3-540-36411-5
eBook Packages: Computer Science, Computer Science (R0)