research-article

Modeling and exploiting spatial locality trade-offs in wavelet-based applications under varying resource requirements

Published: 05 March 2010

Abstract

Future dynamic applications will require new mapping strategies to deliver power-efficient performance. Fully static design-time mappings will not be able to optimally address unpredictably varying application characteristics and system resource requirements. Instead, platforms will need not only to be programmable in terms of instruction-set processors but also to be at least partially reconfigurable, while the applications themselves will need to exploit this increased freedom at runtime to adapt to the dynamism. In this context, it is important for applications to optimally exploit the memory hierarchy under varying memory availability. This article presents an analysis of spatial locality trade-offs in wavelet-based applications, to be used in dynamic execution environments: depending on the runtime conditions encountered, execution switches between different memory-optimized instantiations, or localizations, each optimally exploiting temporal and spatial locality under those conditions. This is enabled by systematic mapping guidelines, which indicate how the miss-rate behavior of a localization is influenced by a specific execution condition, under which conditions a given localization is optimal, and which miss-rate gains may be obtained by switching to that localization.



        • Published in

          ACM Transactions on Embedded Computing Systems  Volume 9, Issue 3
          February 2010
          442 pages
          ISSN:1539-9087
          EISSN:1558-3465
          DOI:10.1145/1698772

          Copyright © 2010 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 5 March 2010
          • Accepted: 1 January 2009
          • Revised: 1 October 2008
          • Received: 1 November 2007

          Qualifiers

          • research-article
          • Research
          • Refereed
