
Modeling and exploiting spatial locality trade-offs in wavelet-based applications under varying resource requirements

Published: 05 March 2010

Abstract

Future dynamic applications will require new mapping strategies to deliver power-efficient performance. Fully static design-time mappings cannot optimally address unpredictably varying application characteristics and system resource requirements. Instead, platforms will need to be not only programmable in terms of instruction set processors but also at least partially reconfigurable, while the applications themselves will need to exploit this increased freedom at runtime to adapt to the dynamism. In this context, it is important for applications to optimally exploit the memory hierarchy under varying memory availability. This article presents an analysis of spatial locality trade-offs in wavelet-based applications, to be used in dynamic execution environments: depending on the encountered runtime conditions, the execution switches to different memory-optimized instantiations, or localizations, optimally exploiting temporal and spatial locality under these conditions. This is enabled by systematic mapping guidelines, which indicate how the miss-rate behavior of a localization is influenced by a specific execution condition, under which conditions a certain localization is optimal, and which miss-rate gains may be obtained by switching to that localization.
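The idea that the best localization depends on memory availability can be illustrated with a toy model. The sketch below is not the article's method: it assumes a fully associative LRU cache, a row-major 2D array, and two hypothetical traversal orders (`row_order`, `column_order`) standing in for localizations. All function names and parameters are illustrative assumptions.

```python
from collections import OrderedDict


def count_misses(addresses, cache_lines, line_words):
    """Count misses in an assumed fully associative LRU cache.

    cache_lines: number of lines the cache can hold
    line_words:  number of consecutive words per cache line
    """
    cache = OrderedDict()  # keys are line indices, ordered oldest to newest
    misses = 0
    for addr in addresses:
        line = addr // line_words
        if line in cache:
            cache.move_to_end(line)  # refresh LRU position on a hit
        else:
            misses += 1
            cache[line] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses


def row_order(rows, cols):
    """Word addresses of a row-major array visited row by row."""
    return [r * cols + c for r in range(rows) for c in range(cols)]


def column_order(rows, cols):
    """Word addresses of the same array visited column by column."""
    return [r * cols + c for c in range(cols) for r in range(rows)]


def pick_localization(candidates, cache_lines, line_words):
    """Return the candidate name with the fewest misses, plus all scores."""
    scores = {
        name: count_misses(trace, cache_lines, line_words)
        for name, trace in candidates.items()
    }
    best = min(scores, key=scores.get)  # ties resolve to the first candidate
    return best, scores


candidates = {"row": row_order(8, 8), "column": column_order(8, 8)}
small_best, small_scores = pick_localization(candidates, cache_lines=2, line_words=4)
large_best, large_scores = pick_localization(candidates, cache_lines=8, line_words=4)
```

In this toy setting the column-wise traversal misses on every access when only two cache lines are available, but matches the row-wise traversal once eight lines fit. A runtime selector of this kind would re-evaluate the choice whenever the available memory changes.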



        Published In

        ACM Transactions on Embedded Computing Systems  Volume 9, Issue 3
        February 2010
        442 pages
        ISSN:1539-9087
        EISSN:1558-3465
        DOI:10.1145/1698772

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Publication History

        Published: 05 March 2010
        Accepted: 01 January 2009
        Revised: 01 October 2008
        Received: 01 November 2007
        Published in TECS Volume 9, Issue 3


        Author Tags

        1. Dynamism
        2. loop transformations
        3. wavelet transform

        Qualifiers

        • Research-article
        • Research
        • Refereed
