
Systematic Application of Data Transfer and Storage Optimizing Code Transformations for Power Consumption and Execution Time Reduction in ACROPOLIS: A Pre-Compiler for Multimedia Applications

Design Automation for Embedded Systems

Abstract

A systematic methodology for the application of data transfer and storage optimizing code transformations to high-level descriptions of multimedia systems realized on instruction set processors is proposed. A detailed order for the application of different data transfer and storage optimizing transformations is proposed in the context of combined execution time and power optimizations. A use methodology including a number of support steps that allow the efficient application of the data transfer and storage oriented transformations is proposed as well. Application of the proposed transformation-based methodology moves the main part of the memory accesses from the large background memories (lying possibly off-chip) to smaller ones (on-chip) or even to foreground storage (registers). Data cache performance is improved, thus reducing power consumption in the data memory hierarchy and related interconnects. Execution time and the power consumption due to instruction storage and transfers are reduced as well after the application of the proposed methodology. Experimental results from several real-life multimedia applications prove the effectiveness of the proposed methodology. The proposed approach has been applied in the context of realizations on custom hardware processors as well with promising results.
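The shift of accesses from background memory to foreground storage can be illustrated with a small sketch. It is not taken from the paper and does not reproduce the ACROPOLIS transformation order; it assumes two well-known transformations (loop fusion followed by scalar replacement) applied to a toy two-pass kernel. The names `CountedArray`, `two_pass`, `fused`, and `count_accesses` are hypothetical, and the access counter is only a stand-in for traffic to a large memory.

```python
class CountedArray:
    """A list that counts element reads and writes.

    Hypothetical stand-in for an array held in a large background memory.
    """

    def __init__(self, data):
        self._data = list(data)
        self.accesses = 0

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):
        self.accesses += 1
        return self._data[i]

    def __setitem__(self, i, value):
        self.accesses += 1
        self._data[i] = value

    def snapshot(self):
        """Return the contents without touching the access counter."""
        return list(self._data)


def two_pass(src, tmp, dst):
    """Untransformed form: the intermediate signal is stored in a full array."""
    n = len(src)
    for i in range(n - 1):
        tmp[i] = src[i] + src[i + 1]   # pass 1: pairwise sum
    for i in range(n - 1):
        dst[i] = tmp[i] // 2           # pass 2: scale


def fused(src, dst):
    """After loop fusion and scalar replacement.

    The intermediate array disappears, and the local variables `prev` and
    `cur` play the role of registers, so each input element is read once.
    """
    n = len(src)
    if n == 0:
        return
    prev = src[0]
    for i in range(n - 1):
        cur = src[i + 1]
        dst[i] = (prev + cur) // 2
        prev = cur


def count_accesses(samples):
    """Run both variants on the same input; return (outputs, access totals)."""
    n = len(samples)
    src_a, tmp_a, dst_a = (CountedArray(samples),
                           CountedArray([0] * (n - 1)),
                           CountedArray([0] * (n - 1)))
    two_pass(src_a, tmp_a, dst_a)
    before = src_a.accesses + tmp_a.accesses + dst_a.accesses

    src_b, dst_b = CountedArray(samples), CountedArray([0] * (n - 1))
    fused(src_b, dst_b)
    after = src_b.accesses + dst_b.accesses

    return dst_a.snapshot(), dst_b.snapshot(), before, after
```

For an input of eight samples, the untransformed version performs 35 counted array accesses and the fused version 15, while both produce the same output. The counts say nothing about actual power or execution time, which depend on the memory hierarchy and the target processor.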




Cite this article

Masselos, K., Catthoor, F., Goutis, C.E. et al. Systematic Application of Data Transfer and Storage Optimizing Code Transformations for Power Consumption and Execution Time Reduction in ACROPOLIS: A Pre-Compiler for Multimedia Applications. Design Automation for Embedded Systems 8, 51–86 (2003). https://doi.org/10.1023/A:1022340119745


  • DOI: https://doi.org/10.1023/A:1022340119745
