Combined Application of Data Transfer and Storage Optimizing Transformations and Subword Parallelism Exploitation for Power Consumption and Execution Time Reduction in VLIW Multimedia Processors

Journal of VLSI signal processing systems for signal, image and video technology

Abstract

This paper addresses the main issues in mapping data-dominated multimedia applications onto Very Long Instruction Word (VLIW) multimedia processors. The main design quality factors of applications realized on this target architecture platform are presented and their interactions are explored. Power consumption is the major cost factor, while performance is the overriding constraint. A methodology has been developed to reduce both the execution time of applications realized on VLIW multimedia processors and the power consumed by data transfer and storage, which forms an important part of the total power budget of the system. The methodology applies a number of transformations, mainly oriented towards data transfer and storage optimization, to a high-level description of the target application. The main focus of this paper is the interaction of the proposed code transformations with the exploitation of subword parallelism (for example, through the special performance-improving arithmetic subword instructions present in modern VLIW multimedia processors). Experimental results from real-life data-dominated multimedia applications demonstrate that the proposed transformations are orthogonal to the exploitation of subword parallelism. A second conclusion is that, for the complete application, the positive impact of the proposed code transformations on performance is typically even larger than the effect of exploiting subword parallelism. Moreover, the effect of exploiting subword parallelism is itself enhanced once the proposed code transformations have been applied.
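The two ingredients named in the abstract can be illustrated with a small sketch. It is not taken from the paper: the abstract does not list the individual transformations, so a generic data-reuse rewrite of a 3-tap window sum stands in for a data transfer and storage optimization, and a software emulation of a packed 8-bit add stands in for a subword instruction. All names (`pack4`, `unpack4`, `add4x8`, `CountingMemory`, `window_sums_naive`, `window_sums_reuse`) are hypothetical, and the 32-bit word with four 8-bit lanes is an assumption.

```python
# Illustrative sketch only: every helper here is hypothetical and none of it
# is the paper's actual methodology or any processor's real instruction set.

LOW7 = 0x7F7F7F7F   # low 7 bits of each 8-bit lane
HIGH1 = 0x80808080  # top bit of each 8-bit lane


def pack4(values):
    """Pack four 8-bit values into one 32-bit word (lane 0 in the low byte)."""
    word = 0
    for lane, v in enumerate(values):
        word |= (v & 0xFF) << (8 * lane)
    return word


def unpack4(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (8 * lane)) & 0xFF for lane in range(4)]


def add4x8(a, b):
    """Lane-wise modulo-256 add of two packed words (emulated subword add)."""
    # Add the low 7 bits of every lane; the result fits in 8 bits per lane,
    # so no carry can cross a lane boundary.
    low = (a & LOW7) + (b & LOW7)
    # Fold in the top bit of each lane without carrying out of the lane.
    return low ^ ((a ^ b) & HIGH1)


class CountingMemory:
    """Array wrapper that counts reads, standing in for background memory."""

    def __init__(self, data):
        self.data = list(data)
        self.reads = 0

    def load(self, i):
        self.reads += 1
        return self.data[i]


def window_sums_naive(mem, n):
    """3-tap window sum over n inputs: each output re-reads all three inputs."""
    return [mem.load(i) + mem.load(i + 1) + mem.load(i + 2)
            for i in range(n - 2)]


def window_sums_reuse(mem, n):
    """Same result for n >= 3, keeping two inputs in local variables."""
    out = []
    x0, x1 = mem.load(0), mem.load(1)
    for i in range(n - 2):
        x2 = mem.load(i + 2)  # only one new memory read per output
        out.append(x0 + x1 + x2)
        x0, x1 = x1, x2
    return out
```

In this sketch the reuse version cuts memory reads from 3(n - 2) to n, independently of whether the arithmetic is done one value at a time or four lanes at a time with `add4x8`. That independence is the sense in which the two kinds of optimization could be combined; how the paper measures their interaction is not shown here.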

Cite this article

Masselos, K., Catthoor, F., Goutis, C.E. et al. Combined Application of Data Transfer and Storage Optimizing Transformations and Subword Parallelism Exploitation for Power Consumption and Execution Time Reduction in VLIW Multimedia Processors. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 37, 53–73 (2004). https://doi.org/10.1023/B:VLSI.0000017003.70829.34

  • DOI: https://doi.org/10.1023/B:VLSI.0000017003.70829.34
