System-Level Data-Flow Transformation Exploration and Power-Area Trade-offs Demonstrated on Video Codecs

Journal of VLSI signal processing systems for signal, image and video technology

Abstract

Application studies in the domain of image and video processing systems indicate that up to 80% of the power and area cost in customized architectures for such data-dominant processing is due to the storage and transfer of multi-dimensional (M-D) data. This paper makes two main contributions. First, as a crucial step toward reducing this dominant cost, we propose an exploration sub-script focused on data-flow transformations that address the system-level storage organization. This sub-script fits within a complete high-level memory management methodology developed in the context of our ATOMIUM research activity. We also indicate the potential for future design support in each stage of the sub-script. Second, we demonstrate the usefulness of the stages in this novel system exploration approach on realistic test vehicles, in particular crucial modules of a complex H.263 video decoder system for teleconferencing.




Cite this article

Catthoor, F., Janssen, M., Nachtergaele, L. et al. System-Level Data-Flow Transformation Exploration and Power-Area Trade-offs Demonstrated on Video Codecs. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 18, 39–50 (1998). https://doi.org/10.1023/A:1007941326114

