Abstract
Data access usually accounts for more than 50% of the power consumption in a modern signal processing system, so reducing memory access power is critical to a low-power design. Data reuse (DR) is a technique that recycles data already read from memory and can therefore reduce memory access power. This paper presents a systematic method of DR exploration for low-power architecture design. First, the signal processing algorithm is formulated as a nested-loop structure, and its data locality is explored through loop analysis. Then, the corresponding DR techniques are applied to reduce memory access power. The proposed design methodology is applied to the motion estimation (ME) algorithms of the H.264 video coding standard. After analyzing the ME algorithms, suitable parallel architectures and processing flows for integer ME (IME) and fractional ME (FME) are proposed to achieve efficient DR. The amount of memory access is reduced to 0.91% and 4.37% in the proposed IME and FME designs, respectively, which saves substantial memory access power. Finally, the design methodology is also beneficial for other signal processing systems with low-power requirements.
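As a rough illustration of the loop-analysis idea, and not the architecture proposed in the paper, the sketch below writes full-search block matching as a nested loop and counts reference-pixel reads. It compares the number of reads issued when every candidate fetches its block from memory with the number of distinct pixels actually touched, which is what a local search-window buffer would need to load once. The function names, the block size, and the search range are assumptions chosen for illustration only.

```python
def count_reference_reads(block: int, search_range: int) -> tuple[int, int]:
    """Count reference-pixel reads for one block under full search.

    Returns (reads_without_reuse, distinct_pixels), where
    reads_without_reuse is the number of reads issued when every
    candidate position fetches its whole block from memory, and
    distinct_pixels is the number of unique reference pixels touched
    (the reads needed if the search window is buffered locally).
    """
    reads = 0
    touched = set()
    # Outer two loops: candidate displacements in the search range.
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Inner two loops: pixels of the candidate block.
            for y in range(block):
                for x in range(block):
                    reads += 1
                    touched.add((y + dy, x + dx))
    return reads, len(touched)


def reuse_ratio(block: int, search_range: int) -> float:
    """Fraction of the original reads that remain with window buffering."""
    reads, distinct = count_reference_reads(block, search_range)
    return distinct / reads


if __name__ == "__main__":
    # Hypothetical small example: 4x4 block, displacements in [-2, 2].
    reads, distinct = count_reference_reads(4, 2)
    print(reads, distinct, reuse_ratio(4, 2))
```

The gap between the two counts comes from neighboring candidates overlapping in most of their pixels. This is the kind of data locality that loop analysis exposes. The ratio here only reflects this simplified single-block model and is not comparable to the figures reported for the proposed IME and FME designs.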
About this article
Cite this article
Chen, YH., Chen, TC., Tsai, CY. et al. Data Reuse Exploration for Low Power Motion Estimation Architecture Design in H.264 Encoder. J Sign Process Syst Sign Image 50, 1–17 (2008). https://doi.org/10.1007/s11265-007-0112-3
DOI: https://doi.org/10.1007/s11265-007-0112-3