Buffer structure optimized VLSI architecture for efficient hierarchical integer pixel motion estimation implementation

  • Original Research Paper
Journal of Real-Time Image Processing

Abstract

Integer pixel motion estimation (IME) is a crucial, high-complexity module in high-definition video encoders. Joint algorithm and architecture design must trade off multiple target parameters, including throughput, logic gate count, on-chip SRAM size, memory bandwidth, and rate-distortion performance. Data organization and on-chip buffer structure are key factors in IME architecture design because they govern this multi-parameter tradeoff. In this work, we combine global hierarchical search with local full search to obtain a hardware-efficient IME algorithm, and then propose a VLSI architecture with an optimized on-chip buffer structure. The major contributions of this work are: (1) an improved hierarchical IME algorithm with presearch and deliberate data organization; (2) a multistage on-chip reference pixel buffer structure with high data reuse between integer and fractional pixel motion estimation; and (3) a highly reused and reconfigurable processing element structure. The optimized data organization and buffer structure achieve nearly 70% buffer saving, with PSNR degradation of less than 0.08 dB on average and 0.12 dB in the worst case, compared with a full-search-based architecture. At a hardware cost of 336 K and 382 K logic gates and 20 kB of SRAM, the proposed architecture achieves a throughput of 384 and 272 cycles per macroblock at system frequencies of 95 and 264 MHz for 1080p and QFHD video coding at 30 fps, respectively.
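The cycle counts and clock frequencies quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is not taken from the paper. It assumes 16×16 macroblocks, frame heights padded up to a whole number of macroblocks, and a QFHD resolution of 3840×2160; the function names are hypothetical.

```python
import math

MB_SIZE = 16  # assumed macroblock edge in pixels (16x16, as in H.264/AVC)


def macroblocks_per_frame(width, height):
    """Number of macroblocks per frame, padding each dimension up to a multiple of MB_SIZE."""
    return math.ceil(width / MB_SIZE) * math.ceil(height / MB_SIZE)


def required_mhz(width, height, fps, cycles_per_mb):
    """Minimum clock in MHz to process every macroblock of every frame in real time."""
    return macroblocks_per_frame(width, height) * fps * cycles_per_mb / 1e6


# Cycle budgets and frame rate from the abstract; resolutions are assumptions.
fhd_mhz = required_mhz(1920, 1080, 30, 384)   # 1080p
qfhd_mhz = required_mhz(3840, 2160, 30, 272)  # QFHD, assumed 3840x2160

print(f"1080p: {fhd_mhz:.1f} MHz")
print(f"QFHD:  {qfhd_mhz:.1f} MHz")
```

Under these assumptions the required clocks come out at roughly 94 MHz and 264 MHz, which is consistent with the 95 and 264 MHz system frequencies stated in the abstract.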



Acknowledgments

The authors would like to thank all the reviewers for their thoughtful comments and suggestions, which helped to improve the technical presentation of this paper. This work was supported by NSFC 60802025, ZJNSF Y1110114 LY12F01011, S&T project of Zhejiang province 2010C310075, the open project of State Key Laboratory of ASIC & System of Fudan University 10KF010, and the open project of SKL of Novel Soft Technology, Nanjing University (KFKT2012B09).


Correspondence to Haibing Yin.


About this article

Cite this article

Yin, H., Park, D.S. & Zhang, X.Y. Buffer structure optimized VLSI architecture for efficient hierarchical integer pixel motion estimation implementation. J Real-Time Image Proc 11, 507–525 (2016). https://doi.org/10.1007/s11554-013-0341-6


  • DOI: https://doi.org/10.1007/s11554-013-0341-6
