Bit-by-Bit Pipelined and Hybrid-Grained 2D Architecture for Motion Estimation of H.264/AVC

Journal of Signal Processing Systems

Abstract

In H.264/AVC, the motion estimation (ME) routine supports variable block sizes and involves highly parallel sum of absolute differences (SAD) computations. In this study, we introduce a 2D architecture based on bit-serial, hybrid-grained processing elements (PEs) that offers both early termination and intensive data reuse. The PEs use most-significant-bit-first (MSB-first) arithmetic to enable early termination, and the 2D architecture enables on-chip data reuse between neighboring PEs in a bit-by-bit pipelined fashion. Hybrid-grained PEs reduce the hardware overhead of the conventional adder tree structures used to implement variable block size ME. Our design reduces the gate count by 7x compared to its ASIC counterpart, operates at a comparable frequency while sustaining 30 fps and 60 fps, and outperforms bit-parallel and bit-serial architectures in throughput and performance per gate for various video formats.
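
The early-termination idea mentioned in the abstract can be illustrated in software. The following is a minimal sketch, not the authors' hardware design: it assumes 8-bit pixels, computes the absolute differences up front, and then accumulates them one bit plane at a time from the most significant bit down. The partial sum after each plane is a lower bound on the final SAD, so a candidate block can be rejected as soon as that bound reaches the best SAD found so far. The function name `sad_msb_first` and its parameters are hypothetical and chosen for illustration only.

```python
def sad_msb_first(cur, ref, best_so_far, bits=8):
    """Accumulate a SAD bit plane by bit plane, MSB first (illustrative sketch).

    Returns (sad, planes_used). sad is None when the candidate is rejected
    early because its lower bound already reached best_so_far.
    """
    # Assumption: absolute differences are available in full here; a real
    # bit-serial PE would produce these digits on the fly.
    diffs = [abs(c - r) for c, r in zip(cur, ref)]

    partial = 0  # lower bound on the final SAD after each processed plane
    for plane in range(bits - 1, -1, -1):
        ones = sum((d >> plane) & 1 for d in diffs)
        partial += ones << plane
        if partial >= best_so_far:
            # This candidate cannot beat the current best: stop early.
            return None, bits - plane
    return partial, bits


if __name__ == "__main__":
    cur = [200, 10, 30, 40]
    ref = [10, 10, 30, 40]
    print(sad_msb_first(cur, ref, best_so_far=100))   # rejected after one plane
    print(sad_msb_first(cur, ref, best_so_far=1000))  # full SAD is computed
```

In this sketch a poor candidate is discarded after a single bit plane, while a competitive candidate is processed through all eight planes; the savings in a hardware implementation depend on details that the abstract does not specify.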

Acknowledgements

The authors would like to thank Gregory Striemer for his contributions to this paper during the analysis of the results.

Author information

Corresponding author

Correspondence to Yang Song.

About this article

Cite this article

Song, Y., Akoglu, A. Bit-by-Bit Pipelined and Hybrid-Grained 2D Architecture for Motion Estimation of H.264/AVC. J Sign Process Syst 68, 49–62 (2012). https://doi.org/10.1007/s11265-010-0575-5

  • DOI: https://doi.org/10.1007/s11265-010-0575-5