
Algorithm and Software Optimization of Variable Block Size Motion Estimation for H.264/AVC on a VLIW–SIMD DSP

Journal of Signal Processing Systems

Abstract

We implemented H.264/AVC variable block size motion estimation (VBSME) on a very long instruction word (VLIW)–single instruction multiple data (SIMD) digital signal processor (DSP). The SAD_Reuse method, which has a regular structure, is chosen for VBSME not only to remove redundant sum of absolute difference (SAD) operations but also to exploit the instruction level parallelism (ILP) and data level parallelism (DLP) of the architecture. A fast mode decision algorithm is developed to reduce the number of ‘compare and update’ operations and to simplify the rate distortion optimization (RDO). It uses the difference of motion vectors and the maximum a posteriori (MAP) estimation of the rate-distortion costs. Several advanced software techniques, including software pipelining and packed-data processing, are employed. In particular, memory access overhead reduction schemes, including multi-block processing and inter-procedural scheduling, are used for the software optimization. To reduce ‘write buffer full’ stalls in the quarter-pixel ME, a 4-bit quantization scheme is developed; it increases the number of arithmetic operations but substantially decreases the stall cycles. The implemented variable block size ME for H.264/AVC requires an average of 9 Mcycles and 78 Mcycles per frame for QCIF and CIF video sequences, respectively, on the TMS320C64x DSP architecture.
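The SAD-reuse idea mentioned in the abstract can be illustrated with a minimal sketch. It is plain scalar C, not the authors' optimized C64x code, and it uses no SIMD intrinsics or software pipelining. The names `VbsSad`, `sad_4x4`, and `vbs_sad_reuse` are hypothetical and chosen only for this illustration. For one candidate motion vector, only the sixteen 4x4 SADs of a 16x16 macroblock are computed from pixels; the remaining 25 partition SADs (8x4, 4x8, 8x8, 16x8, 8x16, 16x16) are obtained by adding already computed values.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container: SADs of all 41 partitions of one 16x16
   macroblock for one candidate motion vector, in raster order. */
typedef struct {
    uint32_t sad4x4[16];  /* 4 rows x 4 columns of 4x4 blocks        */
    uint32_t sad8x4[8];   /* 8 wide, 4 high: 4 rows x 2 columns      */
    uint32_t sad4x8[8];   /* 4 wide, 8 high: 2 rows x 4 columns      */
    uint32_t sad8x8[4];   /* 2 rows x 2 columns                      */
    uint32_t sad16x8[2];  /* top half, bottom half                   */
    uint32_t sad8x16[2];  /* left half, right half                   */
    uint32_t sad16x16;
} VbsSad;

/* Pixel-level SAD of one 4x4 block; both images share the same stride. */
static uint32_t sad_4x4(const uint8_t *cur, const uint8_t *ref, int stride)
{
    uint32_t sum = 0;
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            sum += (uint32_t)abs(cur[y * stride + x] - ref[y * stride + x]);
    return sum;
}

/* cur and ref point to the top-left pixel of the current macroblock and
   of the displaced reference block, respectively. */
void vbs_sad_reuse(const uint8_t *cur, const uint8_t *ref, int stride,
                   VbsSad *out)
{
    assert(stride >= 16);

    /* Step 1: the only pass that touches pixels. */
    for (int by = 0; by < 4; by++)
        for (int bx = 0; bx < 4; bx++) {
            int off = (by * 4) * stride + bx * 4;
            out->sad4x4[by * 4 + bx] = sad_4x4(cur + off, ref + off, stride);
        }

    /* Step 2: 8x4 = two horizontally adjacent 4x4 blocks. */
    for (int by = 0; by < 4; by++)
        for (int bx = 0; bx < 2; bx++)
            out->sad8x4[by * 2 + bx] = out->sad4x4[by * 4 + 2 * bx]
                                     + out->sad4x4[by * 4 + 2 * bx + 1];

    /* 4x8 = two vertically adjacent 4x4 blocks. */
    for (int by = 0; by < 2; by++)
        for (int bx = 0; bx < 4; bx++)
            out->sad4x8[by * 4 + bx] = out->sad4x4[(2 * by) * 4 + bx]
                                     + out->sad4x4[(2 * by + 1) * 4 + bx];

    /* 8x8 = two vertically adjacent 8x4 blocks. */
    for (int by = 0; by < 2; by++)
        for (int bx = 0; bx < 2; bx++)
            out->sad8x8[by * 2 + bx] = out->sad8x4[(2 * by) * 2 + bx]
                                     + out->sad8x4[(2 * by + 1) * 2 + bx];

    /* Larger partitions are sums of 8x8 results. */
    out->sad16x8[0] = out->sad8x8[0] + out->sad8x8[1];
    out->sad16x8[1] = out->sad8x8[2] + out->sad8x8[3];
    out->sad8x16[0] = out->sad8x8[0] + out->sad8x8[2];
    out->sad8x16[1] = out->sad8x8[1] + out->sad8x8[3];
    out->sad16x16   = out->sad16x8[0] + out->sad16x8[1];
}
```

In this sketch each pixel difference is computed once per candidate vector, and the seven partition sizes share that work through 25 additions. The regular, branch-free combining step is the kind of structure that a VLIW–SIMD compiler can schedule well, which is consistent with the motivation given in the abstract.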



Author information

Corresponding author

Correspondence to Wonchul Lee.

About this article

Cite this article

Lee, W., Choi, H. & Sung, W. Algorithm and Software Optimization of Variable Block Size Motion Estimation for H.264/AVC on a VLIW–SIMD DSP. J Sign Process Syst Sign Image 51, 289–302 (2008). https://doi.org/10.1007/s11265-007-0151-9

  • DOI: https://doi.org/10.1007/s11265-007-0151-9
