Abstract
We implemented H.264/AVC variable block size motion estimation (VBSME) on a very long instruction word (VLIW)–single instruction multiple data (SIMD) digital signal processor (DSP). The SAD_Reuse method, which has a regular structure, was chosen for VBSME both to remove redundant sum of absolute differences (SAD) operations and to exploit the instruction level parallelism (ILP) and data level parallelism (DLP) of the architecture. A fast mode decision algorithm was developed to reduce the number of ‘compare and update’ operations and to simplify the rate distortion optimization (RDO); it uses the difference of motion vectors and the maximum a posteriori (MAP) estimation of the rate-distortion costs. Several advanced software techniques, including software pipelining and packed-data processing, are employed. In particular, schemes that reduce memory access overhead, including multi-block processing and inter-procedural scheduling, are used for the software optimization. To reduce ‘write buffer full’ stalls in the quarter-pixel ME, a 4-bit quantization scheme was developed, which increases the number of arithmetic operations but substantially decreases the stall cycles. The implemented variable block size ME for H.264/AVC requires an average of 9 Mcycles and 78 Mcycles per frame for QCIF and CIF size video sequences, respectively, on the TMS320C64x DSP architecture.
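The SAD reuse idea mentioned above can be illustrated with a minimal scalar sketch in C. It is not the authors' DSP code: the struct `VbsSad`, the functions `sad4x4` and `vbs_sad_reuse`, and the raster ordering of the partitions are hypothetical names and choices made for illustration, and no VLIW or SIMD intrinsics are used. For one candidate motion vector, only the sixteen 4x4 SADs are computed from pixels; the SADs of the remaining 25 partitions (8x4, 4x8, 8x8, 16x8, 8x16, 16x16) are obtained by additions alone.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* SADs of all 41 H.264 partitions of one 16x16 macroblock
 * at a single candidate motion vector (hypothetical layout). */
typedef struct {
    uint32_t s4x4[16];  /* raster order, 4 blocks per row       */
    uint32_t s8x4[8];   /* 8 wide, 4 tall: 2 per row, 4 rows    */
    uint32_t s4x8[8];   /* 4 wide, 8 tall: 4 per row, 2 rows    */
    uint32_t s8x8[4];   /* raster order, 2 blocks per row       */
    uint32_t s16x8[2];  /* top, bottom                          */
    uint32_t s8x16[2];  /* left, right                          */
    uint32_t s16x16;
} VbsSad;

/* Plain SAD of one 4x4 block; the only routine that touches pixels. */
static uint32_t sad4x4(const uint8_t *cur, const uint8_t *ref, int stride)
{
    uint32_t s = 0;
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            s += (uint32_t)abs(cur[y * stride + x] - ref[y * stride + x]);
    return s;
}

/* cur and ref point at the top-left pixel of the current macroblock and
 * of the displaced reference block; both use the same row stride. */
void vbs_sad_reuse(const uint8_t *cur, const uint8_t *ref, int stride,
                   VbsSad *out)
{
    assert(stride >= 16);

    /* Step 1: sixteen 4x4 SADs computed from pixels. */
    for (int by = 0; by < 4; by++)
        for (int bx = 0; bx < 4; bx++) {
            int off = 4 * by * stride + 4 * bx;
            out->s4x4[by * 4 + bx] = sad4x4(cur + off, ref + off, stride);
        }

    /* Step 2: larger partitions are sums of already computed SADs. */
    for (int by = 0; by < 4; by++)          /* 8x4 = two 4x4 side by side */
        for (int bx = 0; bx < 2; bx++)
            out->s8x4[by * 2 + bx] = out->s4x4[by * 4 + 2 * bx]
                                   + out->s4x4[by * 4 + 2 * bx + 1];

    for (int by = 0; by < 2; by++)          /* 4x8 = two 4x4 stacked */
        for (int bx = 0; bx < 4; bx++)
            out->s4x8[by * 4 + bx] = out->s4x4[(2 * by) * 4 + bx]
                                   + out->s4x4[(2 * by + 1) * 4 + bx];

    for (int by = 0; by < 2; by++)          /* 8x8 = two 8x4 stacked */
        for (int bx = 0; bx < 2; bx++)
            out->s8x8[by * 2 + bx] = out->s8x4[(2 * by) * 2 + bx]
                                   + out->s8x4[(2 * by + 1) * 2 + bx];

    out->s16x8[0] = out->s8x8[0] + out->s8x8[1];
    out->s16x8[1] = out->s8x8[2] + out->s8x8[3];
    out->s8x16[0] = out->s8x8[0] + out->s8x8[2];
    out->s8x16[1] = out->s8x8[1] + out->s8x8[3];
    out->s16x16   = out->s16x8[0] + out->s16x8[1];
}
```

In this sketch each pixel difference is evaluated once per candidate position, and the fixed, branch-free addition tree is the kind of regular structure that lends itself to software pipelining and packed-data instructions.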
Cite this article
Lee, W., Choi, H. & Sung, W. Algorithm and Software Optimization of Variable Block Size Motion Estimation for H.264/AVC on a VLIW–SIMD DSP. J Sign Process Syst Sign Image 51, 289–302 (2008). https://doi.org/10.1007/s11265-007-0151-9
DOI: https://doi.org/10.1007/s11265-007-0151-9