Fast variable-size block motion estimation for efficient H.264/AVC encoding☆
Introduction
Motion estimation (ME) and motion compensation are critical components in video coding systems. Block matching ME, which searches for the most similar block in certain areas of a previously encoded frame, reduces the temporal redundancy of the block and represents it with a displacement motion vector. The fixed block-size motion estimation (FBME) with a 16×16 block-size achieves good coding efficiency and has been adopted widely in international standards, such as MPEG-1 [1], MPEG-2 [2], H.261 [3], and H.263 [4]. After removing the temporal redundancy, the residual block can be compressed by spatial transformation and quantization to remove the spatial redundancy. The motion vectors and the quantized transform coefficients can be further compacted by differential and variable length coding techniques. It is well known that the FBME is the most computationally demanding function in the aforementioned video coders.
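As a minimal illustration of block matching, the following sketch performs an exhaustive (full) search with the SAD criterion. It is not the authors' implementation: the function names, the ±4 pixel search radius, and the synthetic test frames are assumptions made only for this example.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

def full_search(cur, ref, top, left, size=16, radius=4):
    """Test every displacement within +/-radius pixels (hypothetical helper).

    Returns ((dy, dx), best_sad) for the block of `cur` whose top-left
    corner is at (top, left).
    """
    block = cur[top:top + size, left:left + size]
    best_mv = (0, 0)
    best_cost = sad(block, ref[top:top + size, left:left + size])
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > ref.shape[0] or x + size > ref.shape[1]:
                continue
            cost = sad(block, ref[y:y + size, x:x + size])
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost

# Synthetic frames (assumption): the current frame is the reference frame
# shifted 2 rows down and 3 columns left.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))

mv, cost = full_search(cur, ref, top=24, left=24)
print(mv, cost)
```

With a radius of 4 the search evaluates up to 81 candidates per block, which indicates why an exhaustive search over a large window dominates the encoder's computation.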
Many fast and efficient algorithms have been developed in the past decades to overcome the computation bottleneck of the FBME. Usually, these fast algorithms reduce the computation by using either simple matching criteria or efficient search patterns, the latter based on the assumption of a unimodal error surface. The matching criteria include the sum of squared differences (SSD) (or mean square error (MSE)), the sum of absolute differences (SAD) (or mean absolute difference (MAD)), etc. Efficient search patterns include three-step search (TSS) [6], novel three-step search (NTSS) [7], four-step search [8], diamond search (DS) [9], and hexagon-based search (HBS) [10]. The DS method is particularly effective for slow motion video sequences. The HBS algorithm is attractive since its complexity and quality performance are close to those of DS, and its speed-up is especially evident for large motion video. The ME algorithms using efficient search patterns usually suffer from being trapped at local minima. In [11], the diversity-based search strategy (DSS) combines the strengths of the DS and TSS search patterns to cover different kinds of motion and thus avoid being trapped at local minima. Some ME algorithms focus on selecting a better search-center from the motion vectors of adjacent blocks and co-located blocks in the previous frame [12], [13], [14]. The search-center could be obtained from several pre-checked search-points that result in acceptable matching errors in the search area. Starting from a better search-center not only reduces the number of search-points but also improves the performance, since it lowers the probability of being trapped at local minima. The concept of initializing a better search-center, or predicting the motion vector, can be combined with the fast algorithms described above.
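To make the search-pattern idea concrete, the sketch below implements a basic three-step search over an arbitrary cost function. It is a simplified textbook form rather than the exact procedure of [6]; the function name, the callable interface, and the quadratic test surface are assumptions chosen so that the error surface is unimodal.

```python
def three_step_search(cost, step=4):
    """Three-step search over a cost(dy, dx) callable (illustrative sketch).

    Starts at (0, 0), checks the 8 neighbours at the current step size,
    moves to the best point, then halves the step until it reaches 1.
    Returns (best_mv, best_cost, points_checked).
    """
    center = (0, 0)
    best_cost = cost(*center)
    checked = 1
    while step >= 1:
        best = center
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dy == 0 and dx == 0:
                    continue  # the centre has already been evaluated
                cand = (center[0] + dy, center[1] + dx)
                c = cost(*cand)
                checked += 1
                if c < best_cost:
                    best, best_cost = cand, c
        center = best
        step //= 2
    return center, best_cost, checked

def surface(dy, dx):
    """Synthetic unimodal error surface with its minimum at (6, -7)."""
    return (dy - 6) ** 2 + (dx + 7) ** 2

mv, err, n = three_step_search(surface)
print(mv, err, n)  # (6, -7) 0 25
```

The search covers a ±7 window with 25 evaluations instead of the 225 needed by a full search. On a real SAD surface with several minima, the same procedure can settle on a local minimum, which is the weakness noted above.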
Recently, some researchers demonstrated that variable block-size motion estimation (VBME) gives better coding efficiency than FBME [16], [17], [18]. Hence, H.263+ [4] and MPEG-4 [5] further suggest the 8×8 ME mode to improve the coding efficiency. The advanced video coding standard, H.264/AVC [15], extends the VBME to seven different block-sizes and multiple reference frames to further improve the overall performance. The VBME with multiple reference frames requires much heavier computation than the FBME.
In H.264/AVC, there are 16×16, 16×8, 8×16, and 8×8 block modes, where the 8×8 block mode can be further classified into 8×8, 8×4, 4×8 and 4×4 sub-block modes, as shown in Fig. 1. To search for the best block-partition with these seven possible block-sizes, applying fast FBME algorithms to each block-size is an intuitive solution to reduce the VBME computation. Thus, some modified FBME strategies [19], [20], [21] have been suggested for the H.264/AVC standard.
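The block modes listed above can be enumerated with a short sketch. The names below are hypothetical, and block sizes are assumed to be written as (width, height); the count of macroblock layouts assumes that each 8×8 quadrant chooses its sub-block mode independently.

```python
# Macroblock modes and 8x8 sub-block modes, as (width, height) tuples.
MB_MODES = [(16, 16), (16, 8), (8, 16), (8, 8)]
SUB_MODES = [(8, 8), (8, 4), (4, 8), (4, 4)]

def partition(width, height, block_w, block_h, x0=0, y0=0):
    """List the (x, y, w, h) rectangles that tile a width x height region."""
    return [(x0 + x, y0 + y, block_w, block_h)
            for y in range(0, height, block_h)
            for x in range(0, width, block_w)]

# The seven distinct block sizes, largest first.
block_sizes = sorted(set(MB_MODES) | set(SUB_MODES), reverse=True)

# Three layouts use a single mode for the whole macroblock; in the 8x8 mode
# each of the four quadrants picks one of the four sub-block modes.
n_layouts = (len(MB_MODES) - 1) + len(SUB_MODES) ** 4

print(len(block_sizes))                  # 7
print(len(partition(16, 16, 4, 4)))      # 16 motion vectors in the finest layout
print(n_layouts)                         # 259
```

Searching every block in every layout independently multiplies the cost of a single-size search, which motivates reusing results across block-sizes.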
In this paper, we first extend the DSS [11] to VBME by initiating multiple search-centers and adaptively applying search strategies with different search step-sizes for different block-sizes. To reduce the computation, an inter-mode decision is applied to select an initial block-size for ME. Merge and early termination strategies are further suggested to save computation substantially. The rest of the paper is organized as follows. In Section 2, we first briefly analyze the behaviors of motion vector predictions from spatial candidates and the possible motion vector inferences from different block-sizes for the H.264 encoder. Then, we extend the diversity-based ME algorithm to include different search initializations and search step-sizes adapted to various block-sizes to achieve efficient ME. In Section 3, we propose early termination of motion search, adaptation of initial block-sizes, merge-or-skip, and compose-and-refine strategies to further reduce the VBME computation. Finally, experimental results and conclusions are given in Sections 4 and 5, respectively.
Section snippets
Variable block-size motion estimation
To achieve the best ME for a given block-size, we should find the motion vector by minimizing the Lagrangian cost in (1), where the SAD between the current and the reference block is given by (2).
In (2), the two pixel terms denote the pixels in the current frame and the mth coded frame, respectively. In (1), the motion vector difference (MVD) is expressed as
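The expressions referred to as (1) and (2) are not reproduced in this excerpt. A commonly used form of the motion-search Lagrangian cost in H.264/AVC encoders is sketched below. All symbols are assumptions introduced for illustration and may differ from the paper's notation: $B$ is the set of pixel positions in the block, $s$ is the current frame, $c_m$ is the $m$th coded (reference) frame, $\mathbf{v} = (v_x, v_y)$ is the candidate motion vector, $\mathbf{p}$ is the predicted motion vector, $R(\cdot)$ is the number of bits needed to code its argument, and $\lambda$ is the Lagrange multiplier.

```latex
\begin{align}
J(\mathbf{v}, \lambda) &= \mathrm{SAD}(\mathbf{v}) + \lambda \, R(\mathrm{MVD}), \tag{1}\\
\mathrm{SAD}(\mathbf{v}) &= \sum_{(x,y) \in B} \bigl| s(x, y) - c_m(x + v_x,\; y + v_y) \bigr|, \tag{2}\\
\mathrm{MVD} &= \mathbf{v} - \mathbf{p}.
\end{align}
```

Under this form, a candidate with a slightly larger SAD can still be preferred when its motion vector lies closer to the prediction and therefore costs fewer bits.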
Computation reduction algorithms
For computation reduction, we further propose several effective algorithms in this section. We first develop the zero-block condition to stop unnecessary searches early. Moreover, we suggest an adaptive selection algorithm to determine an initial block-size for ME according to the motion vectors or prediction errors obtained from the initial block-type. With the merge-or-skip strategy, we can avoid searching all block-sizes. The detailed descriptions of computation reduction
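The general idea of stopping a search early can be sketched as follows. This is not the paper's zero-block condition, whose exact form is not given in this excerpt; it is a generic threshold test, and the function name, the candidate list, the cost values, and the threshold of 128 are all hypothetical.

```python
def search_with_early_stop(candidates, cost, threshold):
    """Evaluate candidates in order and stop once the best cost is small enough.

    `threshold` is a hypothetical tuning value: when the best matching error
    falls below it, the remaining candidates are skipped.
    Returns (best_mv, best_cost, points_checked).
    """
    best_mv, best_cost, checked = None, float("inf"), 0
    for mv in candidates:
        c = cost(mv)
        checked += 1
        if c < best_cost:
            best_mv, best_cost = mv, c
        if best_cost < threshold:
            break  # good enough: skip the remaining search-points
    return best_mv, best_cost, checked

# Made-up matching errors for four candidate motion vectors.
errors = {(0, 0): 900, (0, 1): 120, (1, 0): 80, (1, 1): 60}
order = [(0, 0), (0, 1), (1, 0), (1, 1)]

early = search_with_early_stop(order, errors.get, threshold=128)
full = search_with_early_stop(order, errors.get, threshold=0)
print(early)  # ((0, 1), 120, 2)
print(full)   # ((1, 1), 60, 4)
```

The example shows the trade-off: the early stop halves the number of evaluated points but accepts a matching error of 120 instead of the minimum of 60, so the threshold controls how much quality is given up for speed.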
Experimental results
To evaluate our proposed algorithm, we integrate it into the reference software JM v6.1d [24] of H.264/AVC. Some important parameters are set as follows: (1) sequence type is IPPP…; (2) search range is 33×33; (3) number of reference frames is 1; (4) Hadamard transform is used; (5) entropy coding method is CAVLC; (6) no rate-distortion optimization (RDO); and (7) no random intra macroblock refresh in P pictures. Video sequences such as Table Tennis, Foreman, Stefan, Mother and Daughter, and
Conclusions
In this paper, we first propose an extended diversity-based (EDSS) ME algorithm. It uses adaptive step-size and search-pattern for different block-sizes. By using motion vectors obtained from the current macroblock with different block-sizes, and from the adjacent blocks, the EDSS algorithm is very close to FS in terms of the rate-distortion performance. We then develop a threshold-based early stop strategy to reduce the unnecessary computations. Further, a merge-or-skip strategy, which can be
References (31)
- ISO/IEC, Information Technology-Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to About...
- ISO/IEC, Information Technology-Generic Coding of Moving Pictures and Associated Audio Information: Video,...
- Video codec for audiovisual services at p×64kbits, ITU-T Recommendation H.261, March...
- Video coding for low bitrate communication, ITU-T Recommendation H.263 Version 1, 1995. Version 2, September...
- ISO/IEC, Information Technology-Generic Coding of Audio-Visual Objects Part 2: Visual, ISO/IEC 14496-2 (MPEG-4 Video),...
- T. Koga, K. Iinuma, A. Iijima, T. Ishiguro, Motion-compensated interframe coding for video conferencing, in:...
- et al., A new three-step search algorithm for block motion estimation, IEEE Trans. Circuits Systems Video Technol. (1994)
- et al., A novel four-step search algorithm for fast block motion estimation, IEEE Trans. Circuits Systems Video Technol. (1996)
- et al., A new diamond search algorithm for fast block matching motion estimation, IEEE Trans. Image Process. (2000)
- et al., Hexagon-based search pattern for fast block motion estimation, IEEE Trans. Circuits Systems Video Technol. (2002)
- Diversity-based fast block motion estimation, Proceedings of IEEE International Conference on Multimedia and Expo 2003 (ICME’03)
- A new predictive search area approach for fast block motion estimation, IEEE Trans. Image Process.
- Fast motion vector estimation using multiresolution-spatio-temporal correlations, IEEE Trans. Circuits Systems Video Technol.
- An efficient search strategy for block motion estimation using image features, IEEE Trans. Image Process.
☆ This research was partially supported by the National Science Council under Contract NSC-92-2213-E006-023 and by the Opto-Electronics and Systems Laboratories, Industrial Technology Research Institute, under Contract 93S18-S3, Taiwan.