A dual quad-tree based variable block-size coding method

https://doi.org/10.1016/j.jvcir.2010.08.004

Abstract

Recent video coding standards with a hybrid structure adopt variable block-size processing techniques, including variable block-size motion estimation and compensation, variable block-size intra prediction, and variable block-size transform. This paper analyzes these variable block-size techniques through software simulations, with particular attention to the variable block-size transform. Based on this analysis, a generalized dual quad-tree based variable block-size coding (DQTC) structure is proposed. The structure is flexible and extensible: the prediction block-size set and the transform block-size set can be configured according to requirements and implementation complexity constraints. Simulation results show a considerable performance improvement for the proposed structure at low implementation complexity when the coding block-size sets and parameters are optimized.

Research highlights

► Segment different regions into blocks with a quad-tree structure for prediction.
► Further segment each inter predicted residue block into sub-blocks to be transformed.
► For intra prediction, the same block-sizes as the prediction are used for transform.
► The available block-size sets for prediction and transformation can be configured.
► Apply different quantization strategies for different coding block-sizes.

Introduction

A block-based hybrid coding structure is widely employed in recent coding standards. In this structure, pictures are divided into non-overlapping blocks, each block is predicted from reconstructed pixels of the current picture or of reference pictures, and the residues (prediction errors) are then transformed, scaled, quantized, and finally entropy coded. The block-size remains a key issue in this coding structure.

DCT transform coding has been widely accepted as the most efficient coding tool for residue coding [1]. A larger transform provides better energy compaction and better performance for smooth regions, especially at low bit-rates [2]. On the other hand, a larger transform produces more severe ringing artifacts around edges under coarse quantization, and it performs poorly in regions containing small objects [3]. The variable block-size transform combines the advantages of large and small transforms; selecting the transform size according to the signal properties is the key technique for improving coding efficiency.
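The energy compaction argument can be illustrated with a small numerical sketch. The example below is not taken from the paper: it assumes a 1-D orthonormal DCT-II, a hypothetical 8-sample ramp as the "smooth region", and an arbitrary budget of two retained coefficients. It compares one 8-point transform against two 4-point transforms over the same samples.

```python
import math

def dct_ii(x):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out

def energy_in_largest(coeffs, keep):
    """Fraction of total energy held by the `keep` largest coefficients."""
    energies = sorted((c * c for c in coeffs), reverse=True)
    total = sum(energies)
    return sum(energies[:keep]) / total if total else 1.0

# Hypothetical smooth signal: a linear ramp of 8 samples.
smooth = [10 + 2 * i for i in range(8)]

one_8 = dct_ii(smooth)                         # single 8-point transform
two_4 = dct_ii(smooth[:4]) + dct_ii(smooth[4:])  # two 4-point transforms

large = energy_in_largest(one_8, keep=2)
small = energy_in_largest(two_4, keep=2)
print(f"8-point: {large:.4f}  2 x 4-point: {small:.4f}")
```

For this ramp the single larger transform concentrates more of the energy in the same number of coefficients. A signal with a sharp edge inside the block would not necessarily behave this way, which is the motivation for selecting the transform size per region.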

For block-based inter picture prediction, the most well-known problem is that the borders of moving objects normally do not coincide with the borders of the encoding blocks. To obtain an appropriate MV (Motion Vector) for more pixels, a flexible block-tiling method with variable block-sizes is usually used, such as in H.264/AVC [4], especially for moving objects with complex shapes. A similar problem is also encountered in the intra picture prediction process.

Hardware complexity is an important issue. In multimedia communication, limited bandwidth, quality requirements, and power consumption are three major issues that should be considered. For mobile applications, video encoders using dedicated circuits are energy-efficient, and hardware encoders for high-end products have made SD/HD home consumer cameras possible. A high-performance video coding technique with low implementation complexity, especially in hardware, is always welcome. From the implementation complexity point of view, variable block-size coding can introduce many problems. Large transforms are more complex than smaller ones [5], and small-sized motion compensation complicates the dataflow and pipeline construction [6].

Variable block-size coding originated in the 1960s, when the general idea of using transforms of different block-sizes, each with its own quantization method, within a picture was presented [7]. The idea of variable block-size coding with a quad-tree segmentation structure was presented in the 1980s. In [8], a 4-level partition method based on a quad-tree was employed to isolate regions containing edges, using block-sizes of 32 × 32, 16 × 16, 8 × 8, and 4 × 4. Quad-tree based hybrid variable block-size coding was introduced in [9], [10], [11], including the quad-tree based variable block-size DCT transform and quad-tree based variable block-size motion compensation. In [11], the quantization step-sizes of the DCT coefficients differed by block-size and were set experimentally. In [12], the quad-tree structure was employed to segment high-detail regions, and the residue was coded using different methods: high-detail regions were segmented into small blocks of 4 × 4 and coded with high fidelity using vector quantization, while the others were coded with scalar quantization. A method for choosing the optimal block-tiling quad-tree in a rate-distortion sense was proposed in [13], showing that a variable block-size algorithm has significantly better rate and distortion behavior than a fixed block-size one. This work was based on a variable block-size motion compensation structure; however, the method and conclusion can be extended to any quad-tree based variable block-size coding structure. As a development of these ideas, a concept of variable block-size coding was presented in [2], [14]. This scheme was called adaptive block-size transform (ABT), indicating the adaptation of the transform block-size to the block-size used for motion compensation and intra prediction. For this adaptation, transforms of size 8 × 8, 8 × 4, 4 × 8, and 4 × 4 were employed along with prediction of the same sizes. Later, a new variable block-size coding scheme was presented [15], in which only transforms with block-sizes of 8 × 8 and 4 × 4 were used, and these could be dynamically selected for inter predicted residual blocks of sizes from 8 × 8 to 16 × 16.
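As a rough illustration of choosing a block-tiling quad-tree in a rate-distortion sense, the sketch below compares the cost of coding a block whole against the summed cost of its four quadrants. It is not the algorithm of [13]: the cost model (squared error to the block mean as distortion, a fixed 8 bits per leaf, a 1-bit split flag) and all function names are assumptions made only for this example.

```python
def block_cost(img, x, y, size, lam):
    """Toy cost J = D + lam * R for coding one block as a single leaf."""
    pixels = [img[y + j][x + i] for j in range(size) for i in range(size)]
    mean = sum(pixels) / len(pixels)
    distortion = sum((p - mean) ** 2 for p in pixels)
    rate = 8  # hypothetical number of bits per leaf
    return distortion + lam * rate

def best_partition(img, x, y, size, min_size, lam):
    """Return (cost, leaves) for the cheaper of 'keep whole' and 'split'."""
    whole = block_cost(img, x, y, size, lam)
    if size <= min_size:
        return whole, [(x, y, size)]
    half = size // 2
    split_cost = lam * 1  # hypothetical 1-bit split flag
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            c, sub = best_partition(img, x + dx, y + dy, half, min_size, lam)
            split_cost += c
            leaves += sub
    if split_cost < whole:
        return split_cost, leaves
    return whole, [(x, y, size)]

# Hypothetical 8 x 8 picture: flat, except for a vertical edge in the
# top-left 4 x 4 quadrant.
img = [[0] * 8 for _ in range(8)]
for row in range(4):
    img[row][2] = img[row][3] = 100

cost, leaves = best_partition(img, 0, 0, 8, min_size=2, lam=1.0)
for leaf in leaves:
    print(leaf)  # (x, y, size)
```

In this toy picture only the quadrant containing the edge is split down to the smallest size, while the flat quadrants remain large blocks.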

Variable block-size coding was shown to provide considerable gain but was not adopted by early standards, e.g., H.261 and MPEG-2, probably because of its implementation complexity. Balancing implementation complexity against coding performance, those legacy standards base motion estimation on a block-size of 16 × 16 and use a transform size of 8 × 8. As fidelity requirements increased and IC technology developed, some techniques for smaller coding block-sizes became acceptable. H.263 allows motion compensation with a smaller block-size, i.e., 8 × 8. This can be regarded as the first practical use of variable block-size coding in standards. To achieve better performance, recent standards such as H.264/AVC support more flexibility in the selection of intra prediction, motion compensation, and transform block-sizes. Details are given in Section 2.1.

The idea of variable block-size coding has developed along with the block-based coding structure. Many variable block-size coding schemes have been shown to provide large coding gains. Some interesting issues nevertheless remain open, such as how to make variable block-size coding work well with other coding tools and how to balance coding efficiency against implementation efficiency. In this paper, an analysis based on experimental results is presented to help understand the behavior of variable block-size coding. In addition, the requirements of video coding and the trade-off between coding performance improvement and hardware implementation complexity are considered. As a result, a variable block-size coding method with low complexity is proposed and verified in software.

This paper is organized as follows. In Section 2, variable block-size coding in H.264/AVC is introduced and investigated, and coding block-size selection is studied. Based on the test results and analysis in Section 2, a variable block-size coding method is proposed and discussed in Section 3. The performance of the proposed method is analyzed in Section 4. Finally, conclusions are given in Section 5.

Section snippets

Studies on the variable block-size coding based on H.264/AVC

H.264/AVC is a well-known state-of-the-art coding standard. In this section, we use this platform to show some properties of variable block-size coding. H.264/AVC specifies the use of smaller coding block-sizes in addition to the traditional ones. Variable block-size coding in H.264/AVC includes the three tools listed below:

  • Tool-1

    Variable block-size transform using 8 × 8 and 4 × 4 (also 8 × 4 and 4 × 8 in early H.264/AVC ABT [14], [16]);

  • Tool-2

    Variable block-size intra prediction using 16 × 16, 8 × 8 and 4 × 4

A dual quad-tree based coding structure

The quad-tree structure [8] has been shown to work well for block-matching based inter prediction because it is a flexible way of segmenting an arbitrarily shaped object.

Fig. 3 shows that 4 × 4 coded blocks generally do not cluster together. The quad-tree structure is also the best choice for specifying blocks with this kind of distribution [11].

Based on the analysis above, we propose a variable block-size coding structure that combines quad-tree based prediction with quad-tree based transform.
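A minimal sketch of how a second, transform-level quad-tree might sit underneath a prediction block is given below. It follows only what the highlights state: inter predicted residue blocks are further segmented into transform sub-blocks, and intra blocks reuse the prediction block-size. Everything else is an assumption for illustration, including the function names, the tiling of a rectangular prediction block into squares of the largest allowed transform size, and the `wants_split` callback that stands in for the encoder's mode decision.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height)

def tile(block: Block, size: int) -> List[Block]:
    """Cover `block` with non-overlapping size x size squares."""
    x, y, w, h = block
    return [(x + i, y + j, size, size)
            for j in range(0, h, size) for i in range(0, w, size)]

def transform_tree(block: Block, is_intra: bool, sizes: List[int],
                   wants_split: Callable[[Block], bool]) -> List[Block]:
    """Transform blocks for one prediction block (second quad-tree).

    `sizes` lists the allowed square transform sizes, largest first, each
    half the previous one. `wants_split` is a hypothetical stand-in for
    the encoder's decision to use smaller transforms in a sub-block.
    """
    if is_intra:
        return [block]  # transform size follows the prediction size
    leaves: List[Block] = []
    stack = tile(block, sizes[0])
    while stack:
        b = stack.pop()
        smaller = [s for s in sizes if s < b[2]]
        if smaller and wants_split(b):
            stack.extend(tile(b, smaller[0]))  # four quadrants
        else:
            leaves.append(b)
    return sorted(leaves)

# Hypothetical 16 x 8 inter block with transform sizes 8 and 4; only the
# right-hand 8 x 8 residue is assumed to need the smaller transform.
inter_leaves = transform_tree((0, 0, 16, 8), False, [8, 4],
                              wants_split=lambda b: b[0] == 8)
intra_leaves = transform_tree((0, 0, 8, 8), True, [8, 4],
                              wants_split=lambda b: True)
print(inter_leaves)
print(intra_leaves)
```

Keeping the two trees separate means the prediction and transform block-size sets can be changed independently, which matches the configurability described in the abstract.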

In this structure,

Experimental results

This section shows the coding performance of the low-complexity DQTC method specified in Section 3.2, in which A = {16 × 16, 16 × 8, 8 × 16, 8 × 8} and B = {8 × 8, 4 × 4}. Based on statistical data over several SD sequences, typical values of ΔQP are obtained: ΔQP = 8 for I pictures and ΔQP = 5 for the others. To simplify the encoding process, these typical values of ΔQP and the traditional choice λ = 0.85 · (QUANT)² are used for all sequences in the following tests.
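The role of the Lagrange multiplier can be sketched with the usual cost J = D + λ · R and the quoted choice λ = 0.85 · (QUANT)². The candidate names, distortion values, and bit counts below are invented for illustration, and the sketch does not model ΔQP, since this excerpt does not say how it is applied.

```python
def lagrange_multiplier(quant: float) -> float:
    """Traditional choice quoted in the text: lambda = 0.85 * QUANT^2."""
    return 0.85 * quant ** 2

def rd_cost(distortion: float, rate_bits: float, quant: float) -> float:
    """Lagrangian cost J = D + lambda * R."""
    return distortion + lagrange_multiplier(quant) * rate_bits

def pick_mode(candidates: dict, quant: float) -> str:
    """Return the candidate with the lowest Lagrangian cost.

    `candidates` maps a mode name to (distortion, rate_bits).
    """
    return min(candidates, key=lambda m: rd_cost(*candidates[m], quant))

# Hypothetical measurements for one residual block: the smaller transform
# gives lower distortion but costs more bits.
candidates = {"8x8": (1200.0, 40), "4x4": (900.0, 52)}

fine = pick_mode(candidates, quant=4)    # small lambda favours low distortion
coarse = pick_mode(candidates, quant=8)  # large lambda favours low rate
print(fine, coarse)
```

With these made-up numbers the decision flips from the 4 × 4 candidate to the 8 × 8 candidate as the quantizer becomes coarser, because λ grows with the square of QUANT and rate is penalized more heavily.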

The test platform is AVS reference software RM6.2 k [29], in

Conclusion

Experiments show that high-resolution sequences normally favor larger coding block-sizes while low-resolution sequences favor smaller ones, and that for sequences which benefit from a variety of coding block-sizes, the variable block-size transform is the most efficient tool. Analysis based on experimental results shows that the primary benefit of variable block-size coding can be obtained by segmenting regions of different categories into blocks of different sizes with a quad-tree structure, and

Acknowledgment

The authors thank Hisilicon Technologies Company Limited, Chinese University of Hong Kong, and Hong Kong Applied Science and Technology Research Institute Company Limited for crosschecking the results.

References (32)

  • N. Ahmed et al., Discrete cosine transform, IEEE Trans. Comput. (1974)
  • M. Wien, Variable block-size transforms for H.264/AVC, IEEE Trans. Circuits Syst. Video Technol. (2003)
  • X. Mao, Y. He, Image subjective quality with variable block size coding, in: Proceedings of International Video Coding...
  • ISO/IEC 14496-10, Coding of audiovisual objects-part 10: advanced video coding, also ITU-T Recommendation H.264...
  • D. An, X. Tong, B. Zhu, S. Li, Y. He, Comparative Study on implementation complexity of AVS mobile profile, in: AVS...
  • T.C. Chen, C.Jr. Lian, L.G. Chen, Hardware architecture design of an H.264/AVC video codec, in: Proceedings of the 2006...
  • J.W. Woods, T.S. Huang, Picture bandwidth compression by linear transformation and block quantization, in: Symposium...
  • D. Vaisey, A. Gersho, Variable block-size image coding, in: IEEE International Conference on ICASSP ’87, vol. 12, 1987,...
  • J. Guichard, G. Eude, Hybrid variable blocksize coding scheme based upon 3 DCTs and motion compensation techniques at...
  • A. Puri, H. Hang, D. Schilling, Interframe coding with variable block size motion compensation, in: Proceedings of...
  • CT. Chen, Adaptive transform coding via quadtree-based variable blocksize DCT, in: IEEE International Conference on...
  • D. Vaisey et al., Image compression with variable block size segmentation, IEEE Trans. Signal Process. (1992)
  • G.J. Sullivan, R.L. Baker, Rate-distortion optimized motion compensation for video compression using fixed or variable...
  • M. Wien, A. Dahlhoff, 16 bit adaptive block size transforms, Doc. JVT-C107, in: JVT Third Meeting, Fairfax, VA, May...
  • S. Gordon, D. Marpe, T. Wiegand, Simplified use of 8×8 transforms – updated proposal & results, in: Doc. JVT-K028. JVT...
  • M. Wien. JM (Joint Model of Joint Video Team of ISO/IEC and ITU-T) reference software 4.2, in: JVT-E025, JVT Fifth...
This work is supported by the Chinese 973 project 2009CB320903.