A fast algorithm based on gray level co-occurrence matrix and Gabor feature for HEVC screen content coding

https://doi.org/10.1016/j.jvcir.2021.103128

Highlights

  • Propose a fast algorithm based on GLCM and Gabor feature model for HEVC-SCC.

  • Predict the partitioning size of coding unit (CU) by the number of non-zero values in GLCM.

  • Classify CUs into different types by Gabor filter.

  • Reduce candidate prediction modes for diverse types of CUs.

  • Experimental results justify the efficiency of the proposed method.

Abstract

To reduce the computational complexity of screen content video coding (SCC), this paper proposes a fast algorithm for HEVC-SCC based on the gray level co-occurrence matrix and a Gabor feature model, denoted as GGM. By studying the correlation between the number of non-zero values in the gray level co-occurrence matrix and the partitioning depth, the coding unit (CU) size for intra coding can be prejudged, so that the intra prediction process of CUs at other depths is selectively skipped. With a Gabor filter, the edge information that reflects how the human visual system (HVS) perceives screen content images is extracted. According to the Gabor feature, CUs are classified into natural content CUs (NCCUs), smooth screen content CUs (SSCUs) and complex screen content CUs (CSCUs), and the calculation and evaluation of unnecessary intra prediction modes are skipped for each type. Under the all-intra (AI) configuration, experimental results show that GGM saves 42.13% of the encoding time compared with SCM-8.3, with only a 1.85% bit-rate increase.

Introduction

With emerging applications such as cloud gaming, remote desktop interfacing and slideshow sharing, screen content video (SCV) has received more and more attention as a special video type. Unlike conventional camera-captured video (CCV), SCVs are mainly generated by computers and contain text, computer graphics, graphical user interfaces, or a mixture of camera-captured and computer-generated content. Screen content video is characterized by large flat areas, little capture noise, repetitive patterns and characters, a limited number of colors, high image contrast and sharp edges. Research on SCV has become a hot topic in academia and industry [1], [2].

For screen content video coding, the ISO/IEC Moving Picture Experts Group and the ITU-T Video Coding Experts Group, jointly referred to as the Joint Collaborative Team on Video Coding (JCT-VC), have been standardizing the screen content coding (SCC) extension [3] since January 2014 on top of the latest video standard, High Efficiency Video Coding (HEVC). Compared with HEVC, HEVC-SCC has four additional coding tools: Intra Block Copy (IBC) [4], Palette Mode (PLT) [5], Adaptive Color Transform (ACT) [6], [7], and Adaptive Motion Vector Resolution (AMVR) [8]. These tools enable a 55% bit-rate reduction for SCV while maintaining the subjective perception quality [9]. However, this coding performance gain comes at the expense of computational complexity: the more coding tools there are, the more calculations and comparisons are needed for CU size judgment and prediction mode decision.

To reduce the coding complexity of HEVC-SCC while maintaining a certain video quality, several solutions [10], [11], [12], [13], [14], [15], [16], [17], [18], [19] have been proposed. A hierarchical hash scheme and a corresponding block matching algorithm [10] were proposed to reduce the complexity of block matching in inter coding, achieving time savings of 12% and 16% for the random access and low-delay coding structures, respectively. Lee et al. [11] introduced a fast transform skip mode decision method to reduce the complexity of residual coding, but only 28.2% of the coding time was saved during the transformation process. Regarding fast intra coding for HEVC-SCC, Tsang et al. [12] proposed an efficient intra prediction algorithm specifically for smooth regions. For CUs with simple textures, early termination can be achieved by skipping IBC and PLT mode checking. To handle ordinary CUs, a fast prediction algorithm [13] was further proposed that skips the unnecessary checking of IBC mode based on CU activity and gradient. With this algorithm, IBC mode checking can be skipped, while PLT still needs to be checked together with the intra modes. Zhang et al. [14] proposed an adaptive search scheme, which simplified the block matching search process for IBC mode. Although it saves encoding time in IBC mode for screen content, it is not suitable for scenes where natural content and screen content are mixed. Therefore, the characteristics of screen content video should be taken into account. Since a screen contains dynamic and stationary regions, or natural and computer-generated content, the algorithm in [15] divided incoming CUs into stationary and dynamic CUs by comparing their pixel values with those of collocated CUs. A fast mode decision algorithm [16] was based on corner point detection and distinct color number extraction. In [17], a random forest was used as the classification tool to reduce the number of mode candidates. Both [16] and [17] classify video regions into natural content and screen content, and then adopt a different mode strategy for each of the two types. If the texture feature of a CU is analyzed before the mode decision, the number of candidate prediction modes for the corresponding feature can be narrowed down, further reducing the computational complexity.

The methods above mostly address fast mode prediction, but the exhaustive CU size determination is also time consuming. To determine the CU size early, Zhang et al. [18] proposed heuristic rules based on entropy to predict the CU size decision, and then eliminated incorrect decisions by utilizing the coding bits after checking all mode candidates. With machine learning, conventional neural network-based classifiers were trained in [19] to make fast CU size decisions by utilizing features that describe CU statistics and sub-CU homogeneity. However, the rate distortion (RD) performance loss is relatively high.

Therefore, this paper considers the pre-decision of both CU size and intra prediction mode. By taking advantage of the large flat areas and sharp edges of screen content, a fast algorithm based on the gray level co-occurrence matrix (GLCM) and Gabor feature for HEVC screen content coding, named GGM, is proposed. For each CU, a GLCM is calculated, and the number of non-zero values in the GLCM, which indicates the texture complexity of the current CU, is used to predict the CU partitioning size and reduce the complexity of the CU depth search. Since the Gabor feature reflects the perception of the human visual system (HVS) on screen content images well [20], the Gabor feature of the CU at the current depth is extracted, and the CU is classified as a natural content CU (NCCU), a smooth screen content CU (SSCU), or a complex screen content CU (CSCU). For each CU type, the number of candidate prediction modes can be reduced, which saves encoding time.
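The two CU features named in this paragraph can be illustrated with a minimal NumPy sketch. It is not the configuration used in GGM: the number of quantized gray levels, the pixel offset, the kernel size, sigma, the wavelength, and the function names `glcm_nonzero_count`, `odd_gabor_kernel` and `gabor_edge_energy` are all assumptions chosen for illustration, and no classification thresholds are implied.

```python
import numpy as np


def glcm_nonzero_count(block, levels=8, offset=(0, 1)):
    """Count the non-zero entries of a gray level co-occurrence matrix.

    block  : 2-D array of 8-bit luma samples (one CU).
    levels : number of quantized gray levels (assumed value).
    offset : (dy, dx) displacement between paired pixels; non-negative
             components only, to keep the sketch short (assumed value).
    """
    block = np.asarray(block, dtype=np.int64)
    q = block * levels // 256              # map 0..255 to 0..levels-1
    dy, dx = offset
    h, w = q.shape
    ref = q[:h - dy, :w - dx]              # reference pixels
    nbr = q[dy:, dx:]                      # neighbours at the given offset
    glcm = np.zeros((levels, levels), dtype=np.int64)
    np.add.at(glcm, (ref.ravel(), nbr.ravel()), 1)   # accumulate pair counts
    return int(np.count_nonzero(glcm))


def odd_gabor_kernel(size=5, sigma=1.5, wavelength=4.0):
    """1-D imaginary (odd-symmetric) Gabor part: Gaussian envelope times sine."""
    half = size // 2
    x = np.arange(-half, half + 1, dtype=np.float64)
    return np.exp(-x ** 2 / (2.0 * sigma ** 2)) * np.sin(2.0 * np.pi * x / wavelength)


def gabor_edge_energy(block, kernel=None):
    """Mean absolute odd-Gabor response along rows plus along columns."""
    if kernel is None:
        kernel = odd_gabor_kernel()
    block = np.asarray(block, dtype=np.float64)
    # Filter every row (horizontal direction) and every column (vertical direction).
    horiz = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, block)
    vert = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, block)
    return float(np.abs(horiz).mean() + np.abs(vert).mean())


if __name__ == "__main__":
    flat = np.full((16, 16), 128)          # smooth CU
    edge = np.zeros((16, 16))
    edge[:, 8:] = 255                      # CU with one sharp vertical edge
    print(glcm_nonzero_count(flat), gabor_edge_energy(flat))
    print(glcm_nonzero_count(edge), gabor_edge_energy(edge))
```

A flat block produces a single non-zero GLCM entry and an odd-Gabor response of essentially zero, while a block with a sharp edge produces more non-zero entries and a clearly positive response. This is the kind of separation between smooth and textured CUs that the paragraph relies on.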

The remainder of this paper is organized as follows. Section 2 studies the correlation between the number of non-zero values in the GLCM and the CU numbers at different partitioning depths. Section 3 presents the mode pre-decision, which extracts the Gabor feature to classify CUs into different types and skips unnecessary mode checking for each type. Section 4 describes the proposed fast intra coding algorithm for HEVC-SCC based on the GLCM and Gabor feature. Experimental results and a comparison with state-of-the-art methods are given in Section 5. Section 6 draws the conclusions.

Section snippets

Traditional CU partitioning process

In the HEVC standard, each video frame is divided into coding tree units (CTUs) with the size of L × L, where L can be chosen as 64, 32, 16 or 8. CTUs in a frame are encoded in raster scan order. An example of a CTU partition and its corresponding partitioning structure is shown in Fig. 1. As Fig. 1(a) indicates, each CTU can be split into four smaller CUs of equal size, named sub-CUs or child sub-CUs. Each sub-CU can be further split into four smaller CUs of equal size, and the CU splitting process

The proposed mode pre-decision

In SCC intra coding, the optimal intra prediction mode is selected from 37 candidate intra modes, which include 35 traditional modes and 2 SCC modes (IBC and PLT). For each CU, the best mode is selected by comparing the RD costs of all prediction modes, so this exhaustive traversal is time consuming and highly complex.
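The exhaustive selection described above can be sketched as a simple minimum search over the candidate list. This is an illustrative sketch, not the SCM-8.3 implementation: `select_best_mode`, the mode labels in `ALL_MODES`, and the made-up costs in `toy_rd_cost` are hypothetical, and a real encoder would obtain each RD cost by actually predicting and coding the CU.

```python
def select_best_mode(candidates, rd_cost):
    """Return (best_mode, best_cost, evaluations) by exhaustive comparison.

    candidates : iterable of mode labels to test.
    rd_cost    : callable mapping a mode label to its RD cost.
    """
    best_mode, best_cost = None, float("inf")
    evaluations = 0
    for mode in candidates:
        cost = rd_cost(mode)               # the expensive step in a real encoder
        evaluations += 1
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost, evaluations


# 35 traditional intra modes plus the two SCC modes (labels are hypothetical).
ALL_MODES = [f"INTRA_{i}" for i in range(35)] + ["IBC", "PLT"]


def toy_rd_cost(mode):
    """Made-up costs for illustration only."""
    return {"IBC": 10.0, "PLT": 12.0}.get(mode, 20.0)


if __name__ == "__main__":
    print(select_best_mode(ALL_MODES, toy_rd_cost))        # all 37 modes tested
    print(select_best_mode(["IBC", "PLT"], toy_rd_cost))   # reduced candidate list
```

The work grows with the length of the candidate list, so shortening that list for a given CU type directly reduces the number of RD cost evaluations.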

Due to the introduction of the IBC and PLT modes, the intra prediction complexity of HEVC-SCC is higher than that of the HEVC standard. The encoding

Fast algorithm based on GLCM and Gabor feature model for HEVC-SCC (GGM)

Based on the analysis above, a fast algorithm based on the gray level co-occurrence matrix (GLCM) and Gabor feature for HEVC screen content coding, named GGM, is proposed. With the GLCM, the CTU partitioning depth and CU size can be prejudged, and the unnecessary processing of other partitioning depths is adaptively skipped. Then, a specially designed Gabor filtering (i.e., the imaginary part with odd symmetry) is conducted in the horizontal and vertical directions to extract the CU feature, which

Experimental results and analysis

To evaluate the performance of the proposed fast mode and partitioning depth decision framework, the coding efficiency and computational complexity of GGM were compared with those of the original SCM-8.3 [22]. Standard sequences and corresponding video types are listed in . Experiments were conducted under the all-intra configuration with four QPs (22, 27, 32, and 37) [29]. Experimental results for the proposed GGM and three state-of-the-art HEVC-SCC fast intra prediction algorithms are shown in

Conclusion

In this paper, a fast intra coding algorithm based on the gray level co-occurrence matrix and Gabor feature (GGM) for HEVC-SCC is proposed. By studying the correlation between the number of non-zero values in the GLCM and the CTU numbers at different partitioning depths, the CU size for intra coding can be prejudged, which means the intra prediction process of CUs at other depths can be selectively skipped. Then, the Gabor feature is utilized to classify CUs into different types, and different processing strategies are proposed

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was partially supported by National Natural Science Foundation of China (Grant Nos. 61871434 and 61802136), Natural Science Foundation for Outstanding Young Scholars of Fujian Province (Grant No. 2019J06017), Fujian-100 Talented People Program, Key Science and Technology Project of Xiamen City (Grant No. 3502ZCQ20191005), High-Level Talent Project Foundation of Huaqiao University (Grant Nos. 16BS709, and 14BS204).

References (33)

  • R.T. Maloney et al.

    Orientation anisotropies in human primary visual cortex depend on contrast

    NeuroImage

    (2015)
  • H. Yang et al.

    Perceptual Quality Assessment of Screen Content Images

    IEEE Trans. Image Process.

    (2015)
  • K. Gu et al.

    No-Reference Quality Assessment of Screen Content Pictures

    IEEE Trans. Image Process.

    (2017)
  • J. Xu et al.

    Overview of the emerging HEVC screen content coding extension

    IEEE Trans. Circ. Syst. Video Technol.

    (2016)
  • C. Pang et al.

    Non-RCE3: Intra motion compensation with 2-D MVs

    (2013)
  • W. Pu et al.

    Palette mode coding in HEVC screen content coding extension

    IEEE J. Emerg. Sel. Top. Circ. Syst.

    (2016)
  • L. Zhang et al.

    Adaptive Color-Space transform in HEVC screen content coding

    IEEE J. Emerging Sel. Top. Circuits Syst.

    (2016)
  • D. Marpe et al.

    Macroblock-Adaptive residual color space transforms for 4: 4: 4 video coding

  • B. Li et al.

    Adaptive motion vector resolution for screen content

    (2014)
  • Q. Jiang et al.

    Optimizing multistage discriminative dictionaries for blind image quality assessment

    IEEE Trans. Multimedia

    (2018)
  • W. Xiao et al.

    Fast Hash-based Inter Block Matching for Screen Content Coding

    IEEE Trans. Circuits Syst. Video Technol.

    (2016)
  • D. Lee et al.

    Fast Transform Skip Mode Decision for HEVC Screen Content Coding

  • S. Tsang et al.

    Fast and efficient intra coding techniques for smooth region in screen content coding based on boundary prediction samples

  • S. Tsang, W. Kuang, Y. Chan, W. Sui, Fast HEVC screen content coding by skipping unnecessary checking of intra block...
  • H. Zhang et al.

    Fast intra mode decision and block matching for HEVC screen content compression

  • W. Kuang et al.

    Efficient mode decision for HEVC screen content coding by content analysis

  • This paper has been recommended for acceptance by Zicheng Liu.
