Circular intra prediction for 360 degree video coding☆
Introduction
In recent years, with the development of multimedia technology, Virtual Reality (VR) and Augmented Reality (AR) have become increasingly popular in both academia and industry. Because they provide an immersive experience to consumers, they have been applied in various fields, such as distance education, movie entertainment, and medical treatment. 360 degree video is one of the important formats of AR and VR. In addition to a wider field of view, 360 degree video applications demand higher resolution and frame rate, better visual quality, and more bandwidth. Compared with traditional 2D video, however, efficiently storing and transmitting the extra multimedia data from different dimensions and channels is a challenging task. In October 2017, the Call for Proposals (CfP) on a next-generation video coding standard with capability beyond High Efficiency Video Coding (HEVC) [1] was issued, in which the contents are divided into three categories [2], [3]: (1) Standard Dynamic Range (SDR), (2) High Dynamic Range (HDR) and Wide Color Gamut (WCG) [4], and (3) 360 degree video. In April 2018, the next-generation standard was formally named Versatile Video Coding (VVC) [5] by the Joint Video Experts Team (JVET). As one of the important parts of VVC, 360 degree video is a trend in future multimedia development and has achieved commercial success on Head Mounted Display (HMD) devices, such as the HTC Vive. In response to the CfP, numerous proposals [6] were submitted in the 360 degree video category.
Different from traditional 2D video, 360 degree video carries more multimedia information from various dimensions and channels and is represented on a 3D sphere. To adapt it to the current block-based video coding standards, a projection from the 3D sphere to a 2D plane is required before encoding. At the decoder side, the inverse projection from the 2D plane to the 3D sphere is performed to reconstruct the 360 degree video. Among these proposals, a variety of projection formats [7] have been accepted by JVET, such as Equi-Rectangular Projection (ERP), Cube Map Projection (CMP), Compact OctaHedron Projection (COHP), and Compact IcoSahedron Projection (CISP). Most 360 degree video sequences are represented in the ERP format. Every projection format has its advantages and disadvantages. In the projected 2D plane, projection distortion inevitably occurs due to pixel sampling, which may affect the coding performance of 360 degree video.
For these projection formats, JVET has developed the 360Lib reference software [8], which is used for projection format conversion and spherical quality assessment. It can not only convert between projection formats, such as from ERP to CMP, but can also be incorporated into the HEVC Test Model (HM) and the VVC Test Model (VTM) for 360 degree video coding. After encoding, the reconstructed video is evaluated by the spherical quality metrics implemented in 360Lib, i.e., end-to-end Weighted to Spherically uniform Peak Signal to Noise Ratio (WS-PSNR) [8], Spherical PSNR (S-PSNR) [8], and Craster Parabolic Projection PSNR (CPP-PSNR) [8]. However, the codec engine on the HM + 360Lib or VTM + 360Lib platform is the same as that of the original HM or VTM. To further improve the coding performance, the specific spherical characteristics should be exploited to develop advanced coding tools for 360 degree video coding.
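As a rough illustration of what a spherical metric such as WS-PSNR does for ERP content, the sketch below weights each row's squared error by the cosine of its latitude before computing PSNR. It is a minimal sketch under assumptions, not the 360Lib implementation: the weight formula is the commonly used ERP definition, frames are plain lists of rows, and the names `erp_row_weights` and `ws_psnr` are hypothetical.

```python
import math

def erp_row_weights(height):
    """Assumed ERP weights: cosine of the latitude at the centre of each row."""
    return [math.cos((j + 0.5 - height / 2.0) * math.pi / height)
            for j in range(height)]

def ws_psnr(ref, rec, max_value=255.0):
    """Weighted PSNR between two single-channel ERP frames (lists of rows)."""
    weights = erp_row_weights(len(ref))
    weighted_error = 0.0
    weight_sum = 0.0
    for w, ref_row, rec_row in zip(weights, ref, rec):
        for a, b in zip(ref_row, rec_row):
            weighted_error += w * (a - b) ** 2
            weight_sum += w
    wmse = weighted_error / weight_sum
    if wmse == 0.0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / wmse)
```

With these weights, an error near the top or bottom edge of the ERP frame lowers the score far less than the same error at the equator, which reflects the oversampling of the polar regions.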
The main contributions of this work can be summarized as follows.
- (1) A novel intra prediction method with the specific circular pattern, i.e., Circular Intra Prediction (CIP), is developed to further remove the spatial redundancy, in which the spherical characteristics of 360 degree video are taken into account.
- (2) The radian of circular pattern in CIP is determined according to the degree of projection deformation, i.e., smaller radian of circular pattern indicates stronger projection deformation and vice versa.
- (3) To achieve better coding performance, one additional binary flag is utilized for the performance competition between the traditional AIP and proposed CIP.
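The flag-based competition in contribution (3) can be pictured as a per-block cost comparison. The sketch below assumes the usual Lagrangian rate-distortion cost J = D + λR used by block-based encoders; the source does not state the exact criterion, and the function name, parameters, and the one-bit flag cost are illustrative assumptions.

```python
def choose_intra_tool(dist_aip, bits_aip, dist_cip, bits_cip, lam, flag_bits=1.0):
    """Return (flag, cost): flag is 1 if CIP wins the competition, else 0.

    dist_* are distortions and bits_* are mode/residual bits for each tool.
    The flag itself is assumed to cost `flag_bits` whichever tool is chosen.
    """
    cost_aip = dist_aip + lam * (bits_aip + flag_bits)
    cost_cip = dist_cip + lam * (bits_cip + flag_bits)
    if cost_cip < cost_aip:
        return 1, cost_cip
    return 0, cost_aip
```

For example, `choose_intra_tool(100, 10, 80, 12, 5.0)` selects CIP because its lower distortion outweighs the two extra bits, while a larger λ of 20 makes the cheaper AIP win.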
The remainder of this paper is organized as follows. Section 2 introduces related work. The motivation is described in Section 3. Section 4 presents the proposed circular intra prediction for 360 degree video coding. The experimental results and analyses are discussed in Section 5. Section 6 concludes this paper.
Section snippets
Projection format conversion for 360 degree video
As mentioned before, there are several projection formats from the 3D sphere to the 2D plane, such as ERP, CMP, COHP, and CISP. Here, we take the ERP format as an example; details of the other projection formats can be found in [7], [8]. To efficiently represent the 3D sphere, the XYZ coordinate system is used in Fig. 1, where [-π, π] indicates the range of the longitude, [-π/2, π/2] indicates the range of the latitude, and π is the ratio of a circle’s circumference to its diameter. The point () in XYZ coordinate system can
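To make the ERP geometry concrete, the sketch below maps an ERP sample position to a longitude in [-π, π], a latitude in [-π/2, π/2], and a point on the unit sphere. The half-sample offset and the axis orientation (Y pointing to the north pole) follow a convention commonly used for ERP and are assumptions here, since Fig. 1 and the paper's equations are not reproduced in this excerpt; the function name is hypothetical.

```python
import math

def erp_pixel_to_sphere(m, n, width, height):
    """Map ERP sample (column m, row n) to (longitude, latitude, (x, y, z))."""
    u = (m + 0.5) / width    # normalised horizontal position in (0, 1)
    v = (n + 0.5) / height   # normalised vertical position in (0, 1)
    lon = (u - 0.5) * 2.0 * math.pi   # longitude in (-pi, pi)
    lat = (0.5 - v) * math.pi         # latitude in (-pi/2, pi/2), top row is north
    # Assumed axis convention: Y is the polar axis of the unit sphere.
    x = math.cos(lat) * math.cos(lon)
    y = math.sin(lat)
    z = -math.cos(lat) * math.sin(lon)
    return lon, lat, (x, y, z)
```

Every row of the ERP frame shares one latitude, so all samples in a row map to the same circle on the sphere; rows near the top and bottom map to very small circles.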
Motivations and problem formulation
The 360 degree videos in ERP format are presented in Fig. 3. Because of the projection from the 3D sphere to the 2D plane, the pixels at high latitudes are heavily oversampled. As the latitude increases, the deformation becomes more and more severe. With Eq. (3), we can explain the oversampling and deformation with a mathematical model. If , then the values of are always fixed, i.e., . It indicates that the pixels at the top edge of the ERP format are all equal to the value of pole
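The growth of the oversampling with latitude can be estimated with a short calculation. On a unit sphere, the circle at a given latitude has circumference 2π·cos(latitude), yet ERP assigns the same number of samples to every row, so a row is stretched horizontally by about 1/cos(latitude) relative to the equator. The sketch below assumes the row-centre latitude convention used for ERP; the function name is hypothetical and the result is a geometric estimate, not a quantity taken from the paper.

```python
import math

def erp_row_stretch(n, height):
    """Approximate horizontal oversampling factor of ERP row n (1.0 at the equator)."""
    lat = (0.5 - (n + 0.5) / height) * math.pi  # latitude of the row centre
    return 1.0 / math.cos(lat)
```

The factor is symmetric about the equator and grows without bound towards the poles, which matches the observation that content near the top and bottom edges of an ERP frame is the most deformed.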
Circular intra prediction
As we know, the limitation of traditional AIP is that its predictions are all produced from the reference pixels along a linear pattern. However, in 360 degree video, the contents are deformed by the projection from the 3D sphere to the 2D plane, so they do not always follow straight lines but rather curves, circles, or ovals. As a result, non-line-based prediction should be introduced to adapt to the difference between 360 degree video and traditional video for better
Experimental results and analysis
In this section, the experiments are conducted on the platform of VTM 5.0 [30] + 360Lib 9.1 [31], in which the proposed CIP is implemented in both the video encoder and decoder. A workstation equipped with an Intel Core i9-7900X CPU, 32 GB of memory, and the Windows 10 Enterprise 64-bit operating system is used in our experiments. The original platform of VTM 5.0 + 360Lib 9.1 is employed as the anchor for the coding performance evaluation.
On the platform of VTM 5.0 + 360Lib 9.1, the largest Coding Unit
Conclusions
In this paper, a novel intra prediction algorithm is presented for 360 degree video coding. Different from the traditional AIP, a specific circular pattern is applied in the proposed CIP, which is utilized to adapt to the projection deformation from 3D sphere to 2D plane. In the CIP, one mode is produced with several concentric circles that cover the to-be-predicted block, and the target pixels are inferred from the left or the above reference pixels according to the circular pattern. Different
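The idea of predicting along concentric circles can be illustrated with a toy example. The sketch below is not the paper's algorithm: the reference layout, the choice of the nearest circle/reference intersection, the linear interpolation, and the fallback to the mean of the references are all assumptions made for illustration, and the names `circular_predict` and `_sample` are hypothetical. Each target pixel lies on a circle around a centre (cx, cy); its value is copied from the point where that circle crosses the above reference row or the left reference column.

```python
import math

def _sample(ref, pos):
    """Linearly interpolate the 1-D reference list `ref` at fractional index `pos`."""
    i = min(int(math.floor(pos)), len(ref) - 2)
    frac = pos - i
    return (1.0 - frac) * ref[i] + frac * ref[i + 1]

def circular_predict(above, left, size, cx, cy):
    """Toy circular prediction of a size x size block.

    above[k] is the reconstructed sample at (x=k, y=-1); left[k] is the one
    at (x=-1, y=k). (cx, cy) is the assumed circle centre in block coordinates.
    """
    dc = (sum(above) + sum(left)) / (len(above) + len(left))
    pred = [[dc] * size for _ in range(size)]  # fallback when no circle hits a reference
    for y in range(size):
        for x in range(size):
            r2 = (x - cx) ** 2 + (y - cy) ** 2
            candidates = []  # (distance to target pixel, reference value)
            d2 = r2 - (-1 - cy) ** 2  # intersections with the above row y = -1
            if d2 >= 0:
                for xr in (cx - math.sqrt(d2), cx + math.sqrt(d2)):
                    if 0 <= xr <= len(above) - 1:
                        candidates.append((math.hypot(x - xr, y + 1), _sample(above, xr)))
            d2 = r2 - (-1 - cx) ** 2  # intersections with the left column x = -1
            if d2 >= 0:
                for yr in (cy - math.sqrt(d2), cy + math.sqrt(d2)):
                    if 0 <= yr <= len(left) - 1:
                        candidates.append((math.hypot(x + 1, y - yr), _sample(left, yr)))
            if candidates:
                pred[y][x] = min(candidates)[1]  # use the nearest intersection
    return pred
```

Moving the centre (cx, cy) changes the curvature of the arcs through the block, which is one way to picture how different circle centres could yield different prediction modes.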
Declaration of Competing Interest
The authors declared that there is no conflict of interest.
Acknowledgement
This work was supported in part by Shenzhen Science and Technology Program under Grant JCYJ20180507183823045, in part by the Natural Science Foundation of China under Grant 61901459 and Grant 61902389, in part by China Postdoctoral Science Foundation under Grant 2019M653127, in part by Guangdong International Science and Technology Cooperative Research Project under Grant 2018A050506063, in part by Free Application Fund of Natural Science Foundation of Guangdong Province under Grant
References (33)
- et al., Overview of the high efficiency video coding (HEVC) standard, IEEE Trans. Circuits Syst. Video Technol. (Dec. 2012)
- et al., A unified video codec for SDR, HDR, and 360 video applications, IEEE Trans. Circuits Syst. Video Technol. (May 2020)
- et al., General video coding technology in responses to the joint call for proposals on video compression with capability beyond HEVC, IEEE Trans. Circuits Syst. Video Technol. (May 2020)
- et al., High dynamic range and wide color gamut video coding in HEVC: status and potential future enhancements, IEEE Trans. Circuits Syst. Video Technol. (Jan. 2016)
- B. Bross, Versatile video coding (Draft 2), Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC...
- et al., Omnidirectional 360 video coding technology in responses to the joint call for proposals on video compression with capability beyond HEVC, IEEE Trans. Circuits Syst. Video Technol. (May 2020)
- et al., Efficient projection and coding tools for 360 video, IEEE J. Emerg. Sel. Top. Circ. Syst. (Mar. 2019)
- Y. Ye, J. Boyce, Algorithm description of projection format conversion and video quality metrics in 360Lib version 9,...
- F. Duanmu, Y. He, X. Xiu, P. Hanhart, Y. Ye, Y. Wang, Hybrid cubemap projection format for 360 degree video coding,...
- et al., 360 video coding based on projection format adaptation and spherical neighboring relationship, IEEE J. Emerg. Sel. Top. Circ. Syst. (Mar. 2019)
- Advanced spherical motion model and local padding for 360 video compression, IEEE Trans. Image Process.
- Lambda domain perceptual rate control for 360 degree video compression, IEEE J. Sel. Top. Signal Process.
- Video coding optimization for virtual reality 360 degree source, IEEE J. Sel. Top. Signal Process.
- Spherical domain rate distortion optimization for omnidirectional video coding, IEEE Trans. Circuits Syst. Video Technol.
Cited by (2)
Teaching mechanism empowered by virtual simulation: Edge computing–driven approach
2023, Digital Communications and Networks
Citation Excerpt: In Ref. [31], the authors discussed industrial needs and proposed a video compression technology beyond HEVC for 360° video coding. In Ref. [32], a novel circular intraprediction was proposed to improve the coding performance of 360° video. It was performed in a circular pattern, where the center of the circle was located around the to-be-predicted block and different centers of circles could produce different intraprediction modes.
Texture-Aware Spherical Rotation for High Efficiency Omnidirectional Intra Video Coding
2022, IEEE Transactions on Circuits and Systems for Video Technology
☆ This paper has been recommended for acceptance by Zicheng Liu.