Abstract
Light field imaging simultaneously records the position and direction information of light in a scene and is one of the important techniques for digital media. Because the amount of light field image (LFI) data is huge, it needs to be compressed effectively. In this paper, a perceptual LFI coding method with a coding tree unit (CTU) level bit allocation strategy is proposed. To remove angular redundancy, a hybrid coding framework with joint deep learning reconstruction networks is constructed. At the encoder side, only the four corner sub-aperture images (SAIs) are compressed, using the new CTU level bit allocation; at the decoder side, the complete SAI array is reconstructed by an LFI angular super-resolution network. To remove perceptual redundancy, we design a CTU level bit allocation strategy under the assumption of perceptual consistency, which takes the characteristics of the human visual system into account during bit allocation. Experimental results show that the designed CTU level bit allocation strategy gives the proposed method average BD-BR savings of 13.676% in the Y-PPSNR metric and 2.045% in the VSI metric. Compared with the high efficiency video coding (HEVC) intra coding model, the proposed method achieves average BD-BR savings of over 90%.
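The abstract does not give the allocation rule itself, so the following is only a minimal sketch of what a CTU level bit allocation can look like, assuming a simple proportional rule: each CTU receives a share of the frame budget proportional to a perceptual weight. The function name `allocate_ctu_bits`, the `min_bits` floor, and the weights themselves (for example a mean saliency value per CTU) are hypothetical and are not taken from the paper.

```python
from typing import List


def allocate_ctu_bits(frame_bits: int, weights: List[float], min_bits: int = 0) -> List[int]:
    """Split a frame-level bit budget across CTUs in proportion to their weights.

    Assumption: `weights` are non-negative perceptual weights, one per CTU.
    This proportional rule is illustrative, not the paper's actual strategy.
    """
    if not weights or any(w < 0 for w in weights):
        raise ValueError("weights must be a non-empty list of non-negative values")
    n = len(weights)
    spare = frame_bits - min_bits * n  # bits left after every CTU gets its floor
    if spare < 0:
        raise ValueError("frame budget is smaller than the per-CTU minimum")

    total = sum(weights)
    if total == 0:
        shares = [spare / n] * n  # no perceptual preference: split evenly
    else:
        shares = [spare * w / total for w in weights]

    # Round down, then hand the few leftover bits to the CTUs with the
    # largest fractional parts so the total matches the frame budget.
    bits = [min_bits + int(s) for s in shares]
    leftover = frame_bits - sum(bits)
    order = sorted(range(n), key=lambda i: shares[i] - int(shares[i]), reverse=True)
    for i in order[:leftover]:
        bits[i] += 1
    return bits


if __name__ == "__main__":
    # Three CTUs; the third is assumed to be twice as perceptually important.
    print(allocate_ctu_bits(1000, [1.0, 1.0, 2.0]))
```

In this sketch a CTU with a larger weight is coded with more bits, while the per-frame total stays fixed, which is the general idea behind shifting bits toward perceptually important regions.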
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jin, P., Jiang, G., Chen, Y., Jiang, Z., Yu, M. (2023). Perceptual Light Field Image Coding with CTU Level Bit Allocation. In: Tsapatsoulis, N., et al. Computer Analysis of Images and Patterns. CAIP 2023. Lecture Notes in Computer Science, vol 14185. Springer, Cham. https://doi.org/10.1007/978-3-031-44240-7_25
DOI: https://doi.org/10.1007/978-3-031-44240-7_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44239-1
Online ISBN: 978-3-031-44240-7
eBook Packages: Computer Science, Computer Science (R0)