Neural-Network-Based Cross-Channel Intra Prediction

Authors:

Houqiang Li

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 17, Issue 3

Article No.: 77, Pages 1 - 23

https://doi.org/10.1145/3434250

Published: 22 July 2021

Abstract

To reduce the redundancy among different color channels, e.g., YUV, previous methods usually adopt a linear model, which tends to be too simple for complex image content. We propose a neural-network-based method for cross-channel prediction in intra frame coding. The proposed network utilizes twofold cues: the neighboring reconstructed samples of all channels, and the co-located reconstructed samples of a subset of the channels. Specifically, for YUV video coding, the neighboring YUV samples are processed by several fully connected layers; the co-located Y samples are processed by convolutional layers; and the network fuses the two cues. We observe that integrating both sources of information is crucial to the intra prediction performance of the chroma components. We have designed the network architecture to achieve a good balance between compression performance and computational efficiency. Moreover, we propose a transform domain loss for training the network. The transform domain loss helps obtain more compact representations of the residuals in the transform domain, leading to higher compression efficiency. The proposed method is integrated into the HEVC and VVC test models to evaluate its effectiveness. Experimental results show that our method provides more accurate cross-channel intra prediction than previous methods. On top of HEVC, our method achieves on average 1.3%, 5.4%, and 3.8% BD-rate reductions for Y, Cb, and Cr on common test sequences, and on average 3.8%, 11.3%, and 9.0% BD-rate reductions for Y, Cb, and Cr on ultra-high-definition test sequences. On top of VVC, our method achieves on average 0.5%, 1.7%, and 1.3% BD-rate reductions for Y, Cb, and Cr on common test sequences.
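The idea of a transform domain loss can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below makes its own assumptions: square blocks, an orthonormal 2D DCT-II as the transform, and the mean absolute value of the coefficients as the penalty. The function names `dct_matrix` and `transform_domain_loss` are hypothetical and are not taken from the article.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n (rows are frequencies)."""
    k = np.arange(n).reshape(-1, 1)  # frequency index, one per row
    i = np.arange(n).reshape(1, -1)  # sample index, one per column
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)       # the DC row uses a different scale factor
    return m

def transform_domain_loss(pred, target):
    """Mean absolute DCT coefficient of the prediction residual.

    Assumption (not from the article): an L1-style penalty on the
    2D DCT of the residual of a square block.
    """
    residual = target - pred
    d = dct_matrix(residual.shape[0])
    coeffs = d @ residual @ d.T      # separable 2D DCT of the residual
    return np.abs(coeffs).mean()

# Usage: a flat residual concentrates all its energy in the DC coefficient.
pred = np.zeros((4, 4))
target = np.full((4, 4), 2.0)
print(transform_domain_loss(pred, target))
```

Under these assumptions, a residual whose energy is concentrated in a few coefficients scores lower than a residual of similar pixel-domain error spread across many coefficients, which is the intuition behind preferring compact transform-domain representations.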

Index Terms

Neural-Network-Based Cross-Channel Intra Prediction
1. Computing methodologies
  1. Computer graphics
    1. Image manipulation
      1. Image processing

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 17, Issue 3

August 2021

443 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3476118

Editor:
Alberto Del Bimbo
University of Firenze, Italy

Copyright © 2021 Association for Computing Machinery.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 July 2021

Accepted: 01 November 2020

Revised: 01 September 2020

Received: 01 May 2020

Qualifiers

Research-article
Refereed

Funding Sources

Natural Science Foundation of China
