When Bitstream Prior Meets Deep Prior: Compressed Video Super-resolution with Learning from Decoding

Authors:

Shiqi Wang

MM '20: Proceedings of the 28th ACM International Conference on Multimedia

Pages 1000 - 1008

https://doi.org/10.1145/3394171.3413504

Published: 12 October 2020

Abstract

The standard paradigm of video super-resolution (SR) is to generate a spatially and temporally coherent high-resolution (HR) sequence from the corresponding low-resolution (LR) version that has already been decoded from the bitstream. However, a highly practical yet relatively under-studied alternative is to build SR functionality into the decoder itself, given that almost all videos are stored in a compact, compressed representation. In this paper, we systematically investigate the SR of compressed LR videos by leveraging the interaction between the decoding prior and the deep prior. By fully exploiting the compact video stream information, the proposed bitstream-prior-embedded SR framework achieves compressed video SR and quality enhancement simultaneously in a single feed-forward pass. More specifically, we propose a motion vector guided multi-scale local attention module that explicitly exploits temporal dependency and suppresses coding artifacts at substantially reduced computational complexity. Moreover, a scale-wise deep residual-in-residual network is learned to reconstruct the SR frames from the multi-scale fused features. To facilitate research on compressed video SR, we also build a large-scale dataset of compressed videos with diverse content, together with several kinds of ready-made side information extracted from the bitstream. Both quantitative and qualitative evaluations show that our model achieves superior performance for compressed video SR, and offers competitive performance compared to sequential combinations of state-of-the-art methods for compressed video artifact removal and SR.
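
To make the idea of motion vector guided local attention more concrete, the following is a minimal single-scale sketch in NumPy. It is not the authors' implementation: the function name `mv_guided_local_attention`, the integer-pixel motion vector layout `(dy, dx)`, the dot-product similarity, and the `radius` parameter are all assumptions chosen for illustration. The sketch only shows why decoded motion vectors can reduce cost: each position attends to a small window around its motion-compensated location in the reference frame rather than to the whole frame.

```python
import numpy as np


def mv_guided_local_attention(cur, ref, mv, radius=1):
    """Fuse reference-frame features into the current frame (illustrative only).

    cur, ref : float arrays of shape (H, W, C), feature maps of two frames.
    mv       : int array of shape (H, W, 2), hypothetical per-pixel motion
               vectors (dy, dx) pointing from `cur` into `ref`.
    radius   : half-size of the local search window around the MV target.
    """
    h, w, c = cur.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Window centers: each position displaced by its motion vector.
    cy = ys + mv[..., 0]
    cx = xs + mv[..., 1]

    candidates, scores = [], []
    for oy in range(-radius, radius + 1):
        for ox in range(-radius, radius + 1):
            # Clamp coordinates so windows near the border stay in bounds.
            py = np.clip(cy + oy, 0, h - 1)
            px = np.clip(cx + ox, 0, w - 1)
            patch = ref[py, px]  # gathered features, shape (H, W, C)
            candidates.append(patch)
            # Scaled dot-product similarity between current and candidate.
            scores.append((cur * patch).sum(axis=-1) / np.sqrt(c))

    candidates = np.stack(candidates)  # (K, H, W, C), K = (2*radius + 1)**2
    scores = np.stack(scores)          # (K, H, W)

    # Numerically stable softmax over the K window positions.
    scores = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=0, keepdims=True)

    # Attention output: weighted sum of the candidate features.
    return (weights[..., None] * candidates).sum(axis=0)
```

With `radius=0` the function reduces to plain motion-compensated gathering from the reference frame. Larger radii let the attention weights absorb small motion vector errors, at a cost that grows with the window size rather than with the frame size.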

Supplementary Material

MP4 File (3394171.3413504.mp4, 116.51 MB)

Presentation video of the paper titled "When Bitstream Prior Meets Deep Prior: Compressed Video Super-resolution with Learning from Decoding."

Index Terms

Computing methodologies → Computer graphics → Image manipulation → Image processing

Published In

MM '20: Proceedings of the 28th ACM International Conference on Multimedia

October 2020

4889 pages

ISBN: 9781450379885

DOI: 10.1145/3394171

General Chairs:
Chang Wen Chen (Chinese University of Hong Kong, Shenzhen, China)
Rita Cucchiara (UNIMORE, Italy)
Xian-Sheng Hua (Alibaba Group, China)

Program Chairs:
Guo-Jun Qi (Futurewei Technologies, USA)
Elisa Ricci (UNITN & Fondazione Bruno Kessler, Italy)
Zhengyou Zhang (Tencent, China)
Roger Zimmermann (National University of Singapore, Singapore)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

MM '20

Sponsor: SIGMM

MM '20: The 28th ACM International Conference on Multimedia

October 12 - 16, 2020

Seattle, WA, USA

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
