Research article
DOI: 10.1145/3394171.3413504

When Bitstream Prior Meets Deep Prior: Compressed Video Super-resolution with Learning from Decoding

Published: 12 October 2020

Abstract

The standard paradigm of video super-resolution (SR) is to generate a spatio-temporally coherent high-resolution (HR) sequence from the corresponding low-resolution (LR) version after it has been decoded from the bitstream. However, a highly practical yet relatively under-studied alternative is to build SR functionality into the decoder itself, since almost all videos are stored in a compact, compressed representation. In this paper, we systematically investigate the SR of compressed LR videos by leveraging the interaction between the decoding prior and the deep prior. By fully exploiting the information in the compact video stream, the proposed bitstream-prior-embedded SR framework achieves compressed video SR and quality enhancement simultaneously in a single feed-forward pass. More specifically, we propose a motion-vector-guided multi-scale local attention module that explicitly exploits temporal dependency and suppresses coding artifacts at substantially reduced computational complexity. Moreover, a scale-wise deep residual-in-residual network is learned to reconstruct the SR frames from the multi-scale fused features. To facilitate research on compressed video SR, we also build a large-scale dataset of compressed videos with diverse content, including ready-made side information of various kinds extracted from the bitstream. Both quantitative and qualitative evaluations show that our model achieves superior performance for compressed video SR and offers competitive performance compared with sequential combinations of state-of-the-art methods for compressed video artifact removal and SR.
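To make the idea of motion-vector-guided local attention more concrete, the following is a minimal single-scale sketch in plain Python. It is not the authors' implementation: the function name `mv_guided_local_attention`, the integer per-pixel motion vectors, the dot-product scoring, and the `radius` parameter are all assumptions chosen for illustration, and the multi-scale fusion described in the abstract is omitted. It shows only the general principle that a decoded motion vector can point each location to a small window in the reference frame, so that attention is computed over (2·radius+1)² candidates rather than over every position in the frame.

```python
import math


def mv_guided_local_attention(cur, ref, mvs, radius=1):
    """Illustrative sketch (hypothetical helper, not the paper's module).

    cur, ref: H x W grids of feature vectors (lists of floats).
    mvs:      H x W grid of integer (dy, dx) motion vectors, assumed to
              come from the bitstream and to point into the reference frame.
    Returns an H x W grid of attended reference features.
    """
    height, width = len(cur), len(cur[0])
    channels = len(cur[0][0])
    scale = 1.0 / math.sqrt(channels)
    out = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = mvs[y][x]
            # Motion-compensated window centre, clamped to the frame.
            cy = min(max(y + dy, 0), height - 1)
            cx = min(max(x + dx, 0), width - 1)
            query = cur[y][x]
            keys, scores = [], []
            for ny in range(cy - radius, cy + radius + 1):
                for nx in range(cx - radius, cx + radius + 1):
                    if 0 <= ny < height and 0 <= nx < width:
                        key = ref[ny][nx]
                        keys.append(key)
                        scores.append(
                            scale * sum(q * k for q, k in zip(query, key))
                        )
            # Numerically stable softmax over the local window only.
            peak = max(scores)
            weights = [math.exp(s - peak) for s in scores]
            total = sum(weights)
            out[y][x] = [
                sum(w * key[c] for w, key in zip(weights, keys)) / total
                for c in range(channels)
            ]
    return out
```

With `radius=0` the window collapses to the single motion-compensated position, so the function reduces to plain motion-compensated warping. Larger radii let the attention weights tolerate imprecise motion vectors while keeping the per-pixel cost independent of the frame size.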

Supplementary Material

MP4 File (3394171.3413504.mp4)
Presentation video of the paper titled "When Bitstream Prior Meets Deep Prior: Compressed Video Super-resolution with Learning from Decoding."

    Published In

    MM '20: Proceedings of the 28th ACM International Conference on Multimedia
    October 2020
    4889 pages
    ISBN: 9781450379885
    DOI: 10.1145/3394171

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. coding bitstream prior
    2. compressed video
    3. deep learning
    4. video super-resolution

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
