skip to main content
10.1145/3587716.3587788acmotherconferencesArticle/Chapter ViewAbstractPublication PagesicmlcConference Proceedingsconference-collections
research-article

Temporal Difference Enhancement for High-Resolution Video Frame Interpolation

Published: 07 September 2023 Publication History

Abstract

Video frame interpolation techniques provide a smoother visual experience by enhancing the temporal resolution of videos. To generate intermediate frames, numerous techniques estimate various parameters, such as optical flow and occlusion masks, directly on the original resolution images. As a result, processing high-resolution images requires more computing power and inference time. This paper proposes a lightweight network for high-resolution video frame interpolation that performs a complete interpolation workflow on low-resolution images to provide accurate low-resolution optical flow and occlusion masks. To effectively restore the optical flow and mask of the original resolution image, we propose an extremely lightweight temporal difference enhancement module that makes use of the hidden motion information in the temporal difference to aid in the restoration of optical flow and mask. The proposed network has comparable performance and faster inference speed for high-resolution video interpolation compared to the current mainstream network. The ablation experiment demonstrates the importance of the temporal difference module.

References

[1]
Pierre Charbonnier, Laure Blanc-Feraud, Gilles Aubert, and Michel Barlaud. 1994. Two deterministic half-quadratic regularization algorithms for computed imaging. In Proceedings of 1st International Conference on Image Processing, Vol. 2. IEEE, 168–172.
[2]
Xianhang Cheng and Zhenzhong Chen. 2021. Multiple video frame interpolation via enhanced deformable separable convolution. IEEE Transactions on Pattern Analysis and Machine Intelligence (2021).
[3]
Myungsub Choi, Heewon Kim, Bohyung Han, Ning Xu, and Kyoung Mu Lee. 2020. Channel attention is all you need for video frame interpolation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 10663–10671.
[4]
Duolikun Danier, Fan Zhang, and David Bull. 2021. Spatio-Temporal Multi-Flow Network for Video Frame Interpolation. arXiv preprint arXiv:2111.15483 (2021).
[5]
John Flynn, Ivan Neulander, James Philbin, and Noah Snavely. 2016. Deepstereo: Learning to predict new views from the world’s imagery. In Proceedings of the IEEE conference on computer vision and pattern recognition. 5515–5524.
[6]
Shurui Gui, Chaoyue Wang, Qihua Chen, and Dacheng Tao. 2020. Featureflow: Robust video interpolation via structure-to-texture generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 14004–14013.
[7]
Berthold KP Horn and Brian G Schunck. 1981. Determining optical flow. Artificial intelligence 17, 1-3 (1981), 185–203.
[8]
Takashi Isobe, Xu Jia, Xin Tao, Changlin Li, Ruihuang Li, Yongjie Shi, Jing Mu, Huchuan Lu, and Yu-Wing Tai. 2022. Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 17411–17420.
[9]
Huaizu Jiang, Deqing Sun, Varun Jampani, Ming-Hsuan Yang, Erik Learned-Miller, and Jan Kautz. 2018. Super slomo: High quality estimation of multiple intermediate frames for video interpolation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 9000–9008.
[10]
Lingtong Kong, Boyuan Jiang, Donghao Luo, Wenqing Chu, Xiaoming Huang, Ying Tai, Chengjie Wang, and Jie Yang. 2022. IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1969–1978.
[11]
Hyeongmin Lee, Taeoh Kim, Tae-young Chung, Daehyun Pak, Yuseok Ban, and Sangyoun Lee. 2020. Adacof: Adaptive collaboration of flows for video frame interpolation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5316–5325.
[12]
Ilya Loshchilov and Frank Hutter. 2017. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017).
[13]
Simon Meister, Junhwa Hur, and Stefan Roth. 2018. Unflow: Unsupervised learning of optical flow with a bidirectional census loss. In Proceedings of the AAAI conference on artificial intelligence, Vol. 32.
[14]
Simon Niklaus and Feng Liu. 2020. Softmax splatting for video frame interpolation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5437–5446.
[15]
Simon Niklaus, Long Mai, and Feng Liu. 2017. Video frame interpolation via adaptive separable convolution. In Proceedings of the IEEE International Conference on Computer Vision. 261–270.
[16]
Junheum Park, Keunsoo Ko, Chul Lee, and Chang-Su Kim. 2020. Bmbc: Bilateral motion estimation with bilateral cost volume for video interpolation. In European Conference on Computer Vision. Springer, 109–125.
[17]
Federico Perazzi, Jordi Pont-Tuset, Brian McWilliams, Luc Van Gool, Markus Gross, and Alexander Sorkine-Hornung. 2016. A benchmark dataset and evaluation methodology for video object segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 724–732.
[18]
Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention. Springer, 234–241.
[19]
Hyeonjun Sim, Jihyong Oh, and Munchurl Kim. 2021. Xvfi: Extreme video frame interpolation. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 14489–14498.
[20]
Jingdong Wang, Ke Sun, Tianheng Cheng, Borui Jiang, Chaorui Deng, Yang Zhao, Dong Liu, Yadong Mu, Mingkui Tan, Xinggang Wang, 2020. Deep high-resolution representation learning for visual recognition. IEEE transactions on pattern analysis and machine intelligence 43, 10 (2020), 3349–3364.
[21]
Limin Wang, Zhan Tong, Bin Ji, and Gangshan Wu. 2021. Tdn: Temporal difference networks for efficient action recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1895–1904.
[22]
Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, and Luc Van Gool. 2016. Temporal segment networks: Towards good practices for deep action recognition. In European conference on computer vision. Springer, 20–36.
[23]
Chao-Yuan Wu, Nayan Singhal, and Philipp Krahenbuhl. 2018. Video compression through image interpolation. In Proceedings of the European conference on computer vision (ECCV). 416–431.
[24]
Jin Xin, Wu Longhai, Shen Guotao, Chen Youxin, Chen Jie, Koo Jayoon, and Hahm Cheul-hee. 2022. Enhanced Bi-directional Motion Estimation for Video Frame Interpolation. arXiv preprint arXiv:2206.08572 (2022).
[25]
Tianfan Xue, Baian Chen, Jiajun Wu, Donglai Wei, and William T Freeman. 2019. Video enhancement with task-oriented flow. International Journal of Computer Vision 127, 8 (2019), 1106–1125.

Index Terms

  1. Temporal Difference Enhancement for High-Resolution Video Frame Interpolation

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Other conferences
    ICMLC '23: Proceedings of the 2023 15th International Conference on Machine Learning and Computing
    February 2023
    619 pages
    ISBN:9781450398411
    DOI:10.1145/3587716
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 07 September 2023

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. High-resolution network
    2. Lightweight network
    3. Temporal difference
    4. Video frame interpolation

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICMLC 2023

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • 0
      Total Citations
    • 27
      Total Downloads
    • Downloads (Last 12 months)14
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 20 Jan 2025

    Other Metrics

    Citations

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    HTML Format

    View this article in HTML Format.

    HTML Format

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media