Temporal Difference Enhancement for High-Resolution Video Frame Interpolation

Authors:

Chongwen Wang

ICMLC '23: Proceedings of the 2023 15th International Conference on Machine Learning and Computing

Pages 433 - 437

https://doi.org/10.1145/3587716.3587788

Published: 07 September 2023

Abstract

Video frame interpolation provides a smoother visual experience by increasing the temporal resolution of videos. To generate intermediate frames, many techniques estimate parameters such as optical flow and occlusion masks directly on the original-resolution images, so processing high-resolution video demands more computing power and longer inference time. This paper proposes a lightweight network for high-resolution video frame interpolation that runs a complete interpolation workflow on low-resolution images to obtain accurate low-resolution optical flow and occlusion masks. To restore the optical flow and masks at the original resolution, we propose an extremely lightweight temporal difference enhancement module that exploits the motion information hidden in the temporal difference between frames. On high-resolution video interpolation, the proposed network achieves performance comparable to current mainstream networks with faster inference. An ablation experiment demonstrates the importance of the temporal difference module.
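
The low-resolution-then-restore idea in the abstract can be illustrated with a toy sketch. This is not the paper's network: the module described above is learned, whereas the code below uses a hand-written threshold on the temporal difference as a stand-in, and the low-resolution flow is supplied by hand. All function names, the threshold, and the nearest-neighbour upsampling are assumptions made for illustration. The one general fact it relies on is that flow vectors measured in pixels must be multiplied by the scale factor when a flow field is upsampled.

```python
from typing import List, Tuple

Image = List[List[float]]                # grayscale frame as rows of intensities
Flow = List[List[Tuple[float, float]]]   # per-pixel (dx, dy) displacement in pixels


def temporal_difference(frame0: Image, frame1: Image) -> Image:
    """Absolute per-pixel difference between two consecutive frames."""
    return [[abs(b - a) for a, b in zip(row0, row1)]
            for row0, row1 in zip(frame0, frame1)]


def downsample(image: Image, factor: int) -> Image:
    """Average-pool by `factor` (assumes both dimensions divide evenly)."""
    height, width = len(image), len(image[0])
    pooled = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [image[y + j][x + i]
                     for j in range(factor) for i in range(factor)]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


def upsample_flow(flow: Flow, factor: int) -> Flow:
    """Nearest-neighbour upsampling; displacements are scaled by `factor`
    because they are expressed in pixels of the target resolution."""
    upsampled = []
    for row in flow:
        wide = [(dx * factor, dy * factor)
                for dx, dy in row for _ in range(factor)]
        upsampled.extend(list(wide) for _ in range(factor))
    return upsampled


def mask_flow_by_difference(flow: Flow, diff: Image, threshold: float) -> Flow:
    """Hypothetical stand-in for a learned enhancement step: keep flow only
    where the full-resolution temporal difference indicates change."""
    return [[vec if d > threshold else (0.0, 0.0)
             for vec, d in zip(flow_row, diff_row)]
            for flow_row, diff_row in zip(flow, diff)]


# A bright pixel moves two columns to the right between 4x4 frames.
frame0 = [[1.0, 0.0, 0.0, 0.0]] + [[0.0] * 4 for _ in range(3)]
frame1 = [[0.0, 0.0, 1.0, 0.0]] + [[0.0] * 4 for _ in range(3)]

diff = temporal_difference(frame0, frame1)   # computed at full resolution
low_res_frame0 = downsample(frame0, 2)       # what a low-res network would see

# Assumed output of a low-resolution flow estimator (one pixel right at 2x2).
low_res_flow: Flow = [[(1.0, 0.0), (0.0, 0.0)],
                      [(0.0, 0.0), (0.0, 0.0)]]

full_flow = upsample_flow(low_res_flow, 2)
refined = mask_flow_by_difference(full_flow, diff, threshold=0.5)
```

In this toy case the upsampled flow smears the motion over a whole 2x2 block, and the full-resolution temporal difference is what localizes it to the pixel that actually changed. A learned module would produce a soft correction rather than a hard mask.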

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published In

ICMLC '23: Proceedings of the 2023 15th International Conference on Machine Learning and Computing

February 2023

619 pages

ISBN:9781450398411

DOI:10.1145/3587716

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICMLC 2023

ICMLC 2023: 2023 15th International Conference on Machine Learning and Computing

February 17 - 20, 2023

Zhuhai, China
