Multi-scale feature and correlation volumes for Video Interpolation

Authors:

Yueli Hu

ICVIP '23: Proceedings of the 2023 7th International Conference on Video and Image Processing

Pages 69 - 75

https://doi.org/10.1145/3639390.3639400

Published: 05 February 2024

Abstract

Video frame interpolation involves synthesizing intermediate frames between two consecutive frames to enhance the smoothness of a video. Nevertheless, generating high-quality interpolated frames in videos featuring substantial motion and complex scenes remains a formidable challenge. To produce superior quality frames, this paper introduces an interpolation method based on multi-scale features and correlation volumes. The multi-scale feature connects the deep features of high-resolution frames with the shallow features of lower-resolution frames, thereby increasing the number of available pixels and feature details for motion analysis. Correlation volumes are employed to construct correlation features for all pairs of pixels, which are utilized to refine the underlying optical flow field. We propose a unified network approach that eliminates the need for additional complex optical flow network integration, simplifying the training process. The experimental results demonstrate that this method outperforms the baseline approach in both objective and subjective evaluations across various datasets. Particularly, this method exhibits advantages on datasets characterized by complex backgrounds and large motions.
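The abstract does not give the formula used to build the correlation volumes, so the following is only a minimal sketch of one common formulation from the optical flow literature: the correlation between two pixels is the dot product of their feature vectors, scaled by the square root of the channel count, and the volume is then average-pooled to obtain coarser scales. The function names `all_pairs_correlation` and `correlation_pyramid`, the scaling factor, and the pooling scheme are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def all_pairs_correlation(feat1, feat2):
    """Correlate every pixel of feat1 with every pixel of feat2.

    feat1, feat2: feature maps of shape (C, H, W).
    Returns a 4D volume of shape (H, W, H, W), where entry
    [y1, x1, y2, x2] is the scaled dot product of the feature
    vectors at (y1, x1) in feat1 and (y2, x2) in feat2.
    The 1/sqrt(C) scaling is an assumption, not from the paper.
    """
    c, h, w = feat1.shape
    f1 = feat1.reshape(c, h * w)
    f2 = feat2.reshape(c, h * w)
    corr = f1.T @ f2 / np.sqrt(c)  # shape (H*W, H*W)
    return corr.reshape(h, w, h, w)


def correlation_pyramid(corr, levels=3):
    """Build coarser volumes by 2x2 average pooling of the last two axes.

    This pooling scheme is a hypothetical choice for illustration.
    Odd trailing rows or columns are cropped before pooling.
    """
    pyramid = [corr]
    for _ in range(levels - 1):
        h, w, h2, w2 = pyramid[-1].shape
        cropped = pyramid[-1][:, :, : h2 // 2 * 2, : w2 // 2 * 2]
        pooled = cropped.reshape(h, w, h2 // 2, 2, w2 // 2, 2).mean(axis=(3, 5))
        pyramid.append(pooled)
    return pyramid
```

For two feature maps of shape (C, H, W), the full volume holds (H·W)² entries, which is why such volumes are usually built on downsampled features rather than on full-resolution frames.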

Index Terms

Computing methodologies → Artificial intelligence → Computer vision → Computer vision tasks

Information

Published In

ICVIP '23: Proceedings of the 2023 7th International Conference on Video and Image Processing

December 2023

97 pages

ISBN:9798400709388

DOI:10.1145/3639390

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICVIP 2023

ICVIP 2023: 2023 the 7th International Conference on Video and Image Processing

December 14 - 17, 2023

Kyoto, Japan
