DOI: 10.1145/3639390.3639400
research-article

Multi-scale feature and correlation volumes for Video Interpolation

Published: 05 February 2024

Abstract

Video frame interpolation synthesizes intermediate frames between two consecutive frames to make a video smoother. Generating high-quality interpolated frames remains challenging, however, for videos with substantial motion and complex scenes. To produce higher-quality frames, this paper introduces an interpolation method based on multi-scale features and correlation volumes. The multi-scale feature connects the deep features of high-resolution frames with the shallow features of lower-resolution frames, thereby increasing the number of available pixels and feature details for motion analysis. Correlation volumes are used to construct correlation features for all pairs of pixels, and these features are used to refine the underlying optical flow field. We propose a unified network that removes the need to integrate an additional, complex optical flow network, which simplifies training. The experimental results show that the method outperforms the baseline approach in both objective and subjective evaluations across various datasets. In particular, the method shows advantages on datasets characterized by complex backgrounds and large motions.
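To make the idea of correlation features for all pairs of pixels concrete, the following is a minimal sketch of an all-pairs correlation volume in plain Python. It is not the paper's implementation: the abstract does not give the exact formulation, so the dot-product similarity, the 1/sqrt(C) scaling, the nested-list layout `[H][W][C]`, and the names `correlation_volume` and `best_match` are all assumptions made for illustration.

```python
import math


def correlation_volume(feat1, feat2):
    """All-pairs correlation between two feature maps (illustrative sketch).

    feat1, feat2: nested lists shaped [H][W][C], one C-dim vector per pixel.
    Returns corr[i][j][k][l] = dot(feat1[i][j], feat2[k][l]) / sqrt(C).
    The dot product and the scaling are assumptions, not taken from the paper.
    """
    h1, w1 = len(feat1), len(feat1[0])
    h2, w2 = len(feat2), len(feat2[0])
    scale = 1.0 / math.sqrt(len(feat1[0][0]))
    return [
        [
            [
                [
                    scale * sum(a * b for a, b in zip(feat1[i][j], feat2[k][l]))
                    for l in range(w2)
                ]
                for k in range(h2)
            ]
            for j in range(w1)
        ]
        for i in range(h1)
    ]


def best_match(corr, i, j):
    """Return the (k, l) position in frame 2 most correlated with pixel (i, j)."""
    scores = corr[i][j]
    candidates = ((k, l) for k in range(len(scores)) for l in range(len(scores[0])))
    return max(candidates, key=lambda kl: scores[kl[0]][kl[1]])


# Toy 2x2 feature map with one distinct one-hot feature per pixel (hypothetical data).
feat1 = [
    [[1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, 1]],
]
# Frame 2 is frame 1 mirrored horizontally, so every pixel moves to the other column.
feat2 = [list(reversed(row)) for row in feat1]

corr = correlation_volume(feat1, feat2)
print(best_match(corr, 0, 0))  # position in frame 2 that matches pixel (0, 0)
```

In this toy case the highest-scoring entry for each pixel points at its mirrored position, which is the kind of matching signal a flow-refinement step could read from the volume. A practical version would compute the same quantity with batched tensor operations rather than nested loops.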

    Published In

    ICVIP '23: Proceedings of the 2023 7th International Conference on Video and Image Processing
    December 2023
    97 pages
    ISBN: 9798400709388
    DOI: 10.1145/3639390

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Correlation volumes
    2. Flow
    3. Multi-scale feature

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICVIP 2023
