DOI: 10.1145/3664647.3681006
Research article, MM '24 Conference Proceedings

Learning to Handle Large Obstructions in Video Frame Interpolation

Published: 28 October 2024

Abstract

Video frame interpolation (VFI) based on optical flow has made great progress in recent years. Most previous studies have focused on improving interpolation quality for clean videos. However, many real-world videos contain large obstructions that make the video content discontinuous. To address this challenge, we propose the Obstruction Robustness Framework (ORF), which enhances the robustness of existing VFI networks in the face of large obstructions. The ORF contains two components: (1) a feature repair module that first captures ambiguous pixels in the synthesized frame via a region similarity map and then repairs them with a cross-overlap attention module; (2) a data augmentation strategy that enables the network to handle dynamic obstructions without extra data. To the best of our knowledge, this is the first work that explicitly addresses errors caused by large obstructions in video frame interpolation. Using previous state-of-the-art methods as backbones, our method not only improves results on the original benchmarks but also significantly enhances interpolation quality for videos with obstructions.
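The abstract mentions an augmentation strategy that simulates dynamic obstructions without collecting extra data. The paper's exact procedure is not given here; the sketch below is only a hypothetical illustration of the general idea, pasting a flat-colored occluder at different positions in two consecutive frames so that the network sees an "obstruction" that moves between inputs. The function name and parameters are assumptions, not the authors' API.

```python
import numpy as np

def add_dynamic_obstruction(frame0, frame1, size=32, rng=None):
    """Paste a flat-colored square at different positions in two
    consecutive frames, simulating a moving obstruction.

    Hypothetical illustration only: the paper's actual augmentation
    strategy is not specified in this abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = frame0.shape[:2]
    color = rng.integers(0, 256, size=3, dtype=np.uint8)  # random occluder color
    out0, out1 = frame0.copy(), frame1.copy()
    for out in (out0, out1):
        # independent position per frame -> the occluder "moves"
        y = rng.integers(0, h - size)
        x = rng.integers(0, w - size)
        out[y:y + size, x:x + size] = color
    return out0, out1

# usage on a pair of dummy frames
frames = np.zeros((2, 128, 128, 3), dtype=np.uint8)
a, b = add_dynamic_obstruction(frames[0], frames[1], size=32)
```

In a real training pipeline such a transform would be applied on the fly to input frame pairs while the ground-truth intermediate frame is left clean, so the network learns to ignore the obstruction rather than interpolate it.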



Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
October 2024, 11719 pages
ISBN: 9798400706868
DOI: 10.1145/3664647

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. video frame interpolation
2. cross-attention
3. occlusion/obstruction handling


Conference

MM '24: The 32nd ACM International Conference on Multimedia
October 28 - November 1, 2024, Melbourne VIC, Australia

Acceptance Rates

MM '24 paper acceptance rate: 1,150 of 4,385 submissions (26%)
Overall acceptance rate: 2,145 of 8,556 submissions (25%)
