Abstract
The purpose of video inpainting is to synthesize plausible content to fill in the missing regions of a video. A video can be viewed as a four-dimensional signal that extends continuously along the temporal dimension, so inpainting each frame independently makes it difficult to ensure temporal continuity. Video inpainting has progressed from traditional algorithms to learning-based methods and can now handle a variety of scenes. Nevertheless, open problems remain, and video inpainting is still a challenging task. Existing work focuses on object removal and neglects the inpainting of occlusions in the middle region of the frame. For this middle-region occlusion problem, we propose a local and nonlocal optical flow video inpainting framework. First, according to the forward and backward directions of the reference frame and a sampling window, we divide the video into local and nonlocal frames, extract the local and nonlocal optical flow, and feed them to a residual network for coarse completion. Next, our approach extracts and completes the edges of the predicted flow. Finally, the composed optical flow field guides the propagation of pixels to fill in the video content. Experimental results on the DAVIS and YouTube-VOS datasets show that our method significantly improves image quality and optical flow quality compared with the state of the art. Code is available at https://github.com/lengfengio/LNFVI.git.
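To make the sampling scheme concrete, below is a minimal Python sketch (not the authors' released code) of how a video might be partitioned into local and nonlocal frames around a reference frame. The function name and the parameters local_radius and nonlocal_stride are illustrative assumptions, not values taken from the paper.

def partition_frames(num_frames, ref_idx, local_radius=2, nonlocal_stride=10):
    """Return (local, nonlocal) frame indices for a given reference frame.

    Local frames lie in a sampling window around the reference frame
    (its forward and backward neighbors); nonlocal frames are sampled
    from the rest of the video at a fixed temporal stride.
    """
    local = [t for t in range(max(0, ref_idx - local_radius),
                              min(num_frames, ref_idx + local_radius + 1))
             if t != ref_idx]
    nonlocal_frames = [t for t in range(0, num_frames, nonlocal_stride)
                       if t != ref_idx and t not in local]
    return local, nonlocal_frames

# Example: a 50-frame clip with reference frame 25.
local, nonloc = partition_frames(50, 25)
print(local)   # [23, 24, 26, 27]
print(nonloc)  # [0, 10, 20, 30, 40]

Under this reading, the local window supplies temporally adjacent motion cues for the forward and backward flow, while the sparsely sampled nonlocal frames expose content that stays occluded throughout the local window.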
Availability of data and materials
We evaluated our method on publicly available datasets. The DAVIS dataset is available at https://davischallenge.org/davis2017/code.html and the YouTube-VOS dataset at https://youtube-vos.org/dataset
Code availability
We will upload the project to GitHub in the next few months.
Funding
This work was supported by the Fundamental Research Funds for the Universities of Henan Province (NSFRF220414) and the Excellent Young Teachers Program of Henan Polytechnic University (No. 2019XQG-02).
Author information
Contributions
All authors took part in the discussion of the work described in this paper. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable
Consent for publication
Not applicable
Conflict of interest
The authors declare that they have no conflict of interest.