RealFlow: EM-Based Realistic Optical Flow Dataset Generation from Videos

Han, Yunhui; Luo, Kunming; Luo, Ao; Liu, Jiangyu; Fan, Haoqiang; Luo, Guiming; Liu, Shuaicheng

doi:10.1007/978-3-031-19800-7_17


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13679)

Included in the following conference series:

European Conference on Computer Vision


Abstract

Obtaining ground-truth optical flow labels from a video is challenging because manual annotation of pixel-wise flow is prohibitively expensive and laborious. Existing approaches instead adapt models trained on synthetic datasets to real videos, which inevitably suffers from domain discrepancy and hinders performance in real-world applications. To solve these problems, we propose RealFlow, an Expectation-Maximization based framework that can create large-scale optical flow datasets directly from any unlabeled realistic videos. Specifically, we first estimate the optical flow between a pair of video frames and then synthesize a new image from this pair based on the predicted flow. The new image pair and its corresponding flow can thus be regarded as a new training sample. In addition, we design a Realistic Image Pair Rendering (RIPR) module that adopts softmax splatting and bi-directional hole filling to alleviate artifacts in the synthesized images. In the E-step, RIPR renders new images to create a large quantity of training data. In the M-step, we use the generated training data to train an optical flow network, which is then used to estimate optical flow in the next E-step. Over the iterative learning steps, the capability of the flow network gradually improves, and with it the accuracy of the estimated flow and the quality of the synthesized dataset. Experimental results show that RealFlow outperforms previous dataset generation methods by a large margin. Moreover, based on the generated dataset, our approach achieves state-of-the-art performance on two standard benchmarks compared with both supervised and unsupervised optical flow methods. Our code and dataset are available at https://github.com/megvii-research/RealFlow.
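
The E-step/M-step alternation described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `realflow_em` and the callables `estimate_flow`, `render_pair`, and `train` are hypothetical placeholders for flow-network inference, the RIPR rendering step, and network training, and frames and flows are represented as plain lists only to keep the example self-contained.

```python
from typing import Any, Callable, List, Sequence, Tuple

Frame = List[float]                 # stand-in for an image
Flow = List[float]                  # stand-in for a dense flow field
Sample = Tuple[Frame, Frame, Flow]  # (first frame, rendered second frame, flow label)


def realflow_em(
    frame_pairs: Sequence[Tuple[Frame, Frame]],
    estimate_flow: Callable[[Any, Frame, Frame], Flow],  # hypothetical: model inference
    render_pair: Callable[[Frame, Frame, Flow], Frame],  # hypothetical: RIPR-like rendering
    train: Callable[[Any, List[Sample]], Any],           # hypothetical: one training run
    model: Any,
    num_rounds: int = 3,
) -> Tuple[Any, List[Sample]]:
    """Alternate dataset generation (E-step) and network training (M-step)."""
    dataset: List[Sample] = []
    for _ in range(num_rounds):
        # E-step: use the current model to label each frame pair, then render
        # a new second frame so that the predicted flow becomes the label of
        # the (first frame, rendered frame) pair.
        dataset = []
        for frame1, frame2 in frame_pairs:
            flow = estimate_flow(model, frame1, frame2)
            rendered = render_pair(frame1, frame2, flow)
            dataset.append((frame1, rendered, flow))

        # M-step: train the flow network on the freshly generated dataset;
        # the updated model is used in the next E-step.
        model = train(model, dataset)
    return model, dataset
```

Passing the three steps in as callables keeps the loop independent of any particular flow network or renderer; the number of rounds is likewise an assumed parameter, not a value taken from the paper.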



Acknowledgement

This work was supported by the National Natural Science Foundation of China (NSFC) under Grants No. 62173203, No. 61872067 and No. 61720106004.

Author information

Authors and Affiliations

School of Software, Tsinghua University, Beijing, 100084, China
Yunhui Han & Guiming Luo
Megvii Technology, Beijing, China
Kunming Luo, Ao Luo, Jiangyu Liu, Haoqiang Fan & Shuaicheng Liu
University of Electronic Science and Technology of China, Chengdu, China
Shuaicheng Liu


Corresponding author

Correspondence to Shuaicheng Liu.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner



About this paper

Cite this paper

Han, Y. et al. (2022). RealFlow: EM-Based Realistic Optical Flow Dataset Generation from Videos. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13679. Springer, Cham. https://doi.org/10.1007/978-3-031-19800-7_17


DOI: https://doi.org/10.1007/978-3-031-19800-7_17
Published: 09 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19799-4
Online ISBN: 978-3-031-19800-7
eBook Packages: Computer Science, Computer Science (R0)
