
FLOT: Scene Flow on Point Clouds Guided by Optimal Transport

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12373)


Abstract

We propose and study a method called FLOT that estimates scene flow on point clouds. We start the design of FLOT by noticing that, in a perfect world, scene flow estimation on point clouds reduces to estimating a permutation matrix. Inspired by recent works on graph matching, we build a method that finds these correspondences by borrowing tools from optimal transport. We then relax the transport constraints to take real-world imperfections into account. The transport cost between two points is given by the pairwise similarity between deep features extracted by a neural network trained under full supervision on synthetic datasets. Our main finding is that FLOT can perform as well as the best existing methods on synthetic and real-world datasets while requiring far fewer parameters and no multiscale analysis. Our second finding is that, on the training datasets considered, most of the performance can be explained by the learned transport cost. This yields a simpler method, FLOT₀, obtained with a particular choice of optimal transport parameters, which performs nearly as well as FLOT.
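The correspondence step the abstract describes — an entropy-regularized optimal transport problem with relaxed marginal constraints, solved by Sinkhorn-style scaling iterations, with a cost built from pairwise feature similarity — can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the cosine-distance cost, the parameter values, and the function names are all illustrative choices.

```python
import numpy as np

def sinkhorn_transport(feat_p, feat_q, eps=0.03, gamma=1.0, n_iters=50):
    """Soft correspondences between two point clouds via relaxed,
    entropy-regularized optimal transport (illustrative sketch).

    feat_p, feat_q : (n, d) and (m, d) per-point feature arrays.
    eps            : entropic regularization strength.
    gamma          : relaxation strength of the marginal (mass) constraints.
    Returns an (n, m) non-negative transport matrix T.
    """
    # Transport cost: pairwise feature dissimilarity (1 - cosine similarity).
    p = feat_p / np.linalg.norm(feat_p, axis=1, keepdims=True)
    q = feat_q / np.linalg.norm(feat_q, axis=1, keepdims=True)
    C = 1.0 - p @ q.T

    K = np.exp(-C / eps)                   # Gibbs kernel
    power = gamma / (gamma + eps)          # exponent from relaxing the marginals
    n, m = K.shape
    a, b = np.ones(n), np.ones(m)
    for _ in range(n_iters):               # Sinkhorn-style scaling iterations
        a = (np.ones(n) / (K @ b)) ** power
        b = (np.ones(m) / (K.T @ a)) ** power
    return a[:, None] * K * b[None, :]

def flow_from_transport(T, pts_p, pts_q):
    """Flow = barycentric target (row-normalized T) minus source point."""
    targets = (T @ pts_q) / T.sum(axis=1, keepdims=True)
    return targets - pts_p
```

With identical clouds the learned cost is zero on the diagonal, the transport matrix concentrates there, and the estimated flow is close to zero, which is the sanity check one would expect from the permutation-matrix view above.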


Notes

  1. Code and pretrained model available at https://github.com/laoreja/HPLFlowNet.

  2. Code and datasets available at https://github.com/xingyul/flownet3d.

  3. Code and pretrained model available at https://github.com/DylanWusee/PointPWC.


Author information

Correspondence to Gilles Puy.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 277 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Puy, G., Boulch, A., Marlet, R. (2020). FLOT: Scene Flow on Point Clouds Guided by Optimal Transport. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020. Lecture Notes in Computer Science, vol. 12373. Springer, Cham. https://doi.org/10.1007/978-3-030-58604-1_32


  • DOI: https://doi.org/10.1007/978-3-030-58604-1_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58603-4

  • Online ISBN: 978-3-030-58604-1

  • eBook Packages: Computer Science (R0)
