Abstract
We propose and study a method called FLOT that estimates scene flow on point clouds. We start the design of FLOT by noticing that, in a perfect world, scene flow estimation on point clouds reduces to estimating a permutation matrix. Inspired by recent works on graph matching, we build a method that finds these correspondences by borrowing tools from optimal transport. We then relax the transport constraints to account for real-world imperfections. The transport cost between two points is given by the pairwise similarity between deep features extracted by a neural network trained under full supervision on synthetic datasets. Our main finding is that FLOT can perform as well as the best existing methods on synthetic and real-world datasets while requiring far fewer parameters and without using multiscale analysis. Our second finding is that, on the training datasets considered, most of the performance can be explained by the learned transport cost. This yields a simpler method, FLOT\(_0\), obtained with a particular choice of optimal transport parameters, which performs nearly as well as FLOT.
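The matching step described above can be illustrated with entropy-regularized optimal transport solved by Sinkhorn iterations. The sketch below is not the paper's implementation; function names, the regularization value, and the cosine-similarity cost are illustrative assumptions. It shows how a soft correspondence matrix between two point sets is obtained from a feature-based cost, approaching a (scaled) permutation matrix when the features match well.

```python
import numpy as np

def sinkhorn(C, eps=0.03, n_iters=50):
    """Entropy-regularized optimal transport between two point sets.

    C : (n, m) cost matrix, e.g. 1 - cosine similarity of deep features.
    Returns a soft correspondence (transport) matrix T of shape (n, m).
    """
    n, m = C.shape
    K = np.exp(-C / eps)          # Gibbs kernel
    a = np.full(n, 1.0 / n)       # uniform mass on source points
    b = np.full(m, 1.0 / m)       # uniform mass on target points
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):      # alternate the two marginal scalings
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

# Toy usage: 4 source and 4 target points with near-identical features.
rng = np.random.default_rng(0)
f1 = rng.standard_normal((4, 8))
f2 = f1 + 0.01 * rng.standard_normal((4, 8))
C = 1.0 - (f1 @ f2.T) / (
    np.linalg.norm(f1, axis=1)[:, None] * np.linalg.norm(f2, axis=1)[None, :]
)
T = sinkhorn(C)
# Each row of T concentrates its mass on the matching target point,
# approximating a scaled permutation matrix in this ideal case.
```

The paper's relaxation of the transport constraints (to handle occlusions and sampling differences between the two clouds) would replace the exact marginal scalings above with softened ones, in the spirit of unbalanced transport.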
Notes
1. Code and pretrained model available at https://github.com/laoreja/HPLFlowNet.
2. Code and datasets available at https://github.com/xingyul/flownet3d.
3. Code and pretrained model available at https://github.com/DylanWusee/PointPWC.
References
Basha, T., Moses, Y., Kiryati, N.: Multi-view scene flow estimation: a view centered variational approach. In: Conference on Computer Vision and Pattern Recognition, pp. 1506–1513. IEEE (2010)
Battrawy, R., Schuster, R., Wasenmüller, O., Rao, Q., Stricker, D.: LiDAR-Flow: dense scene flow estimation from sparse LiDAR and stereo images. In: International Conference on Intelligent Robots and Systems, pp. 7762–7769. IEEE (2019)
Baur, S.A., Moosmann, F., Wirges, S., Rist, C.B.: Real-time 3D LiDAR flow for autonomous vehicles. In: Intelligent Vehicles Symposium, pp. 1288–1295. IEEE (2019)
Behl, A., Paschalidou, D., Donné, S., Geiger, A.: PointFlowNet: learning representations for rigid motion estimation from point clouds. In: Conference on Computer Vision and Pattern Recognition, pp. 7962–7971. IEEE (2019)
Chen, Y., Pock, T.: Trainable nonlinear reaction diffusion: a flexible framework for fast and effective image restoration. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1256–1272 (2017)
Chizat, L., Peyré, G., Schmitzer, B., Vialard, F.X.: Scaling algorithms for unbalanced transport problems. Math. Comput. 87, 2563–2609 (2018)
Cuturi, M.: Sinkhorn distances: lightspeed computation of optimal transport. In: Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, pp. 2292–2300. Curran Associates, Inc. (2013)
Dewan, A., Caselitz, T., Tipaldi, G.D., Burgard, W.: Rigid scene flow for 3D LiDAR scans. In: International Conference on Intelligent Robots and Systems (IROS), pp. 1765–1770. IEEE (2016)
Genevay, A., Peyré, G., Cuturi, M.: Learning generative models with Sinkhorn divergences. In: Storkey, A., Perez-Cruz, F. (eds.) International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, vol. 84, pp. 1608–1617. PMLR (2018)
Gregor, K., LeCun, Y.: Learning fast approximations of sparse coding. In: International Conference on Machine Learning, pp. 399–406 (2010)
Gu, X., Wang, Y., Wu, C., Lee, Y.J., Wang, P.: HPLFlowNet: hierarchical permutohedral lattice FlowNet for scene flow estimation on large-scale point clouds. In: Conference on Computer Vision and Pattern Recognition, pp. 3249–3258. IEEE (2019)
Hadfield, S., Bowden, R.: Kinecting the dots: particle based scene flow from depth sensors. In: International Conference on Computer Vision, pp. 2290–2295. IEEE (2011)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations (2015)
Liu, J., Sun, Y., Eldeniz, C., Gan, W., An, H., Kamilov, U.S.: RARE: image reconstruction using deep priors learned without ground truth. J. Sel. Top. Signal Process. 14(6), 1088–1099 (2020)
Liu, X., Qi, C.R., Guibas, L.J.: FlowNet3D: learning scene flow in 3D point clouds. In: Conference on Computer Vision and Pattern Recognition, pp. 529–537. IEEE (2019)
Ma, W.C., Wang, S., Hu, R., Xiong, Y., Urtasun, R.: Deep rigid instance scene flow. In: Conference on Computer Vision and Pattern Recognition, pp. 3609–3617. IEEE (2019)
Mardani, M., et al.: Neural proximal gradient descent for compressive imaging. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems, pp. 9573–9583. Curran Associates, Inc. (2018)
Maretic, H.P., Gheche, M.E., Chierchia, G., Frossard, P.: GOT: an optimal transport framework for graph comparison. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, pp. 13876–13887. Curran Associates, Inc. (2019)
Mayer, N., et al.: A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In: Conference on Computer Vision and Pattern Recognition, pp. 4040–4048. IEEE (2016)
Meinhardt, T., Moller, M., Hazirbas, C., Cremers, D.: Learning proximal operators: using denoising networks for regularizing inverse imaging problems. In: International Conference on Computer Vision, pp. 1799–1808. IEEE (2017)
Mémoli, F.: Gromov-Wasserstein distances and the metric approach to object matching. Found. Comput. Math. 11(4), 417–487 (2011)
Menze, M., Heipke, C., Geiger, A.: Joint 3D estimation of vehicles and scene flow. In: ISPRS Workshop on Image Sequence Analysis (2015)
Menze, M., Heipke, C., Geiger, A.: Object scene flow. ISPRS J. Photogrammetry Remote Sens. 140, 60–76 (2018)
Metzler, C., Mousavi, A., Baraniuk, R.: Learned D-AMP: principled neural network based compressive image recovery. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, pp. 1772–1783. Curran Associates, Inc. (2017)
Mittal, H., Okorn, B., Held, D.: Just go with the flow: self-supervised scene flow estimation. In: Conference on Computer Vision and Pattern Recognition. IEEE (2020)
Mousavi, A., Baraniuk, R.G.: Learning to invert: signal recovery via deep convolutional networks. In: International Conference on Acoustics, Speech and Signal Processing, pp. 2272–2276. IEEE (2017)
Nikolentzos, G., Meladianos, P., Vazirgiannis, M.: Matching node embeddings for graph similarity. In: AAAI Conference on Artificial Intelligence, pp. 2429–2435 (2017)
Peyré, G., Cuturi, M.: Computational optimal transport: with applications to data science. Found. Trends Mach. Learn. 11(5–6), 355–607 (2019)
Peyré, G., Cuturi, M., Solomon, J.: Gromov-Wasserstein averaging of kernel and distance matrices. In: Balcan, M.F., Weinberger, K.Q. (eds.) International Conference on Machine Learning, vol. 48, pp. 2664–2672. PMLR (2016)
Qi, C.R., Yi, L., Su, H., Guibas, L.J.: PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, pp. 5099–5108. Curran Associates, Inc. (2017)
Sarlin, P.E., DeTone, D., Malisiewicz, T., Rabinovich, A.: SuperGlue: learning feature matching with graph neural networks. In: Conference on Computer Vision and Pattern Recognition. IEEE (2020)
Shao, L., Shah, P., Dwaracherla, V., Bohg, J.: Motion-based object segmentation based on dense RGB-D scene flow. Robot. Autom. Lett. 3(4), 3797–3804 (2018)
Sun, D., Yang, X., Liu, M.Y., Kautz, J.: PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume. In: Conference on Computer Vision and Pattern Recognition, pp. 8934–8943. IEEE (2018)
Titouan, V., Courty, N., Tavenard, R., Laetitia, C., Flamary, R.: Optimal transport for structured data with application on graphs. In: Chaudhuri, K., Salakhutdinov, R. (eds.) International Conference on Machine Learning, vol. 97, pp. 6275–6284. PMLR (2019)
Ushani, A.K., Wolcott, R.W., Walls, J.M., Eustice, R.M.: A learning approach for real-time temporal scene flow estimation from LIDAR data. In: International Conference on Robotics and Automation, pp. 5666–5673. IEEE (2017)
Ushani, A.K., Eustice, R.M.: Feature learning for scene flow estimation from LIDAR. In: Billard, A., Dragan, A., Peters, J., Morimoto, J. (eds.) Conference on Robot Learning. Proceedings of Machine Learning Research, vol. 87, pp. 283–292. PMLR (2018)
Vaswani, A., et al.: Attention is all you need. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems 30, pp. 5998–6008. Curran Associates, Inc. (2017)
Vedula, S., Baker, S., Rander, P., Collins, R., Kanade, T.: Three-dimensional scene flow. In: International Conference on Computer Vision, vol. 2, pp. 722–729. IEEE (1999)
Vogel, C., Schindler, K., Roth, S.: Piecewise rigid scene flow. In: International Conference on Computer Vision, pp. 1377–1384. IEEE (2013)
Wang, S., Suo, S., Ma, W.C., Pokrovsky, A., Urtasun, R.: Deep parametric continuous convolutional neural networks. In: Conference on Computer Vision and Pattern Recognition, pp. 2589–2597. IEEE (2018)
Wang, S., Fidler, S., Urtasun, R.: Proximal deep structured models. In: Lee, D.D., Sugiyama, M., Luxburg, U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems, pp. 865–873. Curran Associates, Inc. (2016)
Wang, X., Jabri, A., Efros, A.A.: Learning correspondence from the cycle-consistency of time. In: Conference on Computer Vision and Pattern Recognition, pp. 2566–2576. IEEE (2019)
Wang, Y., Solomon, J.M.: Deep closest point: learning representations for point cloud registration. In: International Conference on Computer Vision, pp. 3522–3531. IEEE (2019)
Wang, Y., Solomon, J.M.: PRNet: self-supervised learning for partial-to-partial registration. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, pp. 8814–8826. Curran Associates, Inc. (2019)
Wedel, A., Rabe, C., Vaudrey, T., Brox, T., Franke, U., Cremers, D.: Efficient dense scene flow from sparse or dense stereo data. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) European Conference on Computer Vision, pp. 739–751. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88682-2_56
Wu, W., Wang, Z., Li, Z., Liu, W., Fuxin, L.: PointPWC-Net: a coarse-to-fine network for supervised and self-supervised scene flow estimation on 3D point clouds. arXiv:1911.12408v1 (2019)
Zou, C., He, B., Zhu, M., Zhang, L., Zhang, J.: Learning motion field of LiDAR point cloud with convolutional networks. Pattern Recogn. Lett. 125, 514–520 (2019)
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Puy, G., Boulch, A., Marlet, R. (2020). FLOT: Scene Flow on Point Clouds Guided by Optimal Transport. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12373. Springer, Cham. https://doi.org/10.1007/978-3-030-58604-1_32
DOI: https://doi.org/10.1007/978-3-030-58604-1_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58603-4
Online ISBN: 978-3-030-58604-1
eBook Packages: Computer Science (R0)