Weakly-Supervised 3D Shape Completion in the Wild

Conference paper, published in Computer Vision – ECCV 2020 (ECCV 2020).

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12350)

Abstract

3D shape completion for real data is important but challenging, since partial point clouds acquired by real-world sensors are usually sparse, noisy, and unaligned. Unlike previous methods, we address the problem of learning complete 3D shapes from unaligned, real-world partial point clouds. To this end, we propose a weakly-supervised method that estimates both a 3D canonical shape and a 6-DoF pose for alignment, given multiple partial observations of the same instance. During training, the network jointly optimizes canonical shapes and poses under multi-view geometry constraints; at inference, it can recover the complete shape from a single partial point cloud. Moreover, the learned pose estimation can facilitate partial point cloud registration. Experiments on both synthetic and real data show that it is feasible and promising to learn 3D shape completion from large-scale data without shape or pose supervision.
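The training objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (the network architecture and loss details are in the paper); the function names below are hypothetical, and we only show how a multi-view consistency loss over a shared canonical shape and per-view 6-DoF poses might be assembled, using a one-directional Chamfer distance so that every observed partial point is pulled onto the predicted complete shape.

```python
import numpy as np

def chamfer_partial(partial, complete):
    # One-directional Chamfer: each observed point should lie near the
    # predicted complete shape. The reverse direction is not enforced,
    # because a partial scan covers only part of the surface.
    # partial: (N, 3), complete: (M, 3)
    d = np.linalg.norm(partial[:, None, :] - complete[None, :, :], axis=-1)
    return d.min(axis=1).mean()

def multiview_consistency_loss(canonical, poses, partials):
    # canonical: (M, 3) predicted complete shape in the learned canonical frame
    # poses:     list of (R, t) per view, R: (3, 3) rotation, t: (3,) translation
    # partials:  list of (N_i, 3) partial clouds observed for the same instance
    loss = 0.0
    for (R, t), partial in zip(poses, partials):
        posed = canonical @ R.T + t  # canonical shape placed in view i's frame
        loss += chamfer_partial(partial, posed)
    return loss / len(partials)
```

Because every view of the same instance shares one canonical shape, gradients from all views constrain a single prediction, which is what makes the weak supervision effective. In practice the brute-force pairwise distance would be replaced by a differentiable nearest-neighbor search (e.g. a KD-tree or a GPU Chamfer kernel), and the rotation would come from a continuous parameterization such as the 6D representation of [36].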

Notes

  1. We use the term “weakly-supervised” instead of “unsupervised learning of shape and pose” [14] to avoid confusion; the two terms are in fact equivalent here.

  2. The canonical frame in our method is not predefined, but emerges during training.

  3. The resulting coordinate system is similar to the 3D mask coordinate introduced in [18].

References

  1. Agarwal, S., Snavely, N., Seitz, S.M., Szeliski, R.: Bundle adjustment in the large. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6312, pp. 29–42. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15552-9_3

  2. Behley, J., Garbade, M., Milioto, A., Quenzel, J., Behnke, S., Stachniss, C., Gall, J.: SemanticKITTI: A dataset for semantic scene understanding of LiDAR sequences. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)

  3. Besl, P.J., McKay, N.D.: Method for registration of 3-d shapes. In: Sensor fusion IV: control paradigms and data structures, vol. 1611, pp. 586–606. International Society for Optics and Photonics (1992)

  4. Chang, A.X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su, H., et al.: Shapenet: An information-rich 3d model repository (2015). arXiv preprint arXiv:1512.03012

  5. Chen, X., Chen, B., Mitra, N.J.: Unpaired point cloud completion on real scans using adversarial training (2019). arXiv preprint arXiv:1904.00069

  6. Choy, C.B., Xu, D., Gwak, J.Y., Chen, K., Savarese, S.: 3D-R2N2: A unified approach for single and multi-view 3D object reconstruction. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 628–644. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_38

  7. Dai, A., Chang, A.X., Savva, M., Halber, M., Funkhouser, T., Nießner, M.: ScanNet: Richly-annotated 3D reconstructions of indoor scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5828–5839 (2017)

  8. Dai, A., Ruizhongtai Qi, C., Nießner, M.: Shape completion using 3d-encoder-predictor CNNs and shape synthesis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5868–5877 (2017)

  9. Eigen, D., Puhrsch, C., Fergus, R.: Depth map prediction from a single image using a multi-scale deep network. In: Advances in Neural Information Processing Systems, pp. 2366–2374 (2014)

  10. Fan, H., Su, H., Guibas, L.J.: A point set generation network for 3d object reconstruction from a single image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 605–613 (2017)

  11. Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: The KITTI dataset. Int. J. Robot. Res. (IJRR) (2013)

  12. Giancola, S., Zarzar, J., Ghanem, B.: Leveraging shape completion for 3d siamese tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1359–1368 (2019)

  13. Han, Z., Wang, X., Liu, Y.S., Zwicker, M.: Multi-angle point cloud-VAE: Unsupervised feature learning for 3d point clouds from multiple angles by joint self-reconstruction and half-to-half prediction. arXiv (2019)

  14. Insafutdinov, E., Dosovitskiy, A.: Unsupervised learning of shape and pose with differentiable point clouds. In: Advances in Neural Information Processing Systems, pp. 2802–2812 (2018)

  15. Li, Y., Bu, R., Sun, M., Wu, W., Di, X., Chen, B.: PointCNN: Convolution on x-transformed points. In: Advances in Neural Information Processing Systems, pp. 820–830 (2018)

  16. Manivasagam, S., Wang, S., Wong, K., Zeng, W., Sazanovich, M., Tan, S., Yang, B., Ma, W.C., Urtasun, R.: Lidarsim: Realistic LiDAR simulation by leveraging the real world. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11167–11176 (2020)

  17. Newcombe, R.A., Izadi, S., Hilliges, O., Molyneaux, D., Kim, D., Davison, A.J., Kohi, P., Shotton, J., Hodges, S., Fitzgibbon, A.: Kinectfusion: Real-time dense surface mapping and tracking. In: 2011 10th IEEE International Symposium on Mixed and Augmented Reality, pp. 127–136. IEEE (2011)

  18. Qi, C.R., Liu, W., Wu, C., Su, H., Guibas, L.J.: Frustum pointnets for 3d object detection from RGB-D data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 918–927 (2018)

  19. Qi, C.R., Su, H., Mo, K., Guibas, L.J.: Pointnet: Deep learning on point sets for 3d classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017)

  20. Qi, C.R., Yi, L., Su, H., Guibas, L.J.: Pointnet++: Deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems, pp. 5099–5108 (2017)

  21. Rusu, R.B., Blodow, N., Beetz, M.: Fast point feature histograms (FPFH) for 3d registration. In: 2009 IEEE International Conference on Robotics and Automation, pp. 3212–3217. IEEE (2009)

  22. Stutz, D., Geiger, A.: Learning 3d shape completion from laser scan data with weak supervision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1955–1964 (2018)

  23. Tang, C., Tan, P.: BA-Net: Dense bundle adjustment network (2018). arXiv preprint arXiv:1806.04807

  24. Tchapmi, L.P., Kosaraju, V., Rezatofighi, H., Reid, I., Savarese, S.: TopNet: Structural point cloud decoder. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 383–392 (2019)

  25. Triggs, B., McLauchlan, P.F., Hartley, R.I., Fitzgibbon, A.W.: Bundle adjustment — A modern synthesis. In: Triggs, B., Zisserman, A., Szeliski, R. (eds.) IWVA 1999. LNCS, vol. 1883, pp. 298–372. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-44480-7_21

  26. Tulsiani, S., Efros, A.A., Malik, J.: Multi-view consistency as supervisory signal for learning shape and pose prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2897–2905 (2018)

  27. Tulsiani, S., Zhou, T., Efros, A.A., Malik, J.: Multi-view supervision for single-view reconstruction via differentiable ray consistency. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2626–2634 (2017)

  28. Ummenhofer, B., Zhou, H., Uhrig, J., Mayer, N., Ilg, E., Dosovitskiy, A., Brox, T.: Demon: Depth and motion network for learning monocular stereo. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5038–5047 (2017)

  29. Varley, J., DeChant, C., Richardson, A., Ruales, J., Allen, P.: Shape completion enabled robotic grasping. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2442–2447. IEEE (2017)

  30. Wang, Y., Sun, Y., Liu, Z., Sarma, S.E., Bronstein, M.M., Solomon, J.M.: Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. (TOG) 38(5), 1–12 (2019)

  31. Xiang, Y., Mottaghi, R., Savarese, S.: Beyond PASCAL: A benchmark for 3d object detection in the wild. In: IEEE Winter Conference on Applications of Computer Vision, pp. 75–82. IEEE (2014)

  32. Yan, X., Yang, J., Yumer, E., Guo, Y., Lee, H.: Perspective transformer nets: Learning single-view 3d object reconstruction without 3d supervision. In: Advances in Neural Information Processing Systems, pp. 1696–1704 (2016)

  33. Yuan, W., Khot, T., Held, D., Mertz, C., Hebert, M.: PCN: Point completion network. In: 2018 International Conference on 3D Vision (3DV), pp. 728–737. IEEE (2018)

  34. Zhou, Q.Y., Park, J., Koltun, V.: Open3D: A modern library for 3D data processing (2018). arXiv:1801.09847

  35. Zhou, T., Brown, M., Snavely, N., Lowe, D.G.: Unsupervised learning of depth and ego-motion from video. In: CVPR (2017)

  36. Zhou, Y., Barnes, C., Lu, J., Yang, J., Li, H.: On the continuity of rotation representations in neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5745–5753 (2019)

  37. Zhu, R., Kiani Galoogahi, H., Wang, C., Lucey, S.: Rethinking reprojection: Closing the loop for pose-aware shape reconstruction from a single image. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 57–65 (2017)

Author information

Correspondence to Jiayuan Gu.

Electronic supplementary material

Below are the links to the electronic supplementary material.

Supplementary material 1 (MP4, 14712 KB)

Supplementary material 2 (PDF, 17186 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Gu, J. et al. (2020). Weakly-Supervised 3D Shape Completion in the Wild. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol. 12350. Springer, Cham. https://doi.org/10.1007/978-3-030-58558-7_17

  • DOI: https://doi.org/10.1007/978-3-030-58558-7_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58557-0

  • Online ISBN: 978-3-030-58558-7

  • eBook Packages: Computer Science, Computer Science (R0)
