Semi-supervised Single-View 3D Reconstruction via Prototype Shape Priors

Xing, Zhen; Li, Hengduo; Wu, Zuxuan; Jiang, Yu-Gang

doi:10.1007/978-3-031-19769-7_31

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13661)

Included in the following conference series: European Conference on Computer Vision

Abstract

The performance of existing single-view 3D reconstruction methods relies heavily on large-scale 3D annotations. However, such annotations are tedious and expensive to collect. Semi-supervised learning offers a way to reduce the need for manual labels, but it remains unexplored in 3D reconstruction. Inspired by the recent success of semi-supervised learning in image classification, we propose SSP3D, a semi-supervised framework for 3D reconstruction. In particular, we introduce an attention-guided prototype shape prior module to guide realistic object reconstruction. We further introduce a discriminator-guided module to incentivize better shape generation, as well as a regularizer to tolerate noisy training samples. On the ShapeNet benchmark, the proposed approach outperforms previous supervised methods by clear margins under various labeling ratios (i.e., 1%, 5%, 10%, and 20%). Moreover, our approach also performs well when transferred to the real-world Pix3D dataset under a labeling ratio of 10%. We also demonstrate that our method can transfer to novel categories given only a few labeled samples from those categories. Experiments on the popular ShapeNet dataset show that our method outperforms the zero-shot baseline by over 12%. We also perform rigorous ablations and analysis to validate our approach. Code is available at https://github.com/ChenHsing/SSP3D.
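To make the labeling ratios quoted in the abstract concrete, the following is a minimal sketch of how a training set might be divided into a labeled and an unlabeled subset at a given ratio. It is an illustration only and is not taken from the SSP3D code: the function name `split_by_labeling_ratio`, the sample identifiers, and the choice of a single global random split (rather than, say, a per-category split) are all assumptions.

```python
import random


def split_by_labeling_ratio(sample_ids, ratio, seed=0):
    """Return (labeled, unlabeled) lists of sample ids.

    `ratio` is the fraction of samples that keep their 3D annotation.
    Hypothetical helper: a plain global random split is assumed here.
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    ids = list(sample_ids)
    # A dedicated, seeded generator keeps the split reproducible.
    rng = random.Random(seed)
    rng.shuffle(ids)
    # Keep at least one labeled sample, even for very small datasets.
    n_labeled = max(1, round(len(ids) * ratio))
    return ids[:n_labeled], ids[n_labeled:]


if __name__ == "__main__":
    # Placeholder identifiers standing in for training images.
    ids = [f"sample_{i:04d}" for i in range(1000)]
    for ratio in (0.01, 0.05, 0.10, 0.20):
        labeled, unlabeled = split_by_labeling_ratio(ids, ratio)
        print(f"{ratio:.0%}: {len(labeled)} labeled, {len(unlabeled)} unlabeled")
```

Under a split of this kind, only the labeled subset contributes ground-truth 3D shapes during training, while the unlabeled subset contributes images alone.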

Acknowledgement

Y.-G. Jiang was sponsored in part by the “Shuguang Program”, supported by the Shanghai Education Development Foundation and the Shanghai Municipal Education Commission (No. 20SG01). Z. Wu was supported by NSFC under Grant No. 62102092.

Author information

Authors and Affiliations

Shanghai Key Laboratory of Intelligent Information Processing, School of CS, Fudan University, Shanghai, China
Zhen Xing, Zuxuan Wu & Yu-Gang Jiang
Shanghai Collaborative Innovation Center on Intelligent Visual Computing, Shanghai, China
Zhen Xing, Zuxuan Wu & Yu-Gang Jiang
University of Maryland, College Park, USA
Hengduo Li

Corresponding author

Correspondence to Zuxuan Wu.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

Cite this paper

Xing, Z., Li, H., Wu, Z., Jiang, Y.-G. (2022). Semi-supervised Single-View 3D Reconstruction via Prototype Shape Priors. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13661. Springer, Cham. https://doi.org/10.1007/978-3-031-19769-7_31

DOI: https://doi.org/10.1007/978-3-031-19769-7_31
Published: 23 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19768-0
Online ISBN: 978-3-031-19769-7
eBook Packages: Computer Science, Computer Science (R0)
