ARShape-Net: Single-View Image Oriented 3D Shape Reconstruction with an Adversarial Refiner

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13069)

Abstract

In this paper, we propose a novel method for reconstructing 3D shapes from single-view images based on an adversarial refiner. A generative adversarial mechanism is adopted between the coarse-volume generation stage and the refinement stage. In the coarse-volume generation stage, a Context Aware Channel Attention Module (CA-CAM) and a VoxFocal Loss are designed to reconstruct coarse-volume shapes as completely as possible. Furthermore, an adversarial refiner is proposed to adaptively optimize the coarse-volume shapes and remove redundant voxels by incorporating a discriminator into the refiner. On the challenging large-scale synthetic-image dataset ShapeNet and the real-world dataset Pix3D, ARShape-Net demonstrates significant quantitative and qualitative improvements over state-of-the-art methods.
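The page does not spell out how the VoxFocal Loss is computed. As a rough illustration only, the sketch below is a minimal voxel-wise adaptation of the focal loss in PyTorch; the function name, parameters, and tensor shapes are assumptions for illustration, not details taken from the paper. The idea is to down-weight easy voxels so that hard, typically incomplete regions dominate the coarse-volume reconstruction objective; in the refinement stage such a reconstruction term would presumably be combined with the adversarial (discriminator) loss.

    import torch

    def voxfocal_loss(pred, target, gamma=2.0, alpha=0.25, eps=1e-7):
        """Hypothetical voxel-wise focal loss (names and defaults are assumptions).

        pred   : (B, D, H, W) predicted occupancy probabilities in [0, 1]
        target : (B, D, H, W) binary ground-truth voxel grid
        """
        pred = pred.clamp(eps, 1.0 - eps)
        # Probability the model assigns to the true class of each voxel.
        p_t = torch.where(target > 0.5, pred, 1.0 - pred)
        # Class-balancing weight: alpha for occupied voxels, 1 - alpha for empty ones.
        alpha_t = torch.where(target > 0.5,
                              torch.full_like(pred, alpha),
                              torch.full_like(pred, 1.0 - alpha))
        # (1 - p_t)^gamma down-weights easy voxels so hard regions drive the gradient.
        return (-alpha_t * (1.0 - p_t) ** gamma * torch.log(p_t)).mean()

    # Example usage with random tensors of matching shape:
    # pred = torch.rand(1, 32, 32, 32)
    # target = (torch.rand(1, 32, 32, 32) > 0.5).float()
    # loss = voxfocal_loss(pred, target)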

This work was supported in part by the National Natural Science Foundation of China under Grant 61762003, Grant 62162001, and Grant 61972121, in part by the CAS "Light of West China" Program, and in part by the Ningxia Excellent Talent Program.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Xu, H., Bai, J. (2021). ARShape-Net: Single-View Image Oriented 3D Shape Reconstruction with an Adversarial Refiner. In: Fang, L., Chen, Y., Zhai, G., Wang, J., Wang, R., Dong, W. (eds.) Artificial Intelligence. CICAI 2021. Lecture Notes in Computer Science, vol. 13069. Springer, Cham. https://doi.org/10.1007/978-3-030-93046-2_54

  • DOI: https://doi.org/10.1007/978-3-030-93046-2_54

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93045-5

  • Online ISBN: 978-3-030-93046-2

  • eBook Packages: Computer Science, Computer Science (R0)
