ARShape-Net: Single-View Image Oriented 3D Shape Reconstruction with an Adversarial Refiner

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13069)

Abstract

In this paper, we propose a novel method for reconstructing 3D shapes from single-view images based on an adversarial refiner. A generative adversarial mechanism is adopted between the coarse-volume generation stage and the refinement stage. In the coarse-volume generation stage, a Context Aware Channel Attention Module (CA-CAM) and a VoxFocal Loss are designed to reconstruct coarse-volume shapes as completely as possible. Furthermore, an adversarial refiner is proposed to adaptively optimize the coarse-volume shapes and remove redundant voxels by incorporating a discriminator into the refiner. On the challenging large-scale synthetic-image dataset ShapeNet and the real-world dataset Pix3D, ARShape-Net demonstrates significant quantitative and qualitative improvements over state-of-the-art methods.
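The page does not spell out how the VoxFocal Loss is computed. As a rough illustration only, the sketch below is a minimal voxel-wise adaptation of the focal loss in PyTorch; the function name, parameters, and tensor shapes are assumptions for illustration, not details taken from the paper. The idea is to down-weight easy voxels so that hard, typically incomplete regions dominate the coarse-volume reconstruction objective; in the refinement stage such a reconstruction term would presumably be combined with the adversarial (discriminator) loss.

    import torch

    def voxfocal_loss(pred, target, gamma=2.0, alpha=0.25, eps=1e-7):
        """Hypothetical voxel-wise focal loss (names and defaults are assumptions).

        pred   : (B, D, H, W) predicted occupancy probabilities in [0, 1]
        target : (B, D, H, W) binary ground-truth voxel grid
        """
        pred = pred.clamp(eps, 1.0 - eps)
        # Probability the model assigns to the true class of each voxel.
        p_t = torch.where(target > 0.5, pred, 1.0 - pred)
        # Class-balancing weight: alpha for occupied voxels, 1 - alpha for empty ones.
        alpha_t = torch.where(target > 0.5,
                              torch.full_like(pred, alpha),
                              torch.full_like(pred, 1.0 - alpha))
        # (1 - p_t)^gamma down-weights easy voxels so hard regions drive the gradient.
        return (-alpha_t * (1.0 - p_t) ** gamma * torch.log(p_t)).mean()

    # Example usage with random tensors of matching shape:
    # pred = torch.rand(1, 32, 32, 32)
    # target = (torch.rand(1, 32, 32, 32) > 0.5).float()
    # loss = voxfocal_loss(pred, target)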

This work was supported in part by the National Natural Science Foundation of China under Grant 61762003, Grant 62162001, and Grant 61972121, in part by the CAS "Light of West China" Program, and in part by the Ningxia Excellent Talent Program.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Xu, H., Bai, J. (2021). ARShape-Net: Single-View Image Oriented 3D Shape Reconstruction with an Adversarial Refiner. In: Fang, L., Chen, Y., Zhai, G., Wang, J., Wang, R., Dong, W. (eds.) Artificial Intelligence. CICAI 2021. Lecture Notes in Computer Science, vol. 13069. Springer, Cham. https://doi.org/10.1007/978-3-030-93046-2_54

  • DOI: https://doi.org/10.1007/978-3-030-93046-2_54

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93045-5

  • Online ISBN: 978-3-030-93046-2

  • eBook Packages: Computer Science, Computer Science (R0)
