Point Auto-Encoder and Its Application to 2D-3D Transformation

  • Conference paper
  • First Online:
Advances in Visual Computing (ISVC 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11845)


Abstract

PointNet has been shown to be an efficient way to encode the global geometric features of a point cloud representation of 3D objects based on supervised learning. Although it would be quite interesting to integrate a decoder into PointNet for unsupervised end-to-end learning, similar to conventional auto-encoders, only a few such methods have been proposed to date. These methods are able to reconstruct a given input point cloud from the corresponding global features through decoders; however, further improvements are needed not only in training accuracy but also in generalization when decoding unseen test samples. This paper presents a Point Auto-Encoder, or Point AE, built on the proposed semi-convolutional and semi-fully-connected layers, which handle the problem of mapping a single global feature vector to a large number of 3D points. The proposed Point AE is not only simpler in its architecture but also stronger in training performance and generalization capability than state-of-the-art methods. The effectiveness of Point AE is verified on the ShapeNet and ModelNet40 datasets. Furthermore, to demonstrate its extended capability, we apply Point AE to automatic 2D-to-3D and 3D-to-2D transformation of images.
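
To make the overall idea concrete, below is a minimal sketch, not the paper's implementation, of a point-cloud auto-encoder in this spirit: a PointNet-style encoder that max-pools per-point features into a single global feature vector, and a decoder that maps that vector back to an N x 3 point set. The paper's semi-convolutional and semi-fully-connected decoder layers are not reproduced here; a plain fully-connected decoder and a Chamfer-distance reconstruction loss stand in for them, and the names PointAESketch and chamfer_distance are illustrative.

```python
# Hypothetical sketch of a point-cloud auto-encoder (PyTorch). The encoder follows
# the PointNet pattern (shared per-point MLP + symmetric max pooling); the decoder
# is a generic fully-connected stand-in for the paper's proposed layers.
import torch
import torch.nn as nn

class PointAESketch(nn.Module):
    def __init__(self, num_points=2048, feat_dim=1024):
        super().__init__()
        self.num_points = num_points
        # Encoder: per-point shared MLP implemented as 1x1 convolutions.
        self.encoder = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, feat_dim, 1),
        )
        # Decoder: maps the single global feature vector to N*3 coordinates.
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 2048), nn.ReLU(),
            nn.Linear(2048, num_points * 3),
        )

    def forward(self, pts):                       # pts: (B, N, 3)
        x = self.encoder(pts.transpose(1, 2))     # (B, feat_dim, N)
        global_feat = torch.max(x, dim=2).values  # max pooling -> (B, feat_dim)
        recon = self.decoder(global_feat)         # (B, N*3)
        return recon.view(-1, self.num_points, 3)

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a, b of shape (B, N, 3)."""
    d = torch.cdist(a, b)                         # (B, N, N) pairwise distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()

# Usage: reconstruct a random batch and compute the reconstruction loss.
if __name__ == "__main__":
    model = PointAESketch()
    cloud = torch.rand(4, 2048, 3)
    loss = chamfer_distance(model(cloud), cloud)
    loss.backward()
```

The max pooling makes the global feature invariant to the ordering of the input points, and the Chamfer distance compares the reconstructed and input point sets without requiring point-to-point correspondence, which is why this pairing is a common baseline for point-cloud auto-encoders.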



Acknowledgement

This research was supported, in part, by the “3D Recognition Project” of the Korea Evaluation Institute of Industrial Technology (KEIT) (10060160); in part, by the Institute of Information and Communication Technology Planning & Evaluation (IITP) grant sponsored by the Korean Ministry of Science and Information Technology (MSIT) (No. 2019-0-00421, AI Graduate School Program); and, in part, by the “Robocarechair: A Smart Transformable Robot for Multi-Functional Assistive Personal Care” project (KEIT P0006886).

Author information

Corresponding author

Correspondence to Sukhan Lee.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Cheng, W., Lee, S. (2019). Point Auto-Encoder and Its Application to 2D-3D Transformation. In: Bebis, G., et al. (eds.) Advances in Visual Computing. ISVC 2019. Lecture Notes in Computer Science, vol. 11845. Springer, Cham. https://doi.org/10.1007/978-3-030-33723-0_6

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-33723-0_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33722-3

  • Online ISBN: 978-3-030-33723-0

  • eBook Packages: Computer Science, Computer Science (R0)
