Abstract
PointNet has been shown to be an efficient way to encode the global geometric features of a point cloud representation of 3D objects through supervised learning. Although it would be attractive to integrate a decoder into PointNet for unsupervised end-to-end learning, similar to conventional auto-encoders, only a few such methods have been proposed to date. These methods can reconstruct the input point cloud from its global features through a decoder; however, further improvements are needed, both in training accuracy and in generalization when decoding unseen test samples. This paper presents a Point Auto-Encoder, or Point AE, built on novel semi-convolutional and semi-fully-connected layers that address the problem of mapping a single global feature vector to a large number of 3D points. The proposed Point AE is not only simpler in architecture but also stronger in training performance and generalization capability than state-of-the-art methods. The effectiveness of Point AE is verified on the ShapeNet and ModelNet40 datasets. Furthermore, to demonstrate its extended capability, we apply Point AE to the automatic transformation of images from 2D to 3D and from 3D to 2D.
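The paper's semi-convolutional and semi-fully-connected layers are not specified in the abstract, so as a rough illustration of the kind of pipeline it describes, the following is a minimal numpy sketch of a generic point-cloud auto-encoder: a PointNet-style encoder (shared per-point MLP followed by max pooling) producing one global feature vector, a fully-connected decoder mapping that vector back to an N×3 point set, and a symmetric Chamfer distance as a typical reconstruction loss. All layer sizes, the decoder shape, and the Chamfer loss are illustrative assumptions, not the authors' actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def encode(points, W1, W2):
    """PointNet-style encoder: a shared per-point MLP, then max pooling
    over points to a single global feature vector."""
    h = relu(points @ W1)        # (N, 64) per-point features
    h = relu(h @ W2)             # (N, 128)
    return h.max(axis=0)         # (128,) order-invariant global feature

def decode(feature, W3, W4, n_points):
    """Fully-connected decoder: expand the global feature to N x 3 points.
    This is where mapping one vector to many points becomes the bottleneck
    the abstract refers to."""
    h = relu(feature @ W3)       # (256,)
    out = h @ W4                 # (n_points * 3,)
    return out.reshape(n_points, 3)

def chamfer(a, b):
    """Symmetric Chamfer distance between two point sets (a common
    reconstruction loss for point-cloud auto-encoders)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (Na, Nb)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Untrained random weights, just to exercise the forward pass.
n = 64
pts = rng.standard_normal((n, 3))
W1 = rng.standard_normal((3, 64)) * 0.1
W2 = rng.standard_normal((64, 128)) * 0.1
W3 = rng.standard_normal((128, 256)) * 0.1
W4 = rng.standard_normal((256, n * 3)) * 0.1

g = encode(pts, W1, W2)
recon = decode(g, W3, W4, n)
print(g.shape, recon.shape)
```

The max-pooling step makes the global feature invariant to the ordering of the input points, which is what allows a single fixed-size vector to summarize an entire cloud before the decoder expands it back.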
References
Qi, C.R., Su, H., Mo, K., Guibas, L.J.: Pointnet: deep learning on point sets for 3D classification and segmentation. In: CVPR (2017)
Yang, Y., Feng, C., Shen, Y., Tian, D.: Foldingnet: point cloud auto-encoder via deep grid deformation. In: CVPR (2018)
Girdhar, R., Fouhey, D.F., Rodriguez, M., Gupta, A.: Learning a predictable and generative vector representation for objects. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 484–499. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46466-4_29
Fan, H., Su, H., Guibas, L.J.: A point set generation network for 3D object reconstruction from a single image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 605–613 (2017)
Choy, C.B., Xu, D., Gwak, J., Chen, K., Savarese, S.: 3D-R2N2: a unified approach for single and multi-view 3D object reconstruction. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 628–644. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_38
Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
Liu, W., Wang, Z., Liu, X., et al.: A survey of deep neural network architectures and their applications. Neurocomputing 234, 11–26 (2017)
Vincent, P., Larochelle, H., Bengio, Y., et al.: Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th International Conference on Machine Learning, ACM, pp. 1096–1103 (2008)
Bengio, Y., et al.: Greedy layer-wise training of deep networks. In: Advances in Neural Information Processing Systems (2007)
Vincent, P., Larochelle, H., Lajoie, I., et al.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11, 3371–3408 (2010)
Kingma, D.P., Welling, M.: Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013)
Sharma, A., Grau, O., Fritz, M.: VConv-DAE: deep volumetric shape learning without object labels. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 236–250. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49409-8_20
Wu, Z., Song, S., Khosla, A., et al.: 3D shapenets: a deep representation for volumetric shapes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1912–1920 (2015)
Smith, E., Meger, D.: Improved adversarial systems for 3D object generation and reconstruction. arXiv preprint arXiv:1707.09557 (2017)
Zamorski, M., Zięba, M., Nowak, R., et al.: Adversarial autoencoders for generating 3D point clouds. arXiv preprint arXiv:1811.07605 (2018)
Shoef, M., Fogel, S., Cohen-Or, D.: PointWise: an unsupervised point-wise feature learning network. arXiv preprint arXiv:1901.04544 (2019)
Makhzani, A., Shlens, J., Jaitly, N., et al.: Adversarial autoencoders. arXiv preprint arXiv:1511.05644 (2015)
Achlioptas, P., Diamanti, O., Mitliagkas, I., et al.: Learning representations and generative models for 3D point clouds. arXiv preprint arXiv:1707.02392 (2017)
Zhu, R., Kiani Galoogahi, H., Wang, C., et al.: Rethinking reprojection: closing the loop for pose-aware shape reconstruction from a single image. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 57–65 (2017)
Durrant-Whyte, H., Bailey, T.: Simultaneous localization and mapping: Part I. IEEE Robot. Autom. Mag. 13(2), 99–110 (2006)
Montemerlo, M., Thrun, S., Koller, D., et al.: FastSLAM: a factored solution to the simultaneous localization and mapping problem. In: AAAI/IAAI, pp. 593–598 (2002)
Bailey, T., Durrant-Whyte, H.: Simultaneous localization and mapping (SLAM): Part II. IEEE Robot. Autom. Mag. 13(3), 108–117 (2006)
Woodham, R.J.: Photometric method for determining surface orientation from multiple images. Opt. Eng. 19(1), 191139 (1980)
Geiger, A., Roser, M., Urtasun, R.: Efficient large-scale stereo matching. In: Kimmel, R., Klette, R., Sugimoto, A. (eds.) ACCV 2010. LNCS, vol. 6492, pp. 25–38. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19315-6_3
Mur-Artal, R., Tardós, J.D.: ORB-SLAM2: an open-source slam system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 33(5), 1255–1262 (2017)
Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Shen, Y., Feng, C., Yang, Y., et al.: Mining point cloud local structures by kernel correlation and graph pooling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4548–4557 (2018)
Kazhdan, M., Funkhouser, T., Rusinkiewicz, S.: Rotation invariant spherical harmonic representation of 3D shape descriptors. In: Symposium on Geometry Processing, vol. 6, pp. 156–164 (2003)
Wu, J., Zhang, C., Xue, T., et al.: Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. In: Advances in Neural Information Processing Systems, pp. 82–90 (2016)
Achlioptas, P., Diamanti, O., Mitliagkas, I., et al.: Representation learning and adversarial generation of 3D point clouds. arXiv preprint arXiv:1707.02392 (2017)
Ul Islam, N., Lee, S.: Learning typical 3D representation from a single 2D correspondence using 2D-3D transformation network. In: Lee, S., Ismail, R., Choo, H. (eds.) IMCOM 2019. AISC, vol. 935, pp. 440–455. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-19063-7_35
Acknowledgement
This research was supported, in part, by the “3D Recognition Project” of Korea Evaluation Institute of Industrial Technology (KEIT) (10060160) and, in part, by the Institute of Information and Communication Technology Planning & Evaluation (IITP) grant sponsored by the Korean Ministry of Science and Information Technology (MSIT): No. 2019-0-00421, AI Graduate School Program and, in part, by “Robocarechair: A Smart Transformable Robot for Multi-Functional Assistive Personal Care” Project, KEIT P0006886, of the Korea Evaluation Institute of Industrial Technology (KEIT).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Cheng, W., Lee, S. (2019). Point Auto-Encoder and Its Application to 2D-3D Transformation. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2019. Lecture Notes in Computer Science, vol 11845. Springer, Cham. https://doi.org/10.1007/978-3-030-33723-0_6
DOI: https://doi.org/10.1007/978-3-030-33723-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33722-3
Online ISBN: 978-3-030-33723-0
eBook Packages: Computer Science, Computer Science (R0)