Abstract
PointNet has been shown to be an efficient way to encode the global geometric features of a point cloud representation of 3D objects through supervised learning. Although it would be attractive to integrate a decoder into PointNet for unsupervised end-to-end learning, similar to conventional auto-encoders, only a few such methods have been proposed to date. These methods can reconstruct the input point cloud from its global features through a decoder; however, further improvements are needed, both in training accuracy and in generalization when decoding unseen test samples. This paper presents a Point Auto-Encoder, or Point AE, built on novel semi-convolutional and semi-fully-connected layers that address the problem of mapping a single global feature vector to a large number of 3D points. The proposed Point AE is not only simpler in architecture but also stronger in training performance and generalization capability than state-of-the-art methods. The effectiveness of Point AE is verified on the ShapeNet and ModelNet40 datasets. Furthermore, to demonstrate its extended capability, we apply Point AE to the automatic transformation of images from 2D to 3D and from 3D to 2D.
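The paper's semi-convolutional and semi-fully-connected layers are not specified in the abstract, so as a rough illustration of the kind of pipeline it describes, the following is a minimal numpy sketch of a generic point-cloud auto-encoder: a PointNet-style encoder (shared per-point MLP followed by max pooling) producing one global feature vector, a fully-connected decoder mapping that vector back to an N×3 point set, and a symmetric Chamfer distance as a typical reconstruction loss. All layer sizes, the decoder shape, and the Chamfer loss are illustrative assumptions, not the authors' actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def encode(points, W1, W2):
    """PointNet-style encoder: a shared per-point MLP, then max pooling
    over points to a single global feature vector."""
    h = relu(points @ W1)        # (N, 64) per-point features
    h = relu(h @ W2)             # (N, 128)
    return h.max(axis=0)         # (128,) order-invariant global feature

def decode(feature, W3, W4, n_points):
    """Fully-connected decoder: expand the global feature to N x 3 points.
    This is where mapping one vector to many points becomes the bottleneck
    the abstract refers to."""
    h = relu(feature @ W3)       # (256,)
    out = h @ W4                 # (n_points * 3,)
    return out.reshape(n_points, 3)

def chamfer(a, b):
    """Symmetric Chamfer distance between two point sets (a common
    reconstruction loss for point-cloud auto-encoders)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (Na, Nb)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Untrained random weights, just to exercise the forward pass.
n = 64
pts = rng.standard_normal((n, 3))
W1 = rng.standard_normal((3, 64)) * 0.1
W2 = rng.standard_normal((64, 128)) * 0.1
W3 = rng.standard_normal((128, 256)) * 0.1
W4 = rng.standard_normal((256, n * 3)) * 0.1

g = encode(pts, W1, W2)
recon = decode(g, W3, W4, n)
print(g.shape, recon.shape)
```

The max-pooling step makes the global feature invariant to the ordering of the input points, which is what allows a single fixed-size vector to summarize an entire cloud before the decoder expands it back.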
References
Qi, C.R., Su, H., Mo, K., Guibas, L.J.: Pointnet: deep learning on point sets for 3D classification and segmentation. In: CVPR (2017)
Yang, Y., Feng, C., Shen, Y., Tian, D.: Foldingnet: point cloud auto-encoder via deep grid deformation. In: CVPR (2018)
Girdhar, R., Fouhey, D.F., Rodriguez, M., Gupta, A.: Learning a predictable and generative vector representation for objects. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 484–499. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46466-4_29
Fan, H., Su, H., Guibas, L.J.: A point set generation network for 3D object reconstruction from a single image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 605–613 (2017)
Choy, C.B., Xu, D., Gwak, J., Chen, K., Savarese, S.: 3D-R2N2: a unified approach for single and multi-view 3D object reconstruction. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 628–644. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_38
Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
Liu, W., Wang, Z., Liu, X., et al.: A survey of deep neural network architectures and their applications. Neurocomputing 234, 11–26 (2017)
Vincent, P., Larochelle, H., Bengio, Y., et al.: Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th International Conference on Machine Learning, ACM, pp. 1096–1103 (2008)
Bengio, Y., et al.: Greedy layer-wise training of deep networks. In: Advances in Neural Information Processing Systems (2007)
Vincent, P., Larochelle, H., Lajoie, I., et al.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11, 3371–3408 (2010)
Kingma, D.P., Welling, M.: Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013)
Sharma, A., Grau, O., Fritz, M.: VConv-DAE: deep volumetric shape learning without object labels. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 236–250. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49409-8_20
Wu, Z., Song, S., Khosla, A., et al.: 3D shapenets: a deep representation for volumetric shapes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1912–1920 (2015)
Smith, E., Meger, D.: Improved adversarial systems for 3D object generation and reconstruction. arXiv preprint arXiv:1707.09557 (2017)
Zamorski, M., Zięba, M., Nowak, R., et al.: Adversarial autoencoders for generating 3D point clouds. arXiv preprint arXiv:1811.07605 (2018)
Shoef, M., Fogel, S., Cohen-Or, D.: PointWise: an unsupervised point-wise feature learning network. arXiv preprint arXiv:1901.04544 (2019)
Makhzani, A., Shlens, J., Jaitly, N., et al.: Adversarial autoencoders. arXiv preprint arXiv:1511.05644 (2015)
Achlioptas, P., Diamanti, O., Mitliagkas, I., et al.: Learning representations and generative models for 3D point clouds. arXiv preprint arXiv:1707.02392 (2017)
Zhu, R., Kiani Galoogahi, H., Wang, C., et al.: Rethinking reprojection: closing the loop for pose-aware shape reconstruction from a single image. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 57–65 (2017)
Durrant-Whyte, H., Bailey, T.: Simultaneous localization and mapping: Part I. IEEE Robot. Autom. Mag. 13(2), 99–110 (2006)
Montemerlo, M., Thrun, S., Koller, D., et al.: FastSLAM: a factored solution to the simultaneous localization and mapping problem. In: AAAI/IAAI, pp. 593–598 (2002)
Bailey, T., Durrant-Whyte, H.: Simultaneous localization and mapping (SLAM): Part II. IEEE Robot. Autom. Mag. 13(3), 108–117 (2006)
Woodham, R.J.: Photometric method for determining surface orientation from multiple images. Opt. Eng. 19(1), 191139 (1980)
Geiger, A., Roser, M., Urtasun, R.: Efficient large-scale stereo matching. In: Kimmel, R., Klette, R., Sugimoto, A. (eds.) ACCV 2010. LNCS, vol. 6492, pp. 25–38. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19315-6_3
Mur-Artal, R., Tardós, J.D.: ORB-SLAM2: an open-source slam system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 33(5), 1255–1262 (2017)
Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Shen, Y., Feng, C., Yang, Y., et al.: Mining point cloud local structures by kernel correlation and graph pooling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4548–4557 (2018)
Kazhdan, M., Funkhouser, T., Rusinkiewicz, S.: Rotation invariant spherical harmonic representation of 3D shape descriptors. In: Symposium on Geometry Processing, vol. 6, pp. 156–164 (2003)
Wu, J., Zhang, C., Xue, T., et al.: Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. In: Advances in Neural Information Processing Systems, pp. 82–90 (2016)
Achlioptas, P., Diamanti, O., Mitliagkas, I., et al.: Representation learning and adversarial generation of 3D point clouds. arXiv preprint arXiv:1707.02392 (2017)
Ul Islam, N., Lee, S.: Learning typical 3D representation from a single 2D correspondence using 2D-3D transformation network. In: Lee, S., Ismail, R., Choo, H. (eds.) IMCOM 2019. AISC, vol. 935, pp. 440–455. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-19063-7_35
Acknowledgement
This research was supported, in part, by the “3D Recognition Project” of Korea Evaluation Institute of Industrial Technology (KEIT) (10060160) and, in part, by the Institute of Information and Communication Technology Planning & Evaluation (IITP) grant sponsored by the Korean Ministry of Science and Information Technology (MSIT): No. 2019-0-00421, AI Graduate School Program and, in part, by “Robocarechair: A Smart Transformable Robot for Multi-Functional Assistive Personal Care” Project, KEIT P0006886, of the Korea Evaluation Institute of Industrial Technology (KEIT).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Cheng, W., Lee, S. (2019). Point Auto-Encoder and Its Application to 2D-3D Transformation. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2019. Lecture Notes in Computer Science, vol 11845. Springer, Cham. https://doi.org/10.1007/978-3-030-33723-0_6
DOI: https://doi.org/10.1007/978-3-030-33723-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33722-3
Online ISBN: 978-3-030-33723-0
eBook Packages: Computer Science, Computer Science (R0)