Abstract
Deep generative networks provide a way to generalize complex multi-dimensional data such as 3D point clouds. In this work, we present a novel method that operates on depth images and, with the use of geometric images, learns a representation of discrete 3D points based on variational autoencoders (VAEs). Traditional VAE solutions failed to capture sharply compressed 3D data; however, using the constrained variational framework with additional hyperparameters, we managed to learn the representation of 3D data successfully. To do this, we applied Bayesian optimization over the hyperparameter space of the VAE. The results were validated on large-scale public data, and the code and demos are available on the authors’ website: https://github.com/molnarszilard/GIPC_rele.
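To make the constrained variational objective concrete, the following is a minimal sketch, assuming the constraint takes the common form of a weight `beta` on the KL term of a VAE with a diagonal Gaussian posterior and a standard normal prior. The function names, the squared-error reconstruction term, and the default `beta` value are illustrative assumptions, not the authors' implementation.

```python
import math


def kl_diag_gaussian(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over latent dims."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def beta_vae_loss(x, x_hat, mu, log_var, beta=4.0):
    """Squared-error reconstruction term plus a beta-weighted KL term.

    beta is a hypothetical hyperparameter name; beta = 1 gives the plain VAE objective.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + beta * kl_diag_gaussian(mu, log_var)


# Illustrative call on toy values (not data from the paper).
loss = beta_vae_loss(x=[1.0], x_hat=[0.0], mu=[1.0], log_var=[0.0], beta=2.0)
```

In a setup like this, `beta` is one of the hyperparameters that a Bayesian optimization loop could search over, with the validation loss as the objective to minimize.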
Acknowledgments
The authors are thankful for the support of Analog Devices GMBH Romania for the equipment list and of Nvidia for the graphics cards offered as support to this work. This work was financially supported by the Romanian National Authority for Scientific Research, project number PN-III-P2-2.1-PED-2021-3120. The authors are also thankful to KMTA (Kárpát-medencei Tehetségkutató Alapítvány) and Domus Foundation for their support.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Molnár, S., Tamás, L. (2023). Representation Learning for Point Clouds with Variational Autoencoders. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_49
DOI: https://doi.org/10.1007/978-3-031-25075-0_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25074-3
Online ISBN: 978-3-031-25075-0
eBook Packages: Computer Science, Computer Science (R0)