Representation Learning for Point Clouds with Variational Autoencoders

Molnár, Szilárd; Tamás, Levente

doi:10.1007/978-3-031-25075-0_49

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13806)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Deep generative networks provide a way to generalize complex multi-dimensional data such as 3D point clouds. In this work, we present a novel method that operates on depth images and, with the use of geometric images, learns the representation of discrete 3D points based on variational autoencoders (VAE). Traditional VAE solutions failed to capture sharply compressed 3D data; however, using the constrained variational framework with additional hyperparameters, we managed to learn the representation of 3D data successfully. To do this, we applied Bayesian optimization to the hyperparameter space of the VAE. The results were validated on a large body of public data, and the code and demos are available on the authors’ website: https://github.com/molnarszilard/GIPC_rele.
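
As a rough illustration of what a constrained variational objective with an extra hyperparameter can look like, the sketch below computes a β-weighted VAE loss in NumPy. It is not taken from the authors' code: the function names, the squared-error reconstruction term, the default `beta` value, and the toy tensor shapes are all assumptions made for this example.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), per sample."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Batch mean of reconstruction error plus a beta-weighted KL term.

    beta is the additional hyperparameter; beta = 1 gives the plain VAE
    objective. The squared-error reconstruction term is an assumption.
    """
    # Sum squared error over every non-batch axis of each image.
    recon = np.sum((x - x_recon) ** 2, axis=tuple(range(1, x.ndim)))
    kl = kl_to_standard_normal(mu, log_var)
    return float(np.mean(recon + beta * kl))


# Hypothetical batch: two 8x8 three-channel images and a 4-D latent code.
rng = np.random.default_rng(0)
x = rng.random((2, 8, 8, 3))
x_recon = x + 0.01 * rng.standard_normal(x.shape)
mu = 0.1 * rng.standard_normal((2, 4))
log_var = np.zeros((2, 4))

print(beta_vae_loss(x, x_recon, mu, log_var, beta=4.0))
```

In a setup like this, `beta` (together with settings such as the latent size or learning rate) would be one of the values a Bayesian optimizer searches over, with the validation loss as the objective to minimize.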


Acknowledgments

The authors are thankful for the support of Analog Devices GMBH Romania for the equipment list and of Nvidia for the graphics cards offered in support of this work. This work was financially supported by the Romanian National Authority for Scientific Research, project number PN-III-P2-2.1-PED-2021-3120. The authors are also thankful to KMTA (Kárpát-medencei Tehetségkutató Alapítvány) and Domus Foundation for their support.

Author information

Authors and Affiliations

Technical University of Cluj-Napoca, Cluj-Napoca, Romania
Szilárd Molnár & Levente Tamás

Authors

Szilárd Molnár
Levente Tamás

Corresponding author

Correspondence to Levente Tamás.

Editor information

Editors and Affiliations

IBM Research - MIT-IBM Watson AI Lab, Massachusetts, USA
Leonid Karlinsky
Technion – Israel Institute of Technology, Haifa, Israel
Tomer Michaeli
Kyoto University, Kyoto, Japan
Ko Nishino

About this paper

Cite this paper

Molnár, S., Tamás, L. (2023). Representation Learning for Point Clouds with Variational Autoencoders. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_49

DOI: https://doi.org/10.1007/978-3-031-25075-0_49
Published: 19 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25074-3
Online ISBN: 978-3-031-25075-0
eBook Packages: Computer Science, Computer Science (R0)
