Abstract
Deep geometric learning methods make it possible to build solutions that exploit the geometry of the data in a range of applications. In particular, the negative curvature of hyperbolic space makes it better suited than Euclidean space to describing hierarchical structures. In this paper, we aim to define an unsupervised learning model for action recognition. The hyperbolic feature space is used to describe the hierarchical relationship between the clips that compose a complete video sequence. These clips are related to each other by means of a triplet loss function and a variational autoencoder (VAE) architecture, which establishes a similarity relationship between clips in order to identify actions in a set of unlabelled data.
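As a rough illustration of how a triplet loss can be evaluated in a hyperbolic feature space, the sketch below computes the geodesic distance on the unit Poincaré ball and applies a standard margin-based triplet loss to three clip embeddings. The abstract does not specify the hyperbolic model, the curvature, the margin, or how embeddings are produced, so the Poincaré ball with curvature -1, the margin of 1.0, and the function names `poincare_distance` and `triplet_loss` are all assumptions made for this example, not the authors' implementation.

```python
import math


def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points of the unit Poincare ball.

    Assumes curvature -1 and that both points lie strictly inside the
    unit ball (squared norm < 1). These are illustrative assumptions.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Clamp the denominator so points very close to the boundary
    # do not cause a division by zero.
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss using the hyperbolic distance.

    The loss is zero once the negative clip is at least `margin`
    farther from the anchor than the positive clip.
    The margin value is a hypothetical choice.
    """
    d_pos = poincare_distance(anchor, positive)
    d_neg = poincare_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D clip embeddings inside the unit ball.
anchor = [0.10, 0.0]
similar_clip = [0.12, 0.0]
different_clip = [0.80, 0.0]

loss_well_separated = triplet_loss(anchor, similar_clip, different_clip)
loss_violated = triplet_loss(anchor, different_clip, similar_clip)
```

In this toy setup, `loss_well_separated` is zero because the dissimilar clip is already much farther from the anchor than the similar one, while `loss_violated` is positive because the roles are swapped. In a trained model the embeddings would come from the encoder rather than being fixed by hand.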
Acknowledgment
We would like to thank the European Regional Development Fund (ERDF), “A way of making Europe”, and MCIN/AEI/10.13039/501100011033 for supporting this work under the MoDeaAS project (grant PID2019-104818RB-I00). This work has also been supported by two Spanish national grants for PhD studies, FPU17/00166 and UAFPU2019-13. Furthermore, we would like to thank Nvidia for the generous hardware donation that made these experiments possible.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Castro-Vargas, JA., Garcia-Garcia, A., Martinez-Gonzalez, P., Oprea, S., Garcia-Rodriguez, J. (2023). Unsupervised Hyperbolic Action Recognition. In: Tardioli, D., Matellán, V., Heredia, G., Silva, M.F., Marques, L. (eds) ROBOT2022: Fifth Iberian Robotics Conference. ROBOT 2022. Lecture Notes in Networks and Systems, vol 590. Springer, Cham. https://doi.org/10.1007/978-3-031-21062-4_39
DOI: https://doi.org/10.1007/978-3-031-21062-4_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21061-7
Online ISBN: 978-3-031-21062-4
eBook Packages: Engineering, Engineering (R0)