Video-based social behavior recognition based on kernel relevance analysis

  • Original Article
The Visual Computer

Abstract

This paper presents a kernel-based relevance analysis of video data to support social behavior recognition. Our approach, termed KRAV, has two stages: (i) a feature ranking based on centered kernel alignment (CKA) that matches social semantic features with the output labels (individual and group behaviors), where the conventional CKA is extended to mitigate the imbalance effect of unusual human behaviors; and (ii) a classification stage that performs the behavior prediction. For concrete testing, KRAV is assessed on the Israel Institute of Technology social behavior database under a tenfold cross-validation scheme. For the individual recognition task, the proposed approach attains an \(F_1\) measure of 0.5925 using 50 relevant features; for the group recognition task, it attains an \(F_1\) measure of 0.8094 using 12 relevant features. In both cases, it outperforms state-of-the-art results in classification performance and in the number of features employed. Moreover, the set of selected features could assist further social behavior analysis concerning the recognition of individual profiles and group behaviors.
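
To make the feature-ranking idea in stage (i) concrete, the following is a minimal sketch of ranking features by conventional CKA against the labels. It does not reproduce the imbalance-aware extension mentioned in the abstract, whose details are not given here. The per-feature Gaussian kernel, the median-distance bandwidth, the delta kernel on labels, and all function names are assumptions made for illustration only.

```python
import numpy as np

def centered(K):
    """Center a kernel matrix as H K H, with H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def cka(K, L):
    """Conventional centered kernel alignment between two kernel matrices."""
    Kc, Lc = centered(K), centered(L)
    denom = np.linalg.norm(Kc) * np.linalg.norm(Lc)  # Frobenius norms
    return float(np.sum(Kc * Lc) / denom) if denom > 0 else 0.0

def gaussian_kernel(x, sigma=None):
    """Gaussian kernel on a single feature column.
    Assumption: the bandwidth defaults to the median pairwise distance."""
    d = np.abs(x[:, None] - x[None, :])
    if sigma is None:
        pos = d[d > 0]
        sigma = np.median(pos) if pos.size else 1.0
    return np.exp(-d**2 / (2 * sigma**2))

def rank_features(X, y):
    """Score each feature by its CKA with a delta kernel on the labels
    and return feature indices sorted from most to least relevant."""
    L = (y[:, None] == y[None, :]).astype(float)  # 1 if same label, else 0
    scores = np.array([cka(gaussian_kernel(X[:, j]), L)
                       for j in range(X.shape[1])])
    return np.argsort(scores)[::-1], scores

# Synthetic illustration (not data from the paper): only feature 0
# carries label information, so it should be ranked first.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 3))
X[:, 0] += 3.0 * y
order, scores = rank_features(X, y)
```

In a pipeline like the one described, the top-ranked features (for example, the first 50 or 12 entries of `order`) would then be passed to the classifier of stage (ii). Scoring features one at a time is a simplification chosen for brevity.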

Notes

  1. Performed by the INESC TEC, Group of Portugal and supervised by the lab of social-psychology of the University of Porto—http://sigarra.up.pt/fpceup.

Acknowledgements

This work was supported by grants from the Colciencias project “Caracterización morfológica de estructuras cerebrales por técnicas de imagen para el tratamiento mediante implantación quirúrgica de neuroestimuladores en la enfermedad de Parkinson” (CV 110180763808, Colciencias). J. Fernández-Ramírez is partially funded by the project “Sistema de clasificación de videos basado en técnicas de representación utilizando métodos núcleo e inferencia bayesiana” (CV E6-19-2, VIIE-UTP) and by the Colciencias program Jóvenes investigadores e innovadores, Convocatoria 812 de 2018.

Author information

Corresponding author

Correspondence to J. Fernández-Ramírez.

About this article

Cite this article

Fernández-Ramírez, J., Álvarez-Meza, A., Pereira, E.M. et al. Video-based social behavior recognition based on kernel relevance analysis. Vis Comput 36, 1535–1547 (2020). https://doi.org/10.1007/s00371-019-01754-y

  • DOI: https://doi.org/10.1007/s00371-019-01754-y
