Unsupervised learning of finite full covariance multivariate generalized Gaussian mixture models for human activity recognition

Multimedia Tools and Applications

Abstract

We propose in this paper to recognize human activities through unsupervised learning of a finite multivariate generalized Gaussian mixture model. We address an important issue in finite mixture modeling, namely the estimation of the mixture parameters when each component has a full covariance matrix. We develop a novel learning algorithm that combines a fixed-point covariance matrix estimator with the Expectation-Maximization algorithm. Furthermore, we propose an appropriate minimum message length (MML) criterion to deal with the model selection problem. We evaluate the proposed method on synthetic datasets and on a challenging application: human activity recognition from images and videos. The obtained results clearly show the merits of the proposed framework, whose full covariance matrices give it better capabilities when modeling correlated data.

Author information

Corresponding author

Correspondence to Fatma Najar.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

$$\begin{array}{@{}rcl@{}} \frac{\partial L(X|{\Theta})}{\partial \mu_{jk_{1}}} &=& \frac{-1}{2m^{\beta_{j}}} \sum\limits_{i = 1}^{N} \beta_{j} \left( \sum\limits_{k = 1}^{d} (Y_{ik}-\mu_{jk})({\Sigma}_{j}^{-1}(k,k_{1})+{\Sigma}_{j}^{-1}(k_{1},k))\right) \\ && \times ((Y_{i}-\boldsymbol{\mu}_{j})^{T}{\Sigma}_{j}^{-1} (Y_{i}-\boldsymbol{\mu}_{j}))^{\beta_{j}-1} \end{array} $$
(25)
$$\begin{array}{@{}rcl@{}} \frac{\partial^{2} L(X|{\Theta})}{\partial \mu_{jk_{1}}^{2}} &=&\frac{-1}{2m^{\beta_{j}}} \sum\limits_{i = 1}^{N} \beta_{j} \left[(-2 {\Sigma}_{j}^{-1}(k_{1},k_{1})) \left( (Y_{i}-\boldsymbol{\mu}_{j})^{T}{\Sigma}_{j}^{-1}(Y_{i}-\boldsymbol{\mu}_{j})\right)^{\beta_{j}-1} \right.\\ &&+ (\beta_{j}-1)\left( \sum\limits_{k = 1}^{d} (Y_{ik}-\mu_{jk})({\Sigma}_{j}^{-1}(k,k_{1})+{\Sigma}_{j}^{-1}(k_{1},k))\right)^{2} \\ && \left.\times ((Y_{i}-\boldsymbol{\mu}_{j})^{T}{\Sigma}_{j}^{-1} (Y_{i}-\boldsymbol{\mu}_{j}))^{\beta_{j}-2}\right] \end{array} $$
(26)
$$\begin{array}{@{}rcl@{}} \frac{\partial^{2} L(X|{\Theta})}{\partial \mu_{jk_{1}} \partial \mu_{jk_{2}}} &=&\frac{-1}{2m^{\beta_{j}}} \sum\limits_{i = 1}^{N} \beta_{j} \left[(\beta_{j}-1) \left( \sum\limits_{k = 1}^{d} (Y_{ik}-\mu_{jk})({\Sigma}_{j}^{-1}(k,k_{1})+{\Sigma}_{j}^{-1}(k_{1},k))\right)\right.\\ && \times \left( \sum\limits_{k = 1}^{d} (Y_{ik}-\mu_{jk})({\Sigma}_{j}^{-1}(k,k_{2})+{\Sigma}_{j}^{-1}(k_{2},k))\right)\\ &&((Y_{i}-\boldsymbol{\mu}_{j})^{T}{\Sigma}_{j}^{-1} (Y_{i}-\boldsymbol{\mu}_{j}))^{\beta_{j}-2}\\ && \left.- ({\Sigma}_{j}^{-1}(k_{1},k_{2})+{\Sigma}_{j}^{-1}(k_{2},k_{1})) \left( (Y_{i}-\boldsymbol{\mu}_{j})^{T}{\Sigma}_{j}^{-1}(Y_{i}-\boldsymbol{\mu}_{j})\right)^{\beta_{j}-1} \right] \end{array} $$
$$\begin{array}{@{}rcl@{}} \frac{\partial L(X|{\Theta})}{\partial \beta_{j}} &=& \left[\frac{1}{\beta_{j}} + \frac{d \psi(d/2\beta_{j})}{2 {\beta_{j}^{2}}} + \frac{d \log(2)}{2 {\beta_{j}^{2}}}\right] \\ && + \sum\limits_{i = 1}^{N} \frac{1}{2m^{\beta_{j}}}\left( (Y_{i}-\boldsymbol{\mu}_{j})^{T}{\Sigma}_{j}^{-1}(Y_{i}-\boldsymbol{\mu}_{j})\right)^{\beta_{j}}\\ & & \left[\log(m)-\log((Y_{i}-\boldsymbol{\mu}_{j})^{T}{\Sigma}_{j}^{-1}(Y_{i}-\boldsymbol{\mu}_{j}))\right] \end{array} $$
(27)
$$\begin{array}{@{}rcl@{}} \frac{\partial^{2} L(X|{\Theta})}{\partial {\beta_{j}^{2}}} &=& \left[\frac{-1}{{\beta_{j}^{2}}} - \frac{d \psi(d/2\beta_{j})}{{\beta_{j}^{3}}} - \left( \frac{d}{2{\beta_{j}^{2}}}\right)^{2} \psi^{\prime}(d/2\beta_{j})- \frac{d \log(2)}{{\beta_{j}^{3}}}\right] \\ && - \sum\limits_{i = 1}^{N} \frac{1}{2m^{\beta_{j}}}\left( (Y_{i}-\boldsymbol{\mu}_{j})^{T}{\Sigma}_{j}^{-1}(Y_{i}-\boldsymbol{\mu}_{j})\right)^{\beta_{j}}\\ & & \left[\log(m)-\log((Y_{i}-\boldsymbol{\mu}_{j})^{T}{\Sigma}_{j}^{-1}(Y_{i}-\boldsymbol{\mu}_{j}))\right]^{2} \end{array} $$
(28)
$$\begin{array}{@{}rcl@{}} d_{{\Sigma}_{j}} L(X|{\Theta})& = & -\frac{1}{2} tr({\Sigma}_{j}^{-1} d{\Sigma}_{j}) + \frac{\beta_{j}}{2 m^{\beta_{j}}}\sum\limits_{i = 1}^{N} \left( (Y_{i}-\mu_{j})^{T}{\Sigma}_{j}^{-1} d{\Sigma}_{j} {\Sigma}_{j}^{-1} (Y_{i}-\mu_{j})\right) \\ & & \left( (Y_{i}-\mu_{j})^{T}{\Sigma}_{j}^{-1}(Y_{i}-\mu_{j})\right)^{\beta_{j}-1} \end{array} $$
(29)
$$\begin{array}{@{}rcl@{}} {d}_{{\Sigma}_{j}}^{2} L(X|{\Theta}) & = & \frac{1}{2} tr({\Sigma}_{j}^{-1} d{\Sigma}_{j} {\Sigma}_{j}^{-1} d{\Sigma}_{j})\\ && - \frac{\beta_{j}(\beta_{j}-1)}{2 m^{\beta_{j}}} \sum\limits_{i = 1}^{N} \left( (Y_{i}-\mu_{j})^{T}{\Sigma}_{j}^{-1} d{\Sigma}_{j} {\Sigma}_{j}^{-1} (Y_{i}-\mu_{j})\right)^{2} \\ & & \left( (Y_{i}-\mu_{j})^{T}{\Sigma}_{j}^{-1}(Y_{i}-\mu_{j})\right)^{\beta_{j}-2} - \frac{\beta_{j}}{2 m^{\beta_{j}}} \sum\limits_{i = 1}^{N}\left( (Y_{i}-\mu_{j})^{T}{\Sigma}_{j}^{-1}(Y_{i} - \mu_{j})\right)^{\beta_{j}-1} \\ & & \left( (Y_{i}-\mu_{j})^{T}{\Sigma}_{j}^{-1} d{\Sigma}_{j} {\Sigma}_{j}^{-1} d{\Sigma}_{j} {\Sigma}_{j}^{-1}(Y_{i}-\mu_{j})\right) \end{array} $$
(30)
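
The derivatives above can be checked numerically. The sketch below is only an illustration under stated assumptions: the log-density is not written out in this appendix, so the code assumes the usual multivariate generalized Gaussian form (normalizing constant with Γ(d/2), Γ(d/2β), 2^{d/2β} and m^{d/2}), which agrees with the β-derivative in Eq. (27) for a single observation. The function names `mggd_logpdf` and `d_sigma_logpdf` are hypothetical, and mixture weights and responsibilities are left out. It checks two things: that β = 1, m = 1 recovers the Gaussian log-density, and that the first differential of Eq. (29) with N = 1 matches a central finite difference.

```python
import math
import numpy as np


def mggd_logpdf(y, mu, sigma, beta, m=1.0):
    """Log-density of one d-dimensional point (assumed MGGD form, see lead-in)."""
    d = y.shape[0]
    diff = y - mu
    u = diff @ np.linalg.solve(sigma, diff)  # (y - mu)^T Sigma^{-1} (y - mu)
    _, logdet = np.linalg.slogdet(sigma)
    log_norm = (math.lgamma(d / 2.0) + math.log(beta)
                - 0.5 * d * math.log(math.pi)
                - math.lgamma(d / (2.0 * beta))
                - d / (2.0 * beta) * math.log(2.0)
                - 0.5 * d * math.log(m)
                - 0.5 * logdet)
    return log_norm - 0.5 * (u / m) ** beta


def d_sigma_logpdf(y, mu, sigma, beta, d_sigma, m=1.0):
    """First differential along the symmetric direction d_sigma, Eq. (29), N = 1."""
    diff = y - mu
    w = np.linalg.solve(sigma, diff)  # Sigma^{-1} (y - mu)
    u = diff @ w
    trace_term = -0.5 * np.trace(np.linalg.solve(sigma, d_sigma))
    quad_term = beta / (2.0 * m ** beta) * (w @ d_sigma @ w) * u ** (beta - 1.0)
    return trace_term + quad_term


# Illustrative values only (not taken from the article).
y = np.array([0.3, -1.2, 0.7])
mu = np.array([0.0, 0.5, -0.1])
sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, -0.2],
                  [0.1, -0.2, 1.5]])

# Check 1: beta = 1 and m = 1 should give the Gaussian log-density.
diff = y - mu
gauss = -0.5 * (3 * math.log(2.0 * math.pi)
                + np.linalg.slogdet(sigma)[1]
                + diff @ np.linalg.solve(sigma, diff))
gauss_gap = abs(mggd_logpdf(y, mu, sigma, 1.0) - gauss)

# Check 2: Eq. (29) against a central finite difference in a symmetric direction.
beta, m = 0.7, 2.0
direction = np.array([[0.2, 0.1, 0.0],
                      [0.1, -0.3, 0.05],
                      [0.0, 0.05, 0.4]])
eps = 1e-6
numeric = (mggd_logpdf(y, mu, sigma + eps * direction, beta, m)
           - mggd_logpdf(y, mu, sigma - eps * direction, beta, m)) / (2.0 * eps)
analytic = d_sigma_logpdf(y, mu, sigma, beta, direction, m)
fd_gap = abs(numeric - analytic)

print(gauss_gap, fd_gap)
```

Both gaps should be close to zero. The same finite-difference pattern could be applied to the other derivatives, provided the sum over observations and the component responsibilities are handled consistently with the full mixture likelihood.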

We need to express the Fisher information matrix in terms of the differentials \(d{\Sigma}_{r,s}\), where \({\Sigma}_{r,s}\) is the (r, s)-th non-redundant element of Σ (i.e. each symmetric pair \({\Sigma}_{r,s} = {\Sigma}_{s,r}\) is counted only once). For all r and s, we introduce the matrix \(E_{(r,s)}\) defined by

$$E_{(r,s)} = \begin{cases} \bar{E}_{(r,r)}, & r=s,\\ \bar{E}_{(r,s)}+\bar{E}_{(s,r)}, & r\ne s, \end{cases} $$

where \(\bar{E}_{(r,s)}\) denotes the d × d matrix whose (r, s)-th entry is 1 and whose other entries are 0.
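
As a small illustration of this definition, the sketch below builds the matrices with NumPy and checks that a symmetric matrix is recovered from its non-redundant entries. The helper names `e_bar` and `e_sym` are hypothetical, indices are 0-based, and the choice of the upper triangle (r ≤ s) as the non-redundant set is an assumption made for the example; the expansion itself is a standard consequence of symmetry rather than a formula quoted from the text.

```python
import numpy as np


def e_bar(r, s, d):
    """d x d matrix with a 1 at entry (r, s) and 0 elsewhere (0-based indices)."""
    out = np.zeros((d, d))
    out[r, s] = 1.0
    return out


def e_sym(r, s, d):
    """Symmetric basis matrix E_(r,s) following the case definition above."""
    if r == s:
        return e_bar(r, r, d)
    return e_bar(r, s, d) + e_bar(s, r, d)


# Illustrative symmetric matrix (values are not taken from the article).
sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, -0.2],
                  [0.1, -0.2, 1.5]])
d = sigma.shape[0]

# Sum over the non-redundant entries only; the upper triangle is assumed here.
rebuilt = sum(sigma[r, s] * e_sym(r, s, d)
              for r in range(d) for s in range(r, d))

# Number of free parameters in a full symmetric d x d matrix.
n_free = d * (d + 1) // 2

print(np.allclose(rebuilt, sigma), n_free)
```

Each off-diagonal parameter moves two entries of Σ at once, which is why the basis matrix for r ≠ s carries two ones rather than one.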

Cite this article

Najar, F., Bourouis, S., Bouguila, N. et al. Unsupervised learning of finite full covariance multivariate generalized Gaussian mixture models for human activity recognition. Multimed Tools Appl 78, 18669–18691 (2019). https://doi.org/10.1007/s11042-018-7116-9

  • DOI: https://doi.org/10.1007/s11042-018-7116-9
