Abstract
Complex action recognition has significant academic research value, potential commercial value and broad market prospects. To improve its performance, a local-weighted nonnegative matrix factorization with rank regularization constraint (LWNMF_RC) is first presented, which removes the complex background and obtains motion-salient regions. Second, a dual-manifold regularized nonnegative matrix factorization with sparsity constraint (DMNMF_SC) is proposed, which not only considers the short-term and middle-term temporal dependencies implied in the data manifold but also mines the geometric structure hidden in the feature manifold; the sparsity constraint additionally makes the features more discriminative. Third, a deep DMNMF_SC method is constructed to acquire more hierarchical and discriminative features. Finally, a long-term temporal memory model with probability transfer learning (PTL-LTM) is proposed, which accurately memorizes the long-term temporal dependency among multiple simple action segments and, at the same time, exploits the probability features of richly labeled simple actions, transferring the knowledge learned from simple actions to complex action recognition. Consequently, the recognition performance is effectively improved.
Acknowledgements
This work was supported in part by the Science and Technology Overall Innovation Project of Shaanxi Province (Grant 2013KTZB03-03-03), the Shaanxi Province Key Project of Research and Development Plan (S2018-YF-ZDGY-0187) and the International Cooperation Project of Shaanxi Province (S2018-YF-GHMS-0061).
Ethics declarations
Conflict of interest
All the authors of the manuscript declared that there are no potential conflicts of interest.
Human and animal rights
All the authors of the manuscript declared that there is no research involving human participants and/or animals.
Informed consent
All the authors of the manuscript declared that there is no material that required informed consent.
Appendices
Appendix 1: Proof of Theorem 1
To prove Theorem 1, it must be shown that the objective function in Eq. (3) is non-increasing under the update rules in Eqs. (15) and (16). The objective function is first proved to be non-increasing under the update rule in Eq. (15) and then under the update rule in Eq. (16). The proof uses an auxiliary function of the same kind as that employed in the expectation maximization (EM) algorithm.
Definition 1
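If the conditions \( G\left( {x,x^{\left( t \right)} } \right) \ge F\left( x \right) \) and \( G\left( {x,x} \right) = F\left( x \right) \) are satisfied, then \( G\left( {x,x^{\left( t \right)} } \right) \) is an auxiliary function of \( F\left( x \right) \).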
Lemma 1
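If \( G\left( {x,x^{\left( t \right)} } \right) \) is an auxiliary function of \( F\left( x \right) \), then \( F\left( x \right) \) is non-increasing under the following update rule:
\( x^{{\left( {t + 1} \right)}} = \arg \min_{x} G\left( {x,x^{\left( t \right)} } \right). \quad (51) \)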
Proof
\( F\left( {x^{{\left( {t + 1} \right)}} } \right) \le G\left( {x^{{\left( {t + 1} \right)}} ,x^{\left( t \right)} } \right) \le G\left( {x^{\left( t \right)} ,x^{\left( t \right)} } \right) = F\left( {x^{\left( t \right)} } \right). \)
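The chain of inequalities in Lemma 1 can be checked numerically on a toy problem. The sketch below is not taken from the paper: it assumes a scalar function F (softplus minus a linear term) whose second derivative is bounded by 1/4, and uses the quadratic upper bound with curvature K = 0.25 as a hypothetical auxiliary function G. Minimizing G exactly at each step should then never increase F.

```python
import math

K = 0.25  # assumed upper bound on F'' for the toy function below


def F(x):
    # Toy objective (illustrative only): softplus(x) - 0.3 * x.
    return math.log1p(math.exp(x)) - 0.3 * x


def dF(x):
    # Derivative of F: sigmoid(x) - 0.3.
    return 1.0 / (1.0 + math.exp(-x)) - 0.3


def G(x, xt):
    # Quadratic auxiliary function: G(x, x) = F(x) and G(x, xt) >= F(x)
    # because the curvature K dominates F'' everywhere.
    return F(xt) + dF(xt) * (x - xt) + 0.5 * K * (x - xt) ** 2


def mm_step(xt):
    # Closed-form minimizer of G(., xt), i.e. the update of Lemma 1.
    return xt - dF(xt) / K


def run_mm(x0, steps=25):
    xs = [x0]
    for _ in range(steps):
        xs.append(mm_step(xs[-1]))
    return xs


xs = run_mm(3.0)
values = [F(x) for x in xs]
for t in range(5):
    print(t, round(xs[t], 6), round(values[t], 6))
```

The printed objective values decrease from one iteration to the next, which is the behaviour the lemma guarantees for any valid auxiliary function.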
Next, it is shown that, with a proper auxiliary function, the update rule for W derived from Eq. (3) is exactly the update rule in Eq. (15).
For any element \( w_{jk} \) in W, let \( F_{{w_{jk} }} \) denote the part of Eq. (3) that depends only on \( w_{jk} \). It is easy to check that:
Since the update rule is essentially element-wise, it is sufficient to prove that each \( F_{{w_{jk} }} \) is non-increasing under the update rule in Eq. (15).
Lemma 2
Function (54) is an auxiliary function of \( F_{{w_{jk} }} \), the part of \( O_{\mathrm{LWNMF\_RC}} \) that depends only on \( w_{jk} \).
Proof
Since \( G\left( {w,w} \right) = F_{{w_{jk} }} \left( w \right) \) holds trivially, it remains only to prove that \( G\left( {w,w_{jk}^{\left( t \right)} } \right) \ge F_{{w_{jk} }} \left( w \right) \). To this end, the Taylor series expansion of \( F_{{w_{jk} }} \left( w \right) \) is compared with Eq. (54):
and it can be found that: \( G\left( {w,w_{jk}^{\left( t \right)} } \right) \ge F_{{w_{jk} }} \left( w \right) \) is equivalent to
Meanwhile, the following equations hold:
Therefore, Eq. (56) holds, from which \( G\left( {w,w_{jk}^{\left( t \right)} } \right) \ge F_{{w_{jk} }} \left( w \right) \) holds.
Now, the objective function of Theorem 1 can be demonstrated to be non-increasing under the update rule in Eq. (15).
Proof
Substitute \( G\left( {w,w_{jk}^{\left( t \right)} } \right) \) in Eq. (54) into Eq. (51), and the following update rule is obtained:
Since Eq. (54) is an auxiliary function, \( F_{w_{jk}}\) is non-increasing under this update rule.
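As a generic illustration of why substituting a quadratic auxiliary function into Eq. (51) yields a multiplicative rule, the following sketch assumes that G has the usual second-order form with a curvature term K, and that the derivative splits as \( F^{\prime}_{{w_{jk} }} = P - N \) with \( P > 0 \) and \( N \ge 0 \). The symbols K, P and N are introduced here for illustration only; the specific terms of Eq. (54) are not reproduced.

\( G\left( {w,w_{jk}^{\left( t \right)} } \right) = F_{{w_{jk} }} \left( {w_{jk}^{\left( t \right)} } \right) + F^{\prime}_{{w_{jk} }} \left( {w_{jk}^{\left( t \right)} } \right)\left( {w - w_{jk}^{\left( t \right)} } \right) + \frac{K}{2}\left( {w - w_{jk}^{\left( t \right)} } \right)^{2} \)

\( \frac{\partial G}{\partial w} = 0 \;\Rightarrow\; w_{jk}^{{\left( {t + 1} \right)}} = w_{jk}^{\left( t \right)} - \frac{{F^{\prime}_{{w_{jk} }} \left( {w_{jk}^{\left( t \right)} } \right)}}{K} \)

\( K = \frac{P}{{w_{jk}^{\left( t \right)} }} \;\Rightarrow\; w_{jk}^{{\left( {t + 1} \right)}} = w_{jk}^{\left( t \right)} - w_{jk}^{\left( t \right)} \frac{P - N}{P} = w_{jk}^{\left( t \right)} \frac{N}{P} \)

Under these assumptions the additive step collapses into a ratio of nonnegative terms, so a nonnegative \( w_{jk}^{\left( t \right)} \) stays nonnegative after the update. For plain NMF with the objective \( \left\| {X - WV} \right\|_{F}^{2} \) (an assumed orientation), \( P = 2\left( {WVV^{T} } \right)_{jk} \) and \( N = 2\left( {XV^{T} } \right)_{jk} \), which gives the classical multiplicative rule of Lee and Seung.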
Subsequently, the objective function is validated to be non-increasing under the update rule in Eq. (16).
For any element \( v_{jk} \) in V, let \( F_{{v_{jk} }} \) denote the part of Eq. (3) that depends only on \( v_{jk} \). It is easy to check that:
where I is an identity matrix.
Lemma 3
Function (62) is an auxiliary function of \( F_{{v_{jk} }} \), the part of \( O_{\mathrm{LWNMF\_RC}} \) that depends only on \( v_{jk} \).
Proof
Since \( G\left( {v,v} \right) = F_{{v_{jk} }} \left( v \right) \) holds trivially, it remains only to show that \( G\left( {v,v_{jk}^{\left( t \right)} } \right) \ge F_{{v_{jk} }} \left( v \right) \). To this end, the Taylor series expansion of \( F_{{v_{jk} }} \left( v \right) \) is compared with Eq. (62):
and it can be found that: \( G\left( {v,v_{jk}^{\left( t \right)} } \right) \ge F_{{v_{jk} }} \left( v \right) \) is equivalent to
Meanwhile, the following equations hold:
Thus, Eq. (64) holds and \( G\left( {v,v_{jk}^{\left( t \right)} } \right) \ge F_{{v_{jk} }} \left( v \right) \).
Now, it can also be demonstrated that the objective function of Theorem 1 is non-increasing under the update rule in Eq. (16).
Proof
Substitute \( G\left( {v,v_{jk}^{\left( t \right)} } \right) \) in Eq. (62) into Eq. (51), and the following update rule can be obtained:
Since Eq. (62) is an auxiliary function, \( F_{{v_{jk} }} \) is non-increasing under this update rule. Therefore, Theorem 1 holds.
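The monotonicity established in Appendix 1 can be observed empirically on a simplified problem. The sketch below is an assumption-laden stand-in: it implements plain NMF with the classical multiplicative updates of Lee and Seung for \( \left\| {X - WV} \right\|_{F}^{2} \), not LWNMF_RC itself, since the local weighting and rank regularization terms of Eq. (3) are not reproduced here. The helper names (`matmul`, `multiplicative`, `nmf`) and the random test matrix are hypothetical.

```python
import random


def matmul(A, B):
    # Dense matrix product for lists of lists.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frob_error(X, W, V):
    # Squared Frobenius norm of X - W V.
    R = matmul(W, V)
    return sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr))


def multiplicative(M, numer, denom, eps=1e-12):
    # Element-wise M * numer / denom; eps only guards against division by zero.
    return [[m * n / (d + eps) for m, n, d in zip(mr, nr, dr)]
            for mr, nr, dr in zip(M, numer, denom)]


def nmf(X, rank, iters=50, seed=0):
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    V = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    history = [frob_error(X, W, V)]
    for _ in range(iters):
        Vt = transpose(V)
        # W <- W * (X V^T) / (W V V^T)
        W = multiplicative(W, matmul(X, Vt), matmul(W, matmul(V, Vt)))
        Wt = transpose(W)
        # V <- V * (W^T X) / (W^T W V)
        V = multiplicative(V, matmul(Wt, X), matmul(matmul(Wt, W), V))
        history.append(frob_error(X, W, V))
    return W, V, history


rng = random.Random(1)
X = [[rng.random() for _ in range(8)] for _ in range(6)]
W, V, history = nmf(X, rank=3)
print(history[0], history[-1])
```

Recording the objective after every pair of updates gives a sequence that should not increase, up to floating-point rounding, and both factors remain nonnegative because each update multiplies by a ratio of nonnegative terms.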
Appendix 2: Proof of Theorem 2
To prove Theorem 2, it must be shown that the objective function in Eq. (19) is non-increasing under the update rules in Eqs. (31) and (40). The objective function is first proved to be non-increasing under the update rule in Eq. (31) and then under the update rule in Eq. (40). The proof again uses an auxiliary function of the same kind as that used in the EM algorithm.
Using Definition 1 and Lemma 1 in Appendix 1, the objective function of Theorem 2 is first shown to be non-increasing under the update rule in Eq. (31).
For any element \( w_{jk} \) in W, let \( \tilde{J}_{{w_{jk} }} \) denote the part of Eq. (19) that depends only on \( w_{jk} \). It is easy to check that:
Lemma 4
Function (71) is an auxiliary function of \( \tilde{J}_{{w_{jk} }} \), the part of \( O_{\mathrm{DWNMF\_SC}} \) that depends only on \( w_{jk} \).
Proof
Since \( G\left( {w,w} \right) = \tilde{J}_{{w_{jk} }} \left( w \right) \) holds trivially, it remains only to show that \( G\left( {w,w_{jk}^{\left( t \right)} } \right) \ge \tilde{J}_{{w_{jk} }} \left( w \right) \). To this end, the Taylor series expansion of \( \tilde{J}_{{w_{jk} }} \left( w \right) \) is compared with Eq. (71):
and it can be found that: \( G\left( {w,w_{jk}^{\left( t \right)} } \right) \ge \tilde{J}_{{w_{jk} }} \left( w \right) \) is equivalent to
Meanwhile, the following inequalities hold:
Thus, Eq. (73) holds and \( G\left( {w,w_{jk}^{\left( t \right)} } \right) \ge \tilde{J}_{{w_{jk} }} \left( w \right) \).
Now, it can be demonstrated that the objective function of Theorem 2 is non-increasing under the update rule of Eq. (31).
Proof
Replace \( G\left( {w,w_{jk}^{\left( t \right)} } \right) \) in Eq. (51) by Eq. (71), and the following update rule is obtained:
Since Eq. (71) is an auxiliary function, \( \tilde{J}_{{w_{jk} }} \) is non-increasing under this update rule.
Subsequently, the objective function is validated to be non-increasing under the update rule in Eq. (40).
For any element \( v_{jk} \) in V, let \( \tilde{J}_{{v_{jk} }} \) denote the part of Eq. (19) that depends only on \( v_{jk} \). It is easy to check that:
Lemma 5
Function (79) is an auxiliary function of \( \tilde{J}_{{v_{jk} }} \), the part of \( O_{\mathrm{DWNMF\_SC}} \) that depends only on \( v_{jk} \).
Proof
Since \( G\left( {v,v} \right) = \tilde{J}_{{v_{jk} }} \left( v \right) \) holds trivially, it remains only to show that \( G\left( {v,v_{jk}^{\left( t \right)} } \right) \ge \tilde{J}_{{v_{jk} }} \left( v \right) \). To this end, the Taylor series expansion of \( \tilde{J}_{{v_{jk} }} \left( v \right) \) is compared with Eq. (79):
and it can be found that: \( G\left( {v,v_{jk}^{\left( t \right)} } \right) \ge \tilde{J}_{{v_{jk} }} \left( v \right) \) is equivalent to
Meanwhile, the following inequalities hold:
Thus, Eq. (81) holds and \( G\left( {v,v_{jk}^{\left( t \right)} } \right) \ge \tilde{J}_{{v_{jk} }} \left( v \right) \).
Now, it can also be demonstrated that the objective function of Theorem 2 is non-increasing under the update rule in Eq. (40).
Proof
Replace \( G\left( {v,v_{jk}^{\left( t \right)} } \right) \) in Eq. (51) by Eq. (79) and the following update rule can be obtained:
Since Eq. (79) is an auxiliary function, \( \tilde{J}_{{v_{jk} }} \) is non-increasing under the update rule in Eq. (85). Thus, Theorem 2 holds.
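A similar empirical check is possible for a manifold-regularized, sparsity-penalized factorization. The sketch below is a simplified stand-in and not the DMNMF_SC model of Eq. (19): it assumes a single graph over the samples (a chain linking consecutive frames, as a hypothetical proxy for short-term temporal dependency), an L1 penalty on V, and the objective \( \left\| {X - WV} \right\|_{F}^{2} + \lambda \,{\text{Tr}}\left( {VLV^{T} } \right) + \mu \sum v_{kj} \) with \( L = D - S \). The update for V follows the standard graph-regularized NMF rule with \( \mu /2 \) added to the denominator; all names and parameter values are illustrative.

```python
import random


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def chain_graph(n):
    # Hypothetical temporal affinity S: each frame is linked to its neighbours.
    S = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        S[i][i + 1] = S[i + 1][i] = 1.0
    D = [[sum(S[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]
    return S, D


def objective(X, W, V, S, D, lam, mu):
    R = matmul(W, V)
    fit = sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr))
    L = [[d - s for d, s in zip(dr, sr)] for dr, sr in zip(D, S)]
    VL = matmul(V, L)
    smooth = sum(a * b for ar, br in zip(VL, V) for a, b in zip(ar, br))  # Tr(V L V^T)
    sparse = sum(v for row in V for v in row)  # L1 norm of nonnegative V
    return fit + lam * smooth + mu * sparse


def ratio_update(M, numer, denom, eps=1e-12):
    return [[m * n / (d + eps) for m, n, d in zip(mr, nr, dr)]
            for mr, nr, dr in zip(M, numer, denom)]


def add(A, B, scale=1.0, shift=0.0):
    # Element-wise A + scale * B + shift.
    return [[a + scale * b + shift for a, b in zip(ar, br)] for ar, br in zip(A, B)]


def graph_sparse_nmf(X, rank, lam=0.5, mu=0.1, iters=60, seed=0):
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    S, D = chain_graph(n)
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    V = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    history = [objective(X, W, V, S, D, lam, mu)]
    for _ in range(iters):
        Vt = transpose(V)
        W = ratio_update(W, matmul(X, Vt), matmul(W, matmul(V, Vt)))
        Wt = transpose(W)
        # V <- V * (W^T X + lam V S) / (W^T W V + lam V D + mu / 2)
        numer = add(matmul(Wt, X), matmul(V, S), scale=lam)
        denom = add(matmul(matmul(Wt, W), V), matmul(V, D), scale=lam, shift=mu / 2)
        V = ratio_update(V, numer, denom)
        history.append(objective(X, W, V, S, D, lam, mu))
    return W, V, history


rng = random.Random(1)
X = [[rng.random() for _ in range(10)] for _ in range(5)]
W, V, history = graph_sparse_nmf(X, rank=3)
print(history[0], history[-1])
```

The degree term and the sparsity weight enter only the denominator, which is what shrinks entries of V toward zero while keeping them nonnegative; the affinity term in the numerator pulls the codes of neighbouring frames together.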
Tong, M., Bai, H., Yue, X. et al. PTL-LTM model for complex action recognition using local-weighted NMF and deep dual-manifold regularized NMF with sparsity constraint. Neural Comput & Applic 32, 13759–13781 (2020). https://doi.org/10.1007/s00521-020-04783-0
DOI: https://doi.org/10.1007/s00521-020-04783-0