Abstract
Sparse Representation-based Classifier (SRC) and Dictionary Learning (DL) have greatly improved the classification performance of image recognition in recent years. In video semantic analysis, the locality structure of video semantic data carries discriminative information that is essential for classification, yet current sparse representation-based approaches do not fully exploit it. Furthermore, video features from the same video category do not always yield similar coding results. To handle these issues, we propose a novel DL method, called Group Sparsity Locality-Sensitive Dictionary Learning (GSLSDL), for video semantic analysis. In GSLSDL, a discriminant loss function for the video category, based on group sparse coding of the sparse coefficients, is introduced into the Locality-Sensitive Dictionary Learning (LSDL) framework. Once the optimized dictionary has been learned, the sparse coefficients of the testing video feature samples are computed over it. Each testing sample is then assigned the video semantic category that minimizes the error between the original and reconstructed sample. Experimental results show that the proposed GSLSDL significantly improves video semantic detection performance compared with competing methods and is robust across diverse video environments.
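The final classification step described above (code a test sample over a learned dictionary, then pick the category with the smallest reconstruction error) can be illustrated with a minimal sketch. This is not the GSLSDL objective: the group sparse discriminant term is omitted, the dictionary is assumed to be already learned, and the coding step is a simplified locality-weighted ridge problem chosen for illustration. The function names `locality_code` and `classify`, and the parameters `lam` and `sigma`, are hypothetical.

```python
import numpy as np


def locality_code(x, D, lam=0.1, sigma=1.0):
    """Code x over dictionary D (one atom per column) with a locality penalty.

    Solves min_a ||x - D a||^2 + lam * ||p * a||^2 in closed form, where
    p_k = exp(dist(x, d_k) / sigma) penalizes atoms far from x.
    Illustrative stand-in only; not the objective used in the paper.
    """
    dist = np.linalg.norm(D - x[:, None], axis=0)  # distance from x to each atom
    p = np.exp(dist / sigma)                       # locality weights
    A = D.T @ D + lam * np.diag(p ** 2)            # regularized normal equations
    return np.linalg.solve(A, D.T @ x)


def classify(x, D, atom_labels, lam=0.1, sigma=1.0):
    """Return the class whose atoms reconstruct x with the smallest error."""
    a = locality_code(x, D, lam, sigma)
    classes = np.unique(atom_labels)
    residuals = []
    for c in classes:
        mask = atom_labels == c
        # Reconstruct x using only the atoms and coefficients of class c.
        residuals.append(np.linalg.norm(x - D[:, mask] @ a[mask]))
    return classes[int(np.argmin(residuals))]


# Toy dictionary: two atoms per class, assumed to have been learned already.
D = np.array([[1.0, 0.9, 0.0, 0.1],
              [0.0, 0.1, 1.0, 0.9],
              [0.0, 0.1, 0.0, 0.1]])
atom_labels = np.array([0, 0, 1, 1])
x_test = np.array([0.95, 0.05, 0.05])
print(classify(x_test, D, atom_labels))
```

The closed-form solve is used here only because it keeps the sketch short; any coding scheme that returns one coefficient per atom could be swapped in without changing the residual-based decision rule.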






Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61170126 and 61502208), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 14KJB520007), the China Postdoctoral Science Foundation (Grant No. 2015M570411), the Natural Science Foundation of Jiangsu Province of China (Grant No. BK20150522), and the Research Foundation for Talented Scholars of Jiangsu University (Grant No. 14JDG037).
Ethics declarations
Conflict of Interest
The authors declare that there are no conflicts of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Benuwa, BB., Zhan, Y., Liu, J. et al. Group sparse based locality – sensitive dictionary learning for video semantic analysis. Multimed Tools Appl 78, 6721–6744 (2019). https://doi.org/10.1007/s11042-018-6417-3
DOI: https://doi.org/10.1007/s11042-018-6417-3