Group sparse based locality – sensitive dictionary learning for video semantic analysis

Multimedia Tools and Applications

Abstract

Sparse Representation-based Classifier (SRC) and Dictionary Learning (DL) have greatly improved classification performance in image recognition in recent times. In video semantic analysis, the locality structure of video semantic data carries discriminative information that is essential for classification. However, current sparse representation-based approaches have not fully considered this structure. Furthermore, video features from the same video category do not yield similar coding results. To handle these issues, we propose a novel DL method, called Group Sparsity Locality-Sensitive Dictionary Learning (GSLSDL), for video semantic analysis. In the proposed GSLSDL, a discriminant loss function for the video category, based on group sparse coding of the sparse coefficients, is introduced into the structure of the Locality-Sensitive Dictionary Learning (LSDL) method. After the optimized dictionary is solved, the sparse coefficients for the testing video feature samples are obtained. The classification result for video semantics is then obtained by minimizing the error between the original and reconstructed samples. The experimental results show that the proposed GSLSDL significantly improves the performance of video semantic detection compared with the competing methods, and is robust across diverse video environments.
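The final classification step described above (code a test sample over a learned dictionary, then pick the category with the smallest reconstruction error) can be illustrated with a minimal sketch. This is not the GSLSDL algorithm: the dictionary is fixed by hand rather than learned, and a ridge-regularised least-squares code stands in for the group sparse, locality-sensitive coding step. The function name `classify_by_residual`, the parameter `lam`, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def classify_by_residual(x, D, atom_labels, lam=0.01):
    """Return (label, residuals) for sample x.

    D holds one dictionary atom per column; atom_labels[j] is the class
    of atom j. The code is a ridge-regularised least-squares solution,
    used here only as a simple stand-in for a sparse coding step.
    """
    k = D.shape[1]
    # Closed-form ridge code: a = (D^T D + lam * I)^{-1} D^T x
    a = np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ x)

    residuals = {}
    for c in np.unique(atom_labels):
        mask = atom_labels == c
        # Reconstruct x using only the atoms (and coefficients) of class c
        residuals[int(c)] = float(np.linalg.norm(x - D[:, mask] @ a[mask]))

    # The predicted class is the one with the smallest reconstruction error
    label = min(residuals, key=residuals.get)
    return label, residuals


# Toy dictionary (hypothetical): two atoms per class in a 4-dimensional space
D = np.eye(4)
atom_labels = np.array([0, 0, 1, 1])

# A sample that lies almost entirely in the span of the class-0 atoms
x = np.array([1.0, 0.5, 0.0, 0.05])

label, residuals = classify_by_residual(x, D, atom_labels)
print(label, residuals)
```

In this toy case the class-0 atoms reconstruct the sample far better than the class-1 atoms, so class 0 is selected. In the setting of the paper, the dictionary and the coefficients would instead come from the proposed learning and coding procedure.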

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61170126 and 61502208), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 14KJB520007), the China Postdoctoral Science Foundation (Grant No. 2015M570411), the Natural Science Foundation of Jiangsu Province of China (Grant No. BK20150522), and the Research Foundation for Talented Scholars of Jiangsu University (Grant No. 14JDG037).

Author information

Corresponding author

Correspondence to Ben-Bright Benuwa.

Ethics declarations

Conflict of Interest

The authors declare that there are no conflicts of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Benuwa, BB., Zhan, Y., Liu, J. et al. Group sparse based locality – sensitive dictionary learning for video semantic analysis. Multimed Tools Appl 78, 6721–6744 (2019). https://doi.org/10.1007/s11042-018-6417-3

  • DOI: https://doi.org/10.1007/s11042-018-6417-3
