Subspace-level dictionary fusion for robust multimedia classification

Zhou, Jianhang; Zeng, Shaoning; Zhang, Bob

doi:10.1007/s11042-021-10661-1

Published: 21 March 2021

Multimedia Tools and Applications, volume 80, pages 21885–21898 (2021)

Abstract

Dictionary learning has become an important tool in many classification tasks, especially for images. The atoms in a dictionary are trained specifically to reconstruct the test sample. In classification, the atoms associated with each class span a subspace, and the test sample is labeled according to its distance to each subspace. However, it is hard to fix a number of atoms that gives the optimal result in every scenario, since the optimal subspaces differ from one scenario to another. To improve both classification performance and robustness, we propose subspace-level dictionary fusion (SLDF) to construct a dictionary-based classifier. A full-size dictionary and a locality-constrained dictionary are constructed in parallel. The reconstruction coefficients of the test sample over the two dictionaries are then obtained, which yields a pair of distances between the test sample and each subspace. Finally, a decision is made according to the pair-wise fusion of these distances. Experimental results on multimedia datasets from distinct categories (image, text, and audio) show that the proposed method outperforms other state-of-the-art dictionary-based classification methods, with accuracies of 99.74% (image), 83.96% (text), and 87.07% (audio).
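
The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how either dictionary is learned, how the coefficients are computed, or which fusion rule is used. The sketch assumes that labeled training samples serve directly as atoms, that the locality-constrained dictionary consists of the k atoms nearest to the test sample, that coefficients come from ridge-regularized least squares, and that the two distances are fused by an element-wise product. All function names and the parameters `k` and `lam` are hypothetical.

```python
import numpy as np


def ridge_code(D, y, lam=1e-2):
    """Closed-form solution of min_x ||y - D x||^2 + lam * ||x||^2."""
    gram = D.T @ D + lam * np.eye(D.shape[1])
    return np.linalg.solve(gram, D.T @ y)


def class_residuals(D, labels, x, y, classes):
    """Distance from y to each class subspace, using only that class's atoms.

    A class with no atoms in D reconstructs y as zero, so its distance is ||y||.
    """
    res = np.full(len(classes), np.linalg.norm(y))
    for i, c in enumerate(classes):
        mask = labels == c
        if mask.any():
            res[i] = np.linalg.norm(y - D[:, mask] @ x[mask])
    return res


def sldf_predict(D, labels, y, k=10, lam=1e-2):
    """Label y by fusing distances from a full and a local dictionary.

    D      : (features, atoms) array, one atom per column (assumption:
             atoms are training samples, not learned atoms)
    labels : (atoms,) array of class labels, one per atom
    """
    classes = np.unique(labels)

    # Branch 1: code y over the full-size dictionary.
    x_full = ridge_code(D, y, lam)
    r_full = class_residuals(D, labels, x_full, y, classes)

    # Branch 2: code y over the k atoms closest to it (assumed locality rule).
    nearest = np.argsort(np.linalg.norm(D - y[:, None], axis=0))[:k]
    D_loc, labels_loc = D[:, nearest], labels[nearest]
    x_loc = ridge_code(D_loc, y, lam)
    r_loc = class_residuals(D_loc, labels_loc, x_loc, y, classes)

    # Pair-wise fusion of the two distances per class (assumed: product).
    fused = r_full * r_loc
    return classes[np.argmin(fused)]


if __name__ == "__main__":
    # Toy data: two classes with two atoms each in a 3-D feature space.
    D = np.array([[1.0, 0.9, 0.0, 0.0],
                  [0.0, 0.1, 0.0, 0.1],
                  [0.0, 0.0, 1.0, 0.9]])
    labels = np.array([0, 0, 1, 1])
    print(sldf_predict(D, labels, np.array([1.0, 0.05, 0.0]), k=2))
```

The product is only one plausible fusion rule; a weighted sum of the two distance vectors would fit the same structure, with the weight as an additional parameter to tune.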

Acknowledgements

This work was supported by the University of Macau (File no. MYRG2018-00053-FST).

Author information

Authors and Affiliations

PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macau, China
Jianhang Zhou, Shaoning Zeng & Bob Zhang

Corresponding author

Correspondence to Bob Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhou, J., Zeng, S. & Zhang, B. Subspace-level dictionary fusion for robust multimedia classification. Multimed Tools Appl 80, 21885–21898 (2021). https://doi.org/10.1007/s11042-021-10661-1

Received: 03 June 2020
Revised: 09 November 2020
Accepted: 04 February 2021
Published: 21 March 2021
Issue Date: June 2021
DOI: https://doi.org/10.1007/s11042-021-10661-1
