
Learn structured analysis discriminative dictionary for multi-label classification

Published in: Applied Intelligence

Abstract

Multi-label learning is a machine learning classification problem in which an example belongs to more than one class at the same time. Recently, multi-label learning has attracted a great deal of attention and has achieved notable success in text and image classification. In this paper, we propose a new method for multi-label learning, named analysis discriminative dictionary learning for multi-label classification (ADML). We first incorporate analysis discriminative dictionary learning and sparse representation into a multi-label classifier to obtain a unified model, in which an incoherence-promoting term and the reconstruction error for each label are minimized to learn the dictionary. We then add an analysis-inconsistency-promoting term to the model, which minimizes the reconstruction error of the dictionary with respect to the corresponding label of the data. Further, we learn a linear classifier that takes label relationships into account; notably, the label relationships are considered implicitly in both the analysis dictionary and the linear classifier. Finally, we conduct experiments on 15 datasets to compare the proposed ADML method with baselines. The results show that ADML delivers higher performance than previous multi-label methods.


Notes

  1. http://cse.seu.edu.cn/people/zhangml/Resources.htm#data

  2. http://languagelog.ldc.upenn.edu/nll/

  3. http://mlkd.csd.auth.gr/multilabel.html#Datasets

References

  1. Jin C, Jin S-W (2019) Multi-label automatic image annotation approach based on multiple improvement strategies. IET Image Process 13(4):623–633

  2. Wang X, Feng S, Lang C (2019) Semi-supervised dual low-rank feature mapping for multi-label image annotation. Multimed Tools Appl 78(10):13149–13168

  3. Lee J, Yu I, Park J, Kim D-W (2019) Memetic feature selection for multilabel text categorization using label frequency difference. Inform Sci 485:263–280

  4. Al-Salemi B, Ayob M, Noah SAM (2018) Feature ranking for enhancing boosting-based multi-label text categorization. Expert Syst Appl 113:531–543

  5. Chen Z, Ren J (2021) Multi-label text classification with latent word-wise label information. Appl Intell 51(2):966–979

  6. Lee J, Seo W, Park J-H, Kim D-W (2019) Compact feature subset-based multi-label music categorization for mobile devices. Multimed Tools Appl 78(4):4869–4883

  7. Ma Q, Yuan C, Zhou W, Han J, Hu S (2020) Beyond statistical relations: integrating knowledge relations into style correlations for multi-label music style classification. In: WSDM '20: the thirteenth ACM international conference on web search and data mining, Houston, TX, USA, February 3-7, 2020, pp 411–419

  8. Kostiuk B, Costa YMG, de Souza Britto A Jr, Hu X, Silla CN (2019) Multi-label emotion classification in music videos using ensembles of audio and video features. In: 31st IEEE international conference on tools with artificial intelligence (ICTAI 2019), Portland, OR, USA, November 4-6, 2019, pp 517–523

  9. Lv J, Wu T, Peng C-L, Liu Y-P, Xu N, Geng X (2020) Compact learning for multi-label classification. Pattern Recogn 113:107833

  10. Zhang M, Zhou Z (2007) ML-kNN: a lazy learning approach to multi-label learning. Pattern Recogn 40(7):2038–2048

  11. Zhang M, Zhou Z (2007) ML-kNN: a lazy learning approach to multi-label learning. Pattern Recogn 40(7):2038–2048

  12. Cheng W, Hüllermeier E (2009) Combining instance-based learning and logistic regression for multilabel classification. Mach Learn 76(2):211–225

  13. Read J, Pfahringer B, Holmes G, Frank E (2011) Classifier chains for multi-label classification. Mach Learn 85(3):333–359

  14. Teisseyre P (2021) Classifier chains for positive unlabelled multi-label learning. Knowl-Based Syst 213:106709

  15. Weng W, Wang D, Chen C-L, Wen J, Wu S (2020) Label specific features-based classifier chains for multi-label classification. IEEE Access 8:51265–51275

  16. Boutell MR, Luo J, Shen X, Brown CM (2004) Learning multi-label scene classification. Pattern Recogn 37(9):1757–1771

  17. Wu G, Tian Y, Zhang C (2018) A unified framework implementing linear binary relevance for multi-label learning. Neurocomputing 289:86–100

  18. Moral-García S, Mantas CJ, Castellano JG, Abellán J (2018) Using credal-C4.5 with binary relevance for multi-label classification. J Intell Fuzzy Syst 35(6):6501–6512

  19. Kong X, Ng MK, Zhou Z (2013) Transductive multilabel learning via label set propagation. IEEE Trans Knowl Data Eng 25(3):704–719

  20. Shan J, Hou C, Tao H, Zhuge W, Yi D (2019) Co-learning binary classifiers for LP-based multi-label classification. Cogn Syst Res 55:146–152

  21. Tsoumakas G, Vlahavas I (2007) Random k-labelsets: an ensemble method for multilabel classification. In: Machine learning: ECML 2007, 18th European conference on machine learning, Warsaw, Poland, September 17-21, 2007, pp 406–417

  22. Wu Y, Lin H (2017) Progressive random k-labelsets for cost-sensitive multi-label classification. Mach Learn 106(5):671–694

  23. Zhou T, Yang S, Wang L, Yao J, Gui G (2018) Improved cross-label suppression dictionary learning for face recognition. IEEE Access 6:48716–48725

  24. Wang Y, Liu S, Peng Y, Cao H (2018) Discriminative dictionary learning based on sample diversity for face recognition. In: 19th Pacific rim conference on multimedia, vol 2, pp 538–546

  25. Foroughi H, Shakeri M, Ray N, Zhang H (2017) Face recognition using multi-modal low-rank dictionary learning. In: International conference on image processing, pp 1082–1086

  26. Meng Y, Chang H, Luo W (2017) Discriminative analysis-synthesis dictionary learning for image classification. Neurocomputing 219:404–411

  27. Rong Y, Xiong S, Gao Y (2017) Low-rank double dictionary learning from corrupted data for robust image classification. Pattern Recogn 72:419–432

  28. Yang M, Chang H, Luo W, Yang J (2017) Fisher discrimination dictionary pair learning for image classification. Neurocomputing 269:13–20

  29. Aharon M, Elad M, Bruckstein AM (2006) K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans Signal Process 54(11):4311–4322

  30. Yang M, Liu W, Luo W, Shen L (2016) Analysis-synthesis dictionary learning for universality-particularity representation based classification. In: AAAI conference on artificial intelligence, pp 2251–2257

  31. Jing X-Y, Wu F, Li Z, Hu R, Zhang D (2016) Multi-label dictionary learning for image annotation. IEEE Trans Image Process 25(6):2712–2725

  32. Ji Z, Cui B, Li H, Jiang Y-G, Xiang T, Hospedales TM, Fu Y (2020) Deep ranking for image zero-shot multi-label classification. IEEE Trans Image Process 29:6549–6560

  33. Ma J, Zhang H, Chow TWS (2021) Multilabel classification with label-specific features and classifiers: a coarse- and fine-tuned framework. IEEE Trans Cybern 51(2):1028–1042

  34. Pereira RB, Plastino A, Zadrozny B, Merschmann LHC (2021) A lazy feature selection method for multi-label classification. Intell Data Anal 25(1):21–34

  35. Dong H, Sun J, Sun X, Ding R (2020) A many-objective feature selection for multi-label classification. Knowl-Based Syst 208:106456

  36. Almeida TB, Borges HB (2017) An adaptation of the ML-kNN algorithm to predict the number of classes in hierarchical multi-label classification. In: Modeling decisions for artificial intelligence - 14th international conference, MDAI 2017, Kitakyushu, Japan, October 18-20, 2017, pp 77–88

  37. Cheng Z, Zeng Z (2020) Joint label-specific features and label correlation for multi-label learning with missing label. Appl Intell 50(11):4029–4049

  38. Agrawal P, Whitaker RT, Elhabian SY (2020) An optimal, generative model for estimating multi-label probabilistic maps. IEEE Trans Med Imaging 39(7):2316–2326

  39. Wu G, Zheng R, Tian Y, Liu D (2020) Joint ranking SVM and binary relevance with robust low-rank learning for multi-label classification. Neural Netw 122:24–39

  40. Abdi A, Rahmati M, Ebadzadeh MM (2021) Entropy based dictionary learning for image classification, vol 110, p 107634

  41. Yang B, Guan X-P, Zhu J, Gu C, Wu K, Xu J (2021) SVMs multi-class loss feedback based discriminative dictionary learning for image classification, vol 112, p 107690

  42. Peng Y, Liu S, Wang X, Wu X (2020) Joint locality-constraint and fisher discrimination based dictionary learning for image classification. Neurocomputing 398:505–519

  43. Yang X, Jiang X, Tian C, Wang P, Zhou F, Fujita H (2020) Inverse projection group sparse representation for tumor classification: a low rank variation dictionary approach, vol 196, p 105768

  44. Luo X, Xu Y, Yang J (2019) Multi-resolution dictionary learning for face recognition. Pattern Recogn 93:283–292

  45. Lin G, Yang M, Yang J, Shen L, Xie W (2018) Robust, discriminative and comprehensive dictionary learning for face recognition. Pattern Recogn 81:341–356

  46. Ou W, Luan X, Gou J, Zhou Q, Xiao W, Xiong X, Zeng W (2018) Robust discriminative nonnegative dictionary learning for occluded face recognition. Pattern Recogn Lett 107:41–49

  47. Du H, Zhang Y, Ma L, Zhang F (2021) Structured discriminant analysis dictionary learning for pattern classification, vol 216, p 106794

  48. Wang W, Yang C, Li Q (2019) Discriminative analysis dictionary and classifier learning for pattern classification. In: 2019 IEEE international conference on image processing (ICIP), pp 385–389

  49. Song J, Xie X, Shi G, Dong W (2018) Exploiting class-wise coding coefficients: learning a discriminative dictionary for pattern classification. Neurocomputing 321:114–125

  50. Wang Q, Guo Y, Guo J, Kong X (2018) Synthesis K-SVD based analysis dictionary learning for pattern classification. Multimed Tools Appl 77(13):17023–17041

  51. Dong J, Sun C, Yang W (2015) A supervised dictionary learning and discriminative weighting model for action recognition. Neurocomputing 158:246–256

  52. Pham DS, Venkatesh S (2008) Joint learning and dictionary construction for pattern recognition. In: Computer vision and pattern recognition, pp 1–8

  53. Yang J, Yu K, Huang TS (2010) Supervised translation-invariant sparse coding. In: Computer vision and pattern recognition, pp 3517–3524

  54. Hou C, Nie F, Li X, Yi D, Wu Y (2014) Joint embedding learning and sparse regression: a framework for unsupervised feature selection. IEEE Trans Cybern 44(6):793–804

  55. Oramas S, Nieto O, Barbieri F, Serra X (2017) Multi-label music genre classification from audio, text, and images using deep features. In: Proceedings of the 18th international society for music information retrieval conference, ISMIR 2017, pp 23–30

  56. Trohidis K, Tsoumakas G, Kalliris G, Vlahavas I (2011) Multi-label classification of music by emotion. EURASIP J Audio Speech Music Process 2011:4

  57. Gorski J, Pfeuffer F, Klamroth K (2007) Biconvex sets and optimization with biconvex functions: a survey and extensions. Math Methods Oper Res 66(3):373–407

  58. Maimon O, Rokach L (eds) (2010) Data mining and knowledge discovery handbook, 2nd edn. Springer, Berlin. ISBN 978-0-387-09822-7

  59. Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30

  60. Zhang Z, Jiang W, Qin J, Zhang L, Li F, Zhang M, Yan S (2018) Jointly learning structured analysis discriminative dictionary and analysis multiclass classifier. IEEE Trans Neural Netw Learn Syst 29(8):3798–3814

  61. Gu S, Zhang L, Zuo W, Feng X (2014) Projective dictionary pair learning for pattern classification. In: Advances in neural information processing systems (NeurIPS), pp 793–801


Acknowledgment

The authors would like to thank the anonymous referees for their valuable comments and suggestions. This work was supported in part by the Natural Science Foundation of China under Grants 62076074, 61876044 and 61672169, in part by the Guangdong Basic and Applied Basic Research Foundation under Grants 2020A151010670 and 2020A151011501, and in part by the Science and Technology Planning Project of Guangzhou under Grant 202002030141.

Author information

Corresponding author

Correspondence to Bo Liu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Theorem 1

Problem (11) is solved via its Lagrange dual function, which is expressed as follows:

$$ \begin{aligned} g(\mu)=&\inf\{\| X_{l}-D_{l}S_{l} \|_{F}^{2} +\alpha \| D_{l}\overline{S_{l}}\|_{F}^{2} \\ &+ \sum\limits_{i=1}^{k} \mu_{l,i} (\| d_{l,i}\|_{2}^{2} -1)\}, \end{aligned} $$
(29)

where μl,i is the Lagrange multiplier associated with the i-th atom. A diagonal matrix El ∈ ℝk×k is constructed with diagonal entries (El)ii = μl,i; the Lagrangian can then be rewritten as follows:

$$ \begin{aligned} L(D_{l},\mu)=&\| X_{l}-D_{l}S_{l} \|_{F}^{2} +\alpha \| D_{l}\overline{S_{l}}\|_{F}^{2} \\ &+tr(D_{l}^{T}D_{l}E_{l})-tr(E_{l}). \end{aligned} $$
(30)

Setting the derivative \(\frac {\partial L(D_{l},\mu )}{\partial D_{l}} \) to zero yields the closed-form solution for Dl:

$$ \begin{aligned} D_{l}^{*}=X_{l}{S_{l}^{T}}(S_{l}{S_{l}^{T}} +\alpha\overline{S_{l}}\,\overline{S_{l}}^{T} +E_{l})^{-1}. \end{aligned} $$
(31)

Following the work in [60], El is discarded, which reduces the computational complexity and cost. Note that \(S_{l}{S_{l}^{T}}+\alpha \overline {S_{l}}\overline {S_{l}}^{T}\) is not guaranteed to be invertible, so computing its inverse may run into a singularity issue. Therefore, as in [61], a regularization term 𝜃I (with the small constant 𝜃 = 10−4) is added to \(S_{l}{S_{l}^{T}}+\alpha \overline {S_{l}}\overline {S_{l}}^{T}\), which avoids the singularity problem and yields stable performance. □
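As a sanity check, this closed-form dictionary update can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code; the names `X_l`, `S_l`, `S_bar`, and the default values of `alpha` and `theta` are assumptions made for the example:

```python
import numpy as np

def update_dictionary(X_l, S_l, S_bar, alpha=1.0, theta=1e-4):
    """Closed-form dictionary update (Eq. (31) with E_l dropped):
    D_l = X_l S_l^T (S_l S_l^T + alpha * S_bar S_bar^T + theta * I)^{-1}.
    The theta*I term guards against a singular Gram matrix, as in [61]."""
    k = S_l.shape[0]  # number of dictionary atoms
    gram = S_l @ S_l.T + alpha * (S_bar @ S_bar.T) + theta * np.eye(k)
    return X_l @ S_l.T @ np.linalg.inv(gram)
```

When the incoherence term vanishes (`S_bar` all zeros) and `theta` is negligible, this reduces to the ordinary least-squares fit of Dl to Xl ≈ DlSl, which is a quick way to test the update.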

Proof of Theorem 2

The Lagrangian of the constrained problem (17) is:

$$ \begin{aligned} \mathcal{L}(P)=& \tau \|P_{l}X_{l}-S_{l}\|_{F}^{2}+\tau\|P_{l}\overline{X_{l}}\|_{F}^{2} \\ &-{\sum}_{i=1}^{N} p_{l}\xi_{l}-{\sum}_{i=1}^{N} q_{l}\{[M_{l}\cdot P_{l}X_{l}+\delta_{l}]{Y_{l}^{T}}-1+\xi_{l}\}, \end{aligned} $$
(32)

where ql > 0 and pl > 0 are Lagrange multipliers. Setting the derivative \(\frac {\partial {\mathscr{L}}(P)}{\partial P_{l}}\) to zero gives the closed-form solution for Pl:

$$ \begin{aligned} P_{l}^{*}=[S_{l}{X_{l}^{T}}+\frac{1}{2\tau}{\sum}_{i=1}^{N} q_{l}(M_{l}X_{l}){Y_{l}^{T}}](X_{l}{X_{l}^{T}}+\overline{X_{l}}\,\overline{X_{l}}^{T}+\theta I)^{-1}, \end{aligned} $$
(33)

where 𝜃I is a regularization term with 𝜃 = 10−4. In practice, the number of samples may be smaller than the dimension of the feature space, in which case \(X_{l}{X_{l}^{T}}\) may be singular; therefore, as in [61], the regularization term 𝜃I is added to avoid this singularity problem. □
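To illustrate why the 𝜃I term matters here, the sketch below computes the projection update in a regime where the number of samples is smaller than the feature dimension, so XlXlT alone is rank-deficient. The names are hypothetical, and the dual-multiplier sum from (33) is abstracted into a precomputed `dual_term`:

```python
import numpy as np

def update_projection(S_l, X_l, X_bar, dual_term=None, theta=1e-4):
    """Closed-form projection update in the spirit of Eq. (33).
    dual_term stands in for (1/(2*tau)) * sum_i q_i (M_l X_l) Y_l^T,
    assumed precomputed; with no active multipliers it is zero."""
    d = X_l.shape[0]  # feature dimension
    if dual_term is None:
        dual_term = np.zeros((S_l.shape[0], d))
    # X_l X_l^T + X_bar X_bar^T can be rank-deficient when N < d;
    # the theta*I regularizer keeps this matrix invertible.
    gram = X_l @ X_l.T + X_bar @ X_bar.T + theta * np.eye(d)
    return (S_l @ X_l.T + dual_term) @ np.linalg.inv(gram)
```

With, say, 8 features and only 5 columns across Xl and the complementary data, the unregularized Gram matrix has rank at most 5, while the regularized one is always invertible.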

Proof of Theorem 3

The variables Ml, δl and ξl are optimized via the Lagrangian function, from which the dual form of the optimization problem in (20) is obtained. To this end, αl > 0 and ηl > 0 are introduced as Lagrange multipliers, and the objective function in (20) is rewritten as follows:

$$ \begin{aligned} \mathcal{L}(M,\xi,\eta)=& \frac{1}{2}\|M_{l}\|_{2}^{2} +C_{l}\sum\limits_{i=1}^{N}\xi_{l}-\sum\limits_{i=1}^{N}\alpha_{l}\xi_{l} \\ &-\sum\limits_{i=1}^{N} \eta_{l}\{[M_{l}\cdot P_{l}X_{l} + \delta_{l}]{Y_{l}^{T}} -1+\xi_{l}\},\\ s.t.&\ \forall \ \ \eta_{l}>0,\ \alpha_{l}>0. \end{aligned}$$
(34)

A saddle point of the Lagrangian is a minimum with respect to the primal variables Ml, δl and ξl, and a maximum with respect to the dual variables. To obtain the minimum over the primal variables, we require:

$$ \frac{\partial(\mathcal{L})}{\partial \xi_{l}}=C_{l}-\alpha_{l}-\eta_{l}=0,$$
(35)
$$ \frac{\partial(\mathcal{L})}{\partial \delta_{l}}=\sum\limits_{i=1}^{N} \eta_{l}{Y_{l}^{T}}=0.$$
(36)

Similarly, for Ml we require,

$$ \frac{\partial(\mathcal{L})}{\partial M_{l}}= M_{l}-\sum\limits_{i=1}^{N} \eta_{l}P_{l}X_{l}\cdot {Y_{l}^{T}}=0,$$
(37)

which yields:

$$ M_{l}=\sum\limits_{i=1}^{N} \eta_{l}P_{l}X_{l}\cdot {Y_{l}^{T}}.$$
(38)

By incorporating (35), (36) and (38) into (34), we have the following optimization function:

$$ \begin{aligned} \underset{\eta_{l}}{\min}\frac{1}{2}\sum\limits_{i=1}^{N} \sum\limits_{j=1}^{N}\eta_{l}\eta_{j}{Y_{l}^{T}}{Y_{j}^{T}}(P_{l}X_{l}\cdot P_{j}X_{j})-\sum\limits_{i=1}^{N} \eta_{l},\\ s.t.\sum\limits_{i=1}^{N} \eta_{l}{Y_{l}^{T}}=0,\ \ \ 0<\eta_{l}<C_{l}. \end{aligned} $$
(39)

The complementary slackness condition of the KKT conditions is:

$$ \begin{aligned} \eta_{l}^{*}({Y_{l}^{T}}(M_{l}P_{l}X_{l}+\delta_{l})-1+\xi_{l}^{*})=0. \end{aligned} $$
(40)

According to this complementary slackness condition, we obtain:

$$ \begin{aligned} &\eta_{l}^{*}=0\Rightarrow {Y_{l}^{T}}(M_{l}P_{l}X_{l}+\delta_{l})\geq 1,\\ &0<\eta_{l}^{*}<C_{l}\Rightarrow {Y_{l}^{T}}(M_{l}P_{l}X_{l}+\delta_{l})=1,\\ &\eta_{l}^{*}=C_{l}\Rightarrow {Y_{l}^{T}}(M_{l}P_{l}X_{l}+\delta_{l})\leq1. \end{aligned} $$
(41)

□
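The stationarity condition (38) and the margin cases in (41) can be checked numerically. The sketch below uses illustrative names and assumes binary labels in {−1, +1}: it recovers the weight vector as the dual-weighted sum of projected samples, and classifies each training point by its margin value, mirroring the three cases of (41):

```python
import numpy as np

def recover_weight(eta, y, PX):
    """Eq. (38): the weight is a dual-weighted sum of projected samples.
    eta: (N,) dual variables, y: (N,) labels in {-1, +1},
    PX: (N, m) matrix whose rows are projected samples P x_i."""
    return (eta * y) @ PX

def kkt_case(eta_i, margin_i, C, tol=1e-8):
    """Check one sample against the margin cases of Eq. (41),
    where margin_i = y_i * (M . P x_i + delta)."""
    if eta_i < tol:          # inactive multiplier: on or outside the margin
        return margin_i >= 1 - tol
    if eta_i > C - tol:      # at the upper bound: inside the margin
        return margin_i <= 1 + tol
    return abs(margin_i - 1) <= tol  # strictly between: exactly on the margin
```

A dual solution is consistent with (41) exactly when `kkt_case` holds for every training sample.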


Cite this article

Liu, B., Che, Z., Song, K. et al. Learn structured analysis discriminative dictionary for multi-label classification. Appl Intell 52, 3175–3192 (2022). https://doi.org/10.1007/s10489-021-02601-1
