Abstract
Multi-label learning (MLL) refers to a learning task where each instance is associated with a set of labels. In most real-world applications, however, the labeling process is expensive and time-consuming. Partial multi-label learning (PML) refers to MLL in which only a subset of the candidate labels is correctly annotated and the rest are false-positive labels. The main purpose of PML is to learn from and predict unseen multi-label data at a lower annotation cost. To address the ambiguities in the label set, popular existing PML methods attempt to extract a confidence score for each candidate label. These methods mainly perform disambiguation by considering correlations among labels and/or features. However, because of the noisy labels in PML, the true correlation among labels is corrupted, and such methods can easily be misled by noisy false-positive labels. In this paper, we propose a Partial Multi-Label learning method via Constraint Clustering (PML-CC), which addresses PML based on the underlying structure of the data. PML-CC gradually extracts high-confidence labels and then uses them to extract the remaining labels. To find the high-confidence labels, it solves PML as a clustering task, treating the information extracted in previous steps as constraints. In each step, PML-CC updates the extracted labels and uses them to extract the remaining ones. Experimental results show that our method successfully tackles PML tasks and outperforms state-of-the-art methods on artificial and real-world datasets.
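To make the iterative scheme described above concrete, here is a minimal runnable sketch of one disambiguation round. It is an illustration only, not the authors' implementation: the k-nearest-neighbour agreement scorer stands in for the paper's constrained clustering, and the function name, the threshold tau, and the -1/0/1 trusted-label encoding are all assumptions introduced for this sketch.

    import numpy as np

    def disambiguation_round(X, Y_cand, trusted, tau=0.8, k=5):
        # X: (n, m) features; Y_cand: (n, L) binary candidate labels.
        # trusted: (n, L), -1 = undecided, 0/1 = decisions from earlier rounds.
        D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
        np.fill_diagonal(D, np.inf)                # exclude self-neighbours
        nn = np.argsort(D, axis=1)[:, :k]          # k nearest neighbours per instance
        Y_eff = np.where(trusted >= 0, trusted, Y_cand)  # trusted decisions override
        conf = Y_eff[nn].mean(axis=1)              # neighbour agreement as confidence
        promote = (conf >= tau) & (Y_cand == 1) & (trusted < 0)
        out = trusted.copy()
        out[promote] = 1                           # accept high-confidence labels
        return out, conf

Repeating such rounds, with newly accepted labels feeding back in as constraints, mirrors the "gradually extract high-confidence labels, then use them to extract the rest" loop that the abstract describes.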
Acknowledgment
This work has been partially supported by the Volkswagen Foundation.
Appendix
1.1 Proof of Formula
In this section, we give the details of the optimization of Eq. (1). The goal of this optimization is to find the optimal values for the cluster centers (\([Z]_{k\times m\times L}\)) and the fuzzy memberships (\([U]_{k\times m \times L}\)), where k is the number of classes for each label, m is the size of the feature space, and L is the number of labels. Since each label is binary, we set \(k=2\). To make the optimization procedure easier to follow, we describe it for a single label; thus we consider the cluster centers (\([Z]_{2\times m}\)) and the fuzzy memberships (\([U]_{2\times m}\)) for one label only, and repeat the procedure for the remaining labels. Eq. (1) does not have a closed-form solution, so an alternating optimization approach is used. Eq. (1) is a constrained non-linear optimization problem; by using Lagrange multipliers, the following function is obtained.
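As a hedged point of reference under the simplifying assumption of no constraint terms: the corresponding Lagrangian of plain fuzzy c-means (Bezdek et al.), with fuzzifier \(q\), data points \(x_i\), \(i\) ranging over instances and \(j\) over the two clusters, reads

\[
\mathcal{L}(U, Z, \lambda) \;=\; \sum_{i=1}^{n}\sum_{j=1}^{2} U_{ij}^{\,q}\,\lVert x_i - Z_j\rVert^2 \;+\; \sum_{i=1}^{n}\lambda_i\Big(\sum_{j=1}^{2} U_{ij} - 1\Big),
\]

and Eq. (11) presumably takes this form augmented with the constraint terms weighted by \(A_1, A_2, A_3\).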
Lemma 1. The optimal value for \(U_{ij}\) when Z is fixed is given by Eq. (12):
Proof. To find the optimal value for each \(U_{ij}\), we take the derivative of Eq. (11) with respect to each \(U_{ij}\) and set it to zero as follows:
By setting \(\psi_{ij}\) and \(\varPsi_{ij}\) as follows:
By solving Eq. (13), \(U_{ij}\) is obtained as follows:
Since \(\sum_{j=1}^{2} U_{ij} = 1\), the Lagrange multiplier can be obtained as follows:
By substituting Eq. (16) into Eq. (15), the closed-form solution for \(U_{ij}\) (Eq. (12)) is obtained, which completes the proof of the lemma.
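For orientation, in plain fuzzy c-means this same derivation yields the well-known membership update

\[
U_{ij} \;=\; \Bigg[\sum_{c=1}^{2}\left(\frac{\lVert x_i - Z_j\rVert}{\lVert x_i - Z_c\rVert}\right)^{\frac{2}{q-1}}\Bigg]^{-1},
\]

and Eq. (12) presumably shares this structure, with \(\psi_{ij}\) and \(\varPsi_{ij}\) absorbing the constraint-dependent terms.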
Lemma 2. If U (the fuzzy memberships) is fixed, the optimal values for Z (the cluster centers) are given by Eq. (17).
Proof. Again, the alternating approach is used: U is fixed, and the optimal values for Z are obtained by taking the derivative of Eq. (11) with respect to each cluster center and setting it to zero.
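As a hedged reference point rather than the paper's Eq. (17): in unconstrained fuzzy c-means, setting this derivative to zero gives the weighted-mean update

\[
Z_j \;=\; \frac{\sum_{i=1}^{n} U_{ij}^{\,q}\, x_i}{\sum_{i=1}^{n} U_{ij}^{\,q}},
\]

so each center is the membership-weighted mean of the data assigned to it.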
Lemma 3. U and Z are a local optimum of J(U, Z) if \(Z_{ij}\) and \(U_{ij}\) are calculated using Eqs. (17) and (12) and \(A_1, A_2, A_3 > 0\).
Proof. Let J(U) be J(U, Z) when Z is fixed, and J(Z) be J(U, Z) when U is fixed, with \(A_1, A_2, A_3 > 0\). Then the Hessian matrices H(J(Z)) and H(J(U)) are calculated as follows:
Equations (19) and (20) show that H(J(Z)) and H(J(U)) are diagonal matrices. Since \(A_1>0\) and \(0 < U_{ij} \le 1\), the Hessian matrices are positive definite. Thus Eqs. (12) and (17) are sufficient conditions to minimize J(U) and J(Z).
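To tie the three lemmas together, the following is a minimal runnable sketch of the alternating scheme for plain fuzzy c-means with \(k=2\) clusters, as used per label here. It is an illustration under that simplifying assumption, not PML-CC itself: the paper's Eqs. (12) and (17) additionally involve the constraint weights \(A_1, A_2, A_3\), which are omitted.

    import numpy as np

    def fcm(X, k=2, q=2.0, n_iter=100, tol=1e-6, seed=0):
        # Plain fuzzy c-means via alternating optimization (Bezdek et al.).
        rng = np.random.default_rng(seed)
        n = X.shape[0]
        U = rng.random((n, k))
        U /= U.sum(axis=1, keepdims=True)          # each row sums to 1
        for _ in range(n_iter):
            Um = U ** q
            # Center update with memberships fixed (Lemma 2 analogue):
            Z = (Um.T @ X) / Um.sum(axis=0)[:, None]
            # Membership update with centers fixed (Lemma 1 analogue):
            d = np.linalg.norm(X[:, None, :] - Z[None, :, :], axis=2)
            inv = np.maximum(d, 1e-12) ** (-2.0 / (q - 1.0))
            U_new = inv / inv.sum(axis=1, keepdims=True)
            # Stop when memberships stabilise at a local optimum (cf. Lemma 3):
            if np.abs(U_new - U).max() < tol:
                U = U_new
                break
            U = U_new
        return U, Z

In the PML-CC setting, one such two-cluster problem would be solved per label, with the resulting memberships playing the role of label confidences.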
1.2 Additional Experimental Results
Tables 4, 5, and 6 show the performance of our proposed method in terms of ranking loss and coverage, respectively.