
A multi-class partial hinge loss for partial label learning

  • Published in: Applied Intelligence

Abstract

As an important branch of weakly supervised learning, partial label learning (PLL) tackles the problem where each training instance is associated with a set of candidate labels, among which only one is correct. Most existing PLL algorithms elaborately design loss functions and update strategies to learn the potential ground-truth label among the candidate labels with deep neural networks. However, these algorithms are susceptible to the cumulative error caused by noisy-label propagation when updating label confidences, which makes deep models tend to overfit the noisy labels and thereby achieve poor generalization performance. To remedy this issue, we propose a general framework, the multi-class partial hinge loss (MPHL), for PLL, which disambiguates the candidate labels by optimizing the margin between the maximum modeling output over the partial labels and that over the non-partial ones. More importantly, the partial hinge loss can adaptively optimize the separating hyperplane to reduce the influence of cumulative error. Meanwhile, we introduce graph Laplacian regularization to fully mine the relationships between the candidate labels of similar instances, constraining the separating hyperplane to improve the robustness of disambiguation. Extensive experimental results demonstrate that the multi-class partial hinge loss significantly outperforms state-of-the-art counterparts.
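The core idea described above is a hinge-style margin between the best model score among an instance's candidate labels and the best score among its non-candidate labels. The following NumPy sketch illustrates such a loss in its simplest form; it is our own illustration under stated assumptions (function and argument names are ours), not the paper's exact formulation, which additionally includes adaptive weighting and graph Laplacian regularization:

```python
import numpy as np

def partial_hinge_loss(scores, candidate_mask, margin=1.0):
    """Illustrative partial hinge loss.

    scores:         (n, c) array of model outputs g(x) for n instances, c classes.
    candidate_mask: (n, c) boolean array, True where a class is in the
                    candidate (partial) label set S_i.
    Penalizes an instance when its best non-candidate score comes within
    `margin` of its best candidate score.
    """
    cand_max = np.where(candidate_mask, scores, -np.inf).max(axis=1)
    non_cand_max = np.where(candidate_mask, -np.inf, scores).max(axis=1)
    return np.maximum(0.0, margin - cand_max + non_cand_max)

scores = np.array([[2.0, 0.5, 1.0],
                   [0.2, 0.9, 1.5]])
mask = np.array([[True, False, True],
                 [False, True, False]])
loss = partial_hinge_loss(scores, mask, margin=1.0)
# First instance: best candidate 2.0 beats best non-candidate 0.5 by more
# than the margin, so the loss is 0; second instance is violated (1.0 - 0.9 + 1.5 = 1.6).
```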



Acknowledgements

This research was supported by the Ministry of Science and Technology (under Project No. 2018YFB1702703).

Author information

Correspondence to Jinfu Fan or Zhongjie Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Appendix A

Proof of Theorem 1:

For an instance \(\textbf{x}_{i}\), the potential ground-truth label satisfies \(\hat{y}_{i} \in S_{i}\). Then, we have:

$$\begin{aligned} \max _{y_{j} \in S_{i}} g_{y_{j}}(\textbf{x}_{i}) \ge g_{\hat{y}_{i}}(\textbf{x}_{i}) \end{aligned}$$
(15)

Meanwhile, let \(\hat{U}_{i}=\mathcal {Y} \backslash \left\{ \hat{y}_{i}\right\} \) and let \(\hat{S}_{i}\) be the complement of \(S_{i}\) in \(\mathcal {Y}\). Since \(\hat{y}_{i} \in S_{i}\), we have \(\hat{S}_{i} \subseteq \hat{U}_{i}\). Therefore, we can further obtain:

$$\begin{aligned} \max _{y_{k} \in \hat{S}_{i}} g_{y_{k}}(\textbf{x}_{i}) \le \max _{y_{j} \in \hat{U}_{i}} g_{y_{j}}(\textbf{x}_{i}) \end{aligned}$$
(16)

Combining (15) and (16), and noting that \(\max (0,\cdot )\) is non-decreasing, we obtain:

$$\begin{aligned} \begin{aligned}{} & {} \max \left( 0,\, m-\max _{y_{j} \in S_{i}} g_{y_{j}}(\textbf{x}_{i})+\max _{y_{k} \in \hat{S}_{i}} g_{y_{k}}(\textbf{x}_{i})\right) \\{} & {} \quad \le \max \left( 0,\, m- g_{\hat{y}_{i}}(\textbf{x}_{i})+\max _{y_{l} \in \hat{U}_{i}} g_{y_{l}}(\textbf{x}_{i})\right) \end{aligned} \end{aligned}$$
(17)

Finally, taking the expectation over \((\textbf{X}, S)\), we can get:

$$\begin{aligned} \mathbb {E}_{(\textbf{X},S)}\left[ \mathcal {L}_{DHL}(\varvec{g}(\textbf{X}), S)\right] \le \mathbb {E}_{(\textbf{X},\hat{\textbf{Y}})}\left[ \mathcal {L}(\varvec{g}(\textbf{X}), \hat{\textbf{Y}})\right] \end{aligned}$$
(18)

The proof is completed.
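The instance-wise inequality (17) can be sanity-checked numerically: for any scores, any margin, and any candidate sets that contain the true label, the partial hinge term never exceeds the fully supervised hinge term. The following NumPy check is our own illustration (all variable names are ours), not part of the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, c, margin = 1000, 10, 1.0
scores = rng.normal(size=(n, c))        # g(x) for n instances over c classes
true_y = rng.integers(0, c, size=n)     # hidden ground-truth labels

# Random candidate sets S_i, each forced to contain the true label.
mask = rng.random((n, c)) < 0.3
mask[np.arange(n), true_y] = True

# Left side of (17): hinge on best candidate vs best non-candidate score.
cand_max = np.where(mask, scores, -np.inf).max(axis=1)
non_cand_max = np.where(mask, -np.inf, scores).max(axis=1)
lhs = np.maximum(0.0, margin - cand_max + non_cand_max)

# Right side of (17): supervised hinge with the true label against all others.
true_score = scores[np.arange(n), true_y]
one_hot = np.zeros((n, c), dtype=bool)
one_hot[np.arange(n), true_y] = True
others_max = np.where(one_hot, -np.inf, scores).max(axis=1)
rhs = np.maximum(0.0, margin - true_score + others_max)

# The inequality holds for every instance, hence also in expectation (18).
assert np.all(lhs <= rhs + 1e-12)
```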

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Fan, J., Jiang, Z., Xian, Y. et al. A multi-class partial hinge loss for partial label learning. Appl Intell 53, 28333–28348 (2023). https://doi.org/10.1007/s10489-023-04954-1
