Hyperspherical Learning in Multi-Label Classification

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13685))

Abstract

Learning from online data with noisy web labels is attracting growing attention because fully annotated datasets for large-scale multi-label classification are increasingly costly to build. Partially (positively) annotated data, a particular case of data with noisy labels, are inexpensive to obtain. Although they contain many samples with false negative labels, they serve as benchmarks for evaluating the learning capacity of state-of-the-art methods in realistic scenarios. Existing (partial) multi-label methods are usually studied in Euclidean space, where the relationship between label embeddings and image features is not symmetrical and can therefore be challenging to learn. To alleviate this problem, we propose reformulating the task in a hyperspherical space, where an angular margin can be incorporated into a hyperspherical multi-label loss function. This margin allows us to effectively balance the impact of false negative and true positive labels. We further design a mechanism to tune the angular margin and scale adaptively. We investigate the effectiveness of our method under three multi-label scenarios (single positive labels, partial positive labels and full labels) on four datasets (VOC12, COCO, CUB-200 and NUS-WIDE). In the single and partial positive label scenarios, our method achieves state-of-the-art performance. The robustness of our method is verified by comparing its performance at different proportions of partial positive labels in the datasets. Our method also obtains more than 1% improvement over the BCE loss even in the fully annotated scenario. Analysis shows that the learned label embeddings potentially correspond to actual label correlations, since in hyperspherical space label embeddings and image features are symmetrical and interchangeable. This further indicates the geometric interpretability of our method. Code is available at https://github.com/TencentYoutuResearch/MultiLabel-HML.
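
To make the idea of an angular margin in a hyperspherical multi-label loss more concrete, the following is a minimal sketch and not the loss defined in the paper. It assumes an ArcFace-style additive margin applied only to positive labels, a fixed (non-adaptive) scale and margin, and a per-label sigmoid cross-entropy; the function names and the default values of `scale` and `margin` are hypothetical choices for illustration. The authors' actual formulation, including the adaptive tuning of margin and scale, is in the repository linked in the abstract.

```python
import math


def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine of the angle between u and v, clamped to [-1, 1]."""
    u_hat, v_hat = l2_normalize(u), l2_normalize(v)
    dot = sum(a * b for a, b in zip(u_hat, v_hat))
    return max(-1.0, min(1.0, dot))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def angular_margin_multilabel_loss(feature, label_embeddings, targets,
                                   scale=16.0, margin=0.2):
    """Mean per-label sigmoid cross-entropy on scaled cosine logits.

    Assumption (not taken from the paper): positives use
    scale * cos(theta + margin), negatives use scale * cos(theta).
    """
    if len(label_embeddings) != len(targets):
        raise ValueError("need exactly one target per label embedding")
    total = 0.0
    for embedding, target in zip(label_embeddings, targets):
        theta = math.acos(cosine(feature, embedding))
        if target == 1:
            # The margin makes a positive label harder to satisfy;
            # clamp at pi so the logit stays monotonic in theta.
            logit = scale * math.cos(min(theta + margin, math.pi))
            total += softplus(-logit)  # equals -log(sigmoid(logit))
        else:
            logit = scale * math.cos(theta)
            total += softplus(logit)   # equals -log(1 - sigmoid(logit))
    return total / len(targets)


if __name__ == "__main__":
    feature = [1.0, 0.0]
    label_embeddings = [[1.0, 0.0], [0.0, 1.0]]
    print(angular_margin_multilabel_loss(feature, label_embeddings, [1, 0]))
    print(angular_margin_multilabel_loss(feature, label_embeddings, [0, 1]))
```

Because both the image feature and the label embeddings are normalized, only the angle between them affects the loss, which is the sense in which the two are interchangeable on the hypersphere. In this sketch a larger `margin` raises the loss on positive labels at a given angle, and `scale` controls how sharply the sigmoid responds to changes in angle.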

Author information

Corresponding author

Correspondence to Bo Ke.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ke, B., Zhu, Y., Li, M., Shu, X., Qiao, R., Ren, B. (2022). Hyperspherical Learning in Multi-Label Classification. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13685. Springer, Cham. https://doi.org/10.1007/978-3-031-19806-9_3

  • DOI: https://doi.org/10.1007/978-3-031-19806-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19805-2

  • Online ISBN: 978-3-031-19806-9

  • eBook Packages: Computer Science, Computer Science (R0)
