
Robust AUC maximization for classification with pairwise confidence comparisons

  • Research Article
  • Published:
Frontiers of Computer Science

Abstract

Supervised learning often requires a large number of labeled examples, which becomes a critical bottleneck when manually annotating class labels is costly. To mitigate this issue, a framework called pairwise comparison (Pcomp) classification has been proposed, in which training examples are only weakly annotated with pairwise comparisons, i.e., which of two examples is more likely to be positive. Previous work solves Pcomp problems by minimizing the classification error, which may yield a less robust model due to its sensitivity to the class distribution. In this paper, we propose a robust learning framework for Pcomp data along with a pairwise surrogate loss called Pcomp-AUC. It provides an unbiased estimator that equivalently maximizes AUC without access to the precise class labels. Theoretically, we prove consistency with respect to AUC and further derive an estimation error bound for the proposed method. Empirical studies on multiple datasets validate the effectiveness of the proposed method.
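To make the pairwise setting concrete, the following is a minimal, hypothetical sketch of training from Pcomp-style supervision: each training pair (x_a, x_b) is annotated only as "x_a is more likely positive than x_b", and a scoring function is fit by minimizing a generic logistic surrogate on the score margin f(x_a) − f(x_b). This is a standard pairwise ranking surrogate for AUC, not the paper's Pcomp-AUC unbiased estimator; the synthetic data, the linear model, and all names are illustrative assumptions.

```python
import numpy as np

# Illustrative assumption: synthetic 2-D pairs where the first element of
# each pair tends to come from the positive class.
rng = np.random.default_rng(0)
n_pairs = 200
X_a = rng.normal(loc=1.0, size=(n_pairs, 2))   # "more likely positive"
X_b = rng.normal(loc=-1.0, size=(n_pairs, 2))  # "less likely positive"

def pairwise_logistic_loss(w, X_a, X_b):
    """Logistic surrogate on the margin f(x_a) - f(x_b) for a linear scorer f(x) = w.x."""
    margin = (X_a - X_b) @ w
    return np.mean(np.log1p(np.exp(-margin)))

def gradient(w, X_a, X_b):
    """Gradient of the surrogate with respect to w."""
    margin = (X_a - X_b) @ w
    coef = -1.0 / (1.0 + np.exp(margin))  # derivative of log(1 + exp(-m))
    return (coef[:, None] * (X_a - X_b)).mean(axis=0)

# Plain gradient descent on the pairwise surrogate.
w = np.zeros(2)
loss_before = pairwise_logistic_loss(w, X_a, X_b)
for _ in range(100):
    w -= 0.1 * gradient(w, X_a, X_b)
loss_after = pairwise_logistic_loss(w, X_a, X_b)
```

Driving the surrogate down pushes scores of "more likely positive" examples above their counterparts, which is the mechanism behind AUC maximization from comparison data; the paper's contribution is doing this with an unbiased estimator and consistency guarantees under the Pcomp generation process.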



Acknowledgements

This research was supported by the Natural Science Foundation of Jiangsu Province, China (BK20222012, BK20211517), the National Key R&D Program of China (2020AAA0107000), and the National Natural Science Foundation of China (Grant No. 62222605).

Author information


Corresponding author

Correspondence to Mingkun Xie.

Additional information

Haochen Shi received her BS and MS degrees in signal and information processing from Nanjing University of Aeronautics and Astronautics, China in 2016 and 2018, respectively, and an MS degree in electronics from Queen’s University Belfast, UK in 2019. She is now a PhD student at Nanjing University of Aeronautics and Astronautics, China. Her main research interests include machine learning, image processing, and pattern recognition.

Mingkun Xie received his BS degree in 2018. He is currently a PhD student in the MIIT Key Laboratory of Pattern Analysis and Machine Intelligence at Nanjing University of Aeronautics and Astronautics, China. He has served as a PC member of NeurIPS, ICML, and ICLR, and as a reviewer for TNNLS and MLJ. His research interests are mainly in machine learning; in particular, he is interested in multi-label learning and weakly supervised learning.

Shengjun Huang received his BS and PhD degrees in computer science from Nanjing University, China in 2008 and 2014, respectively. He is now a professor in the College of Computer Science and Technology at Nanjing University of Aeronautics and Astronautics, China. His main research interests include machine learning and data mining. He was selected for the Young Elite Scientists Sponsorship Program by CAST in 2016, and won the China Computer Federation Outstanding Doctoral Dissertation Award in 2015, the KDD Best Poster Award in 2012, and the Microsoft Fellowship Award in 2011. He is a Junior Associate Editor of Frontiers of Computer Science.

Electronic supplementary material


About this article


Cite this article

Shi, H., Xie, M. & Huang, S. Robust AUC maximization for classification with pairwise confidence comparisons. Front. Comput. Sci. 18, 184317 (2024). https://doi.org/10.1007/s11704-023-2709-5
