
Active label distribution learning via kernel maximum mean discrepancy

  • Research Article
  • Published in Frontiers of Computer Science

Abstract

Label distribution learning (LDL) is a learning paradigm for handling label ambiguity, and many LDL studies have achieved prominent performance. Compared with traditional supervised learning, however, annotating data with label distributions is far more expensive. Directly applying existing active learning (AL) approaches, which aim to reduce annotation cost in traditional learning settings, may degrade performance in LDL. To address the high annotation cost in LDL, we propose active label distribution learning via kernel maximum mean discrepancy (ALDL-kMMD), tackling this crucial but rarely studied problem. ALDL-kMMD captures the structural information of both data and labels, and extracts the most representative instances from the unlabeled pool by combining a nonlinear model with marginal probability distribution matching. It also markedly reduces the number of unlabeled instances that must be queried. In addition, we derive an effective solution to the ALDL-kMMD optimization problem by constructing auxiliary variables. Experiments on real-world datasets validate the effectiveness of our method.
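The paper's full objective is not reproduced on this page. As a rough illustration of the MMD-based representativeness idea the abstract describes, the following is a minimal NumPy sketch: it computes the empirical squared MMD between a candidate subset and the full unlabeled pool under an RBF kernel, then greedily selects instances that keep the subset's distribution closest to the pool's. The kernel choice, the greedy loop, and all names (rbf_kernel, mmd_squared, select_representative, gamma) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    sq_dists = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Guard against tiny negative values from floating-point error.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def mmd_squared(X, Y, gamma=1.0):
    """Squared empirical MMD between samples X and Y in an RBF RKHS."""
    return (
        rbf_kernel(X, X, gamma).mean()
        + rbf_kernel(Y, Y, gamma).mean()
        - 2.0 * rbf_kernel(X, Y, gamma).mean()
    )

def select_representative(X_pool, budget, gamma=1.0):
    """Greedily pick `budget` indices whose empirical distribution
    stays closest (in MMD) to the full pool's distribution."""
    selected, remaining = [], list(range(len(X_pool)))
    for _ in range(budget):
        best_i, best_mmd = None, np.inf
        for i in remaining:
            candidate = X_pool[selected + [i]]
            m = mmd_squared(candidate, X_pool, gamma)
            if m < best_mmd:
                best_i, best_mmd = i, m
        selected.append(best_i)
        remaining.remove(best_i)
    return selected
```

With an annotation budget b, `select_representative(X, b)` returns indices of instances whose empirical distribution best matches the pool, which is the intuition behind MMD-based representativeness; the actual ALDL-kMMD objective additionally couples this matching with the nonlinear label-distribution model and is solved via auxiliary variables.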



Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 61922087, 61906201, and 62006238) and the Science and Technology Innovation Program of Hunan Province (2021RC3070).

Author information


Correspondence to Tingjin Luo or Chenping Hou.

Additional information

Xinyue Dong is studying for her master's degree at National University of Defense Technology, China. Her main research interests include data analysis and machine learning.

Tingjin Luo received his PhD degree from National University of Defense Technology, China. He is currently an Associate Professor with the College of Science of the same university. He has authored more than 30 papers in journals and conferences such as IEEE TKDE, IEEE TCYB, and ACM KDD, and has served as a Program Committee member of several conferences, including IJCAI and AAAI. His current research interests include machine learning, optimization, and data mining.

Ruidong Fan received his master's degree from National University of Defense Technology, China, where he is now working toward the PhD degree in the College of Science. His main research interests include statistical data analysis and machine learning.

Wenzhang Zhuge received the BS degree from Shandong University, China, the MS degree from National University of Defense Technology, China, and the PhD degree from the College of Science, National University of Defense Technology, China. His research interests include machine learning, system science, and data mining.

Chenping Hou received the BS and PhD degrees in applied mathematics from National University of Defense Technology, China in 2004 and 2009, respectively. He is currently a Professor with the College of Science, National University of Defense Technology, China. He has authored more than 50 papers in journals and conferences such as IEEE TPAMI, IEEE TKDE, IEEE TNNLS/TNN, IEEE TCYB, IEEE TIP, PR, IJCAI, and AAAI, and has served as a Program Committee member of several conferences, including IJCAI and AAAI. His current research interests include pattern recognition, machine learning, data mining, and computer vision.


About this article

Cite this article

Dong, X., Luo, T., Fan, R. et al. Active label distribution learning via kernel maximum mean discrepancy. Front. Comput. Sci. 17, 174327 (2023). https://doi.org/10.1007/s11704-022-1624-5

