Abstract
Active learning selects a small number of critical unlabeled samples for reliable manual labeling in order to improve the performance of the current classifier. Its key step is the sample selection strategy. Uncertainty sampling is a well-known strategy that selects the samples about which the current classifier is uncertain. For a generalized linear model, these samples usually lie near the current classification hyperplane. However, uncertain samples include not only those near the current classification hyperplane but also those far from both the hyperplane and the labeled samples. Traditional uncertainty sampling fails to capture the latter and is easily affected by outliers. In this paper, belief functions are used to describe the uncertainty present in these different kinds of samples, and we propose a sample selection strategy based on belief functions. Experimental results on benchmark datasets show that the proposed approach outperforms several classical methods: with the same number of newly labeled samples, it achieves higher classification accuracy.
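To make the limitation of traditional uncertainty sampling concrete, the following is a minimal sketch of the classical entropy-based variant for a binary logistic model. It is not the belief-function strategy proposed in the paper; the weights `w`, bias `b`, the toy `pool`, and all function names are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def uncertainty(w, b, x):
    """Binary entropy (in bits) of the model's predicted label at x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    if p <= 0.0 or p >= 1.0:
        return 0.0  # the model is fully confident
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_most_uncertain(w, b, pool, k=1):
    """Return the indices of the k pool samples with the highest entropy."""
    ranked = sorted(
        range(len(pool)),
        key=lambda i: uncertainty(w, b, pool[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical linear model whose hyperplane is x1 + x2 = 0.
w, b = [1.0, 1.0], 0.0

# Hypothetical unlabeled pool.
pool = [
    [0.1, -0.05],  # close to the hyperplane
    [2.0, 2.0],    # clearly on the positive side
    [50.0, 40.0],  # far from the hyperplane (and from typical data)
]

picked = select_most_uncertain(w, b, pool, k=1)
scores = [uncertainty(w, b, x) for x in pool]
```

In this toy setting the sample nearest the hyperplane is selected, while the distant third sample receives the lowest score even though the classifier has no nearby labeled evidence for it. This is the kind of sample that, according to the abstract, traditional uncertainty sampling fails to describe.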
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 61671370), the Postdoctoral Science Foundation of China (Grant No. 2016M592790), and the Postdoctoral Science Research Foundation of Shaanxi Province (Grant No. 2016BSHEDZZ46).
Cite this article
Zhang, S., Han, D. & Yang, Y. Active learning based on belief functions. Sci. China Inf. Sci. 63, 210205 (2020). https://doi.org/10.1007/s11432-020-3082-9