Abstract
Active learning has become an increasingly important paradigm for classifying networked data, in which instances are connected by links to form a network. In this paper, we propose a novel batch mode active learning method for networked data (BMALNeT). The method selects the best subset of instances from the unlabeled set based on a correlation matrix constructed from a dedicated informativeness evaluation of each unlabeled instance. To evaluate informativeness accurately, we simultaneously exploit content information and network structure to capture the uncertainty and representativeness of each instance and the disparity between any two instances. Experimental results on three real-world datasets demonstrate the effectiveness of the proposed method compared with state-of-the-art methods.
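The abstract does not give formulas, so the following is only a minimal sketch of the general idea, not the BMALNeT algorithm itself. It assumes entropy of predicted class probabilities as the uncertainty score, normalized degree as a stand-in for representativeness, one minus cosine similarity of content features as disparity, and a simple greedy search in place of whatever optimizer the paper uses. All function names and the toy data are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (assumed uncertainty score)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def degree_centrality(adj):
    """Normalized degree per node (assumed proxy for representativeness)."""
    n = len(adj)
    return [sum(row) / (n - 1) for row in adj]


def disparity(x, y):
    """One minus cosine similarity of two content vectors (assumed disparity)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norm if norm else 1.0


def correlation_matrix(probs, adj, feats):
    """Diagonal: uncertainty * representativeness. Off-diagonal: pairwise disparity."""
    n = len(probs)
    rep = degree_centrality(adj)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                m[i][j] = entropy(probs[i]) * rep[i]
            else:
                m[i][j] = disparity(feats[i], feats[j])
    return m


def greedy_select(m, k):
    """Greedily grow a batch of k indices, each step adding the index that most
    increases the sum of m over the selected rows and columns.
    This is a heuristic swapped in for illustration, not an exact solver."""
    chosen = []
    for _ in range(k):
        best, best_gain = None, -math.inf
        for i in range(len(m)):
            if i in chosen:
                continue
            gain = m[i][i] + sum(m[i][j] + m[j][i] for j in chosen)
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
    return chosen


# Toy data: 4 unlabeled instances, binary class probabilities from some classifier,
# an undirected link structure, and 2-dimensional content features.
probs = [[0.5, 0.5], [0.9, 0.1], [0.5, 0.5], [1.0, 0.0]]
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
feats = [[1, 0], [1, 0], [0, 1], [1, 1]]

m = correlation_matrix(probs, adj, feats)
selected = greedy_select(m, 2)
print(selected)
```

In this toy setup the batch favors instances that are individually uncertain and well connected while being dissimilar in content to one another, which is the trade-off the abstract describes.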
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Xu, H. et al. (2015). Batch Mode Active Learning for Networked Data with Optimal Subset Selection. In: Dong, X., Yu, X., Li, J., Sun, Y. (eds) Web-Age Information Management. WAIM 2015. Lecture Notes in Computer Science, vol 9098. Springer, Cham. https://doi.org/10.1007/978-3-319-21042-1_8
DOI: https://doi.org/10.1007/978-3-319-21042-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21041-4
Online ISBN: 978-3-319-21042-1
eBook Packages: Computer Science (R0)