Batch Mode Active Learning for Networked Data with Optimal Subset Selection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9098)

Abstract

Active learning has become an increasingly important paradigm for classifying networked data, where instances are connected by a set of links to form a network. In this paper, we propose a novel batch mode active learning method for networked data (BMALNeT). The method selects the best subset of instances from the unlabeled set based on a correlation matrix that we construct from a dedicated informativeness evaluation of each unlabeled instance. To evaluate informativeness accurately, we simultaneously exploit content information and network structure to capture the uncertainty and representativeness of each instance and the disparity between any two instances. Experimental results on three real-world datasets demonstrate the effectiveness of the proposed method compared with state-of-the-art methods.
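The abstract gives only the outline of the selection step, so the following is a rough sketch under stated assumptions rather than the BMALNeT procedure itself. It assumes prediction entropy as the uncertainty measure, degree centrality as the representativeness measure, and cosine distance between content features as the disparity measure. It also swaps in a simple greedy search for whatever subset optimization the paper uses. All function names and the toy data are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (assumed uncertainty proxy)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def degree_centrality(adj):
    """Fraction of the other nodes each node links to (assumed representativeness proxy)."""
    n = len(adj)
    return [sum(row) / (n - 1) for row in adj]


def cosine_distance(a, b):
    """1 - cosine similarity of two feature vectors (assumed disparity proxy)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)


def build_matrix(probs, adj, feats):
    """Diagonal: per-instance informativeness. Off-diagonal: pairwise disparity."""
    n = len(probs)
    cent = degree_centrality(adj)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = entropy(probs[i]) * cent[i]
        for j in range(i + 1, n):
            m[i][j] = m[j][i] = cosine_distance(feats[i], feats[j])
    return m


def greedy_select(m, k):
    """Greedily grow a batch of size k that increases x^T M x for a binary indicator x.

    This is a stand-in heuristic, not the paper's optimizer.
    """
    selected = []
    remaining = set(range(len(m)))
    while len(selected) < k and remaining:
        def gain(i):
            # Increase in x^T M x when instance i joins the batch (M is symmetric).
            return m[i][i] + 2 * sum(m[i][j] for j in selected)
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: four unlabeled instances with class probabilities, links, and content features.
probs = [[0.5, 0.5], [0.9, 0.1], [0.5, 0.5], [1.0, 0.0]]
adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
feats = [[1, 0], [1, 0], [0, 1], [1, 1]]

matrix = build_matrix(probs, adj, feats)
batch = greedy_select(matrix, 2)
```

In this sketch the diagonal rewards instances that are both uncertain and well connected, while the off-diagonal entries reward batches whose members differ from one another, so the greedy step trades off the two when it adds each instance.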

Author information

Corresponding author

Correspondence to Pengpeng Zhao.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Xu, H. et al. (2015). Batch Mode Active Learning for Networked Data with Optimal Subset Selection. In: Dong, X., Yu, X., Li, J., Sun, Y. (eds) Web-Age Information Management. WAIM 2015. Lecture Notes in Computer Science, vol 9098. Springer, Cham. https://doi.org/10.1007/978-3-319-21042-1_8

  • DOI: https://doi.org/10.1007/978-3-319-21042-1_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21041-4

  • Online ISBN: 978-3-319-21042-1

  • eBook Packages: Computer Science, Computer Science (R0)
