
An Active Learning Method Based on Most Possible Misclassification Sampling Using Committee

  • Conference paper
Modeling Decisions for Artificial Intelligence (MDAI 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4617)

Abstract

By selecting and asking the user to label only the most informative instances, active learners can significantly reduce the number of labeled training instances needed to learn a classification function. We focus here on how to select the most informative instances for labeling. This paper makes three contributions. First, in contrast to the leading sampling strategy of halving the volume of the version space, we present a strategy that reduces the volume of the version space by more than half, under the assumption that the target function is drawn from a nonuniform distribution over the version space. Second, via the Halving model, we propose the idea of sampling the instances that are most likely to be misclassified. Third, we present a sampling method named CBMPMS (Committee-Based Most Possible Misclassification Sampling), which samples the instances that the current classifier is most likely to misclassify. Compared with existing active learning methods, CBMPMS requires fewer sampling rounds to reach the same classifier accuracy. Experiments show that the proposed method outperforms traditional sampling methods on most of the selected datasets.
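The committee-based selection idea described above can be illustrated with a minimal sketch. This is not the authors' exact CBMPMS algorithm: the base learner (a decision tree), the committee size, the bootstrap committee construction, and the disagreement-based estimate of misclassification probability are all illustrative assumptions. The sketch estimates, for each unlabeled instance, how likely the current classifier is to misclassify it, by the fraction of committee members that disagree with the current classifier's prediction, and queries the instance with the largest estimate.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic binary classification data: label = 1 iff x0 + x1 > 0.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

labeled = list(range(20))          # small initial labeled pool
unlabeled = list(range(20, 200))   # candidate pool for querying

# Current classifier, trained on the labeled pool.
clf = DecisionTreeClassifier(random_state=0).fit(X[labeled], y[labeled])
pred = clf.predict(X[unlabeled])

# Committee of bootstrap-trained classifiers (bagging over the labeled pool).
committee = []
for k in range(7):
    idx = rng.choice(labeled, size=len(labeled), replace=True)
    committee.append(DecisionTreeClassifier(random_state=k).fit(X[idx], y[idx]))

# Estimated misclassification probability of each unlabeled instance:
# the fraction of committee members whose vote disagrees with the
# current classifier's prediction.
votes = np.stack([m.predict(X[unlabeled]) for m in committee])
p_mis = (votes != pred).mean(axis=0)

# Query the instance most likely to be misclassified; in a full active
# learning loop its true label would be requested and added to the pool.
query = unlabeled[int(np.argmax(p_mis))]
print("query index:", query, "estimated misclassification prob:", p_mis.max())
```

In a complete active-learning loop this selection step would repeat: the queried instance is labeled, moved into the labeled pool, and both the classifier and the committee are retrained before the next query.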




Editor information

Vicenç Torra, Yasuo Narukawa, Yuji Yoshida


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Long, J., Yin, J., Zhu, E. (2007). An Active Learning Method Based on Most Possible Misclassification Sampling Using Committee. In: Torra, V., Narukawa, Y., Yoshida, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2007. Lecture Notes in Computer Science, vol 4617. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73729-2_10

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-73729-2_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73728-5

  • Online ISBN: 978-3-540-73729-2

  • eBook Packages: Computer Science (R0)
