Active Learning with Misclassification Sampling Using Diverse Ensembles Enhanced by Unlabeled Instances

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Abstract

Active learners can significantly reduce the number of labeled training instances needed to learn a classification function by actively selecting only the most informative instances for labeling. Most existing methods try to select instances that would halve the size of the version space after each sampling; in contrast, we try to shrink the volume of the version space by more than half. To this end, a sampling criterion based on misclassification is presented. Furthermore, in each iteration of active learning, a strong classifier is introduced to estimate the target function and to evaluate the misclassification degree of each candidate instance. As the strong classifier we use a modified version of the popular ensemble learning method DECORATE, enhanced with the unlabeled instances that the current base classifier labels with high certainty. Experiments show that the proposed method outperforms traditional sampling methods on most of the selected datasets.
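
To make the selection criterion in the abstract concrete, the sketch below shows one query-selection step in Python. This is not the authors' implementation: it assumes scikit-learn estimators, substitutes a RandomForestClassifier for the DECORATE ensemble, and uses a simple self-training step to enhance the strong classifier with unlabeled instances that the current base classifier labels with high certainty; the function name select_query and the certainty_threshold parameter are illustrative only.

# Minimal sketch of misclassification-based query selection (not the paper's
# code). Assumptions: scikit-learn stands in for the paper's learners, a
# random forest replaces DECORATE, and high-certainty self-training enhances
# the strong ensemble's training set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

def select_query(X_lab, y_lab, X_unlab, certainty_threshold=0.9):
    # Current base classifier, trained on the labeled pool only.
    base = DecisionTreeClassifier(random_state=0).fit(X_lab, y_lab)

    # "Strong" ensemble: augment its training data with unlabeled instances
    # that the base classifier labels with high certainty.
    proba = base.predict_proba(X_unlab)
    certain = proba.max(axis=1) >= certainty_threshold
    if certain.any():
        X_aug = np.vstack([X_lab, X_unlab[certain]])
        y_aug = np.concatenate([y_lab, base.predict(X_unlab[certain])])
    else:
        X_aug, y_aug = X_lab, y_lab
    strong = RandomForestClassifier(n_estimators=50, random_state=0)
    strong.fit(X_aug, y_aug)

    # Misclassification degree of each unlabeled instance: the probability
    # mass the strong ensemble assigns to labels other than the base
    # classifier's prediction.
    base_pred = base.predict(X_unlab)
    strong_proba = strong.predict_proba(X_unlab)
    cols = np.searchsorted(strong.classes_, base_pred)
    agreement = strong_proba[np.arange(len(X_unlab)), cols]
    # Query the instance the base classifier is most likely to misclassify.
    return int(np.argmax(1.0 - agreement))

In a full active-learning loop, the returned instance would be labeled by the oracle, moved from the unlabeled pool to the labeled pool, and the step repeated until the labeling budget is exhausted.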

References

  1. Lewis, D.D., Gale, W.A.: A sequential algorithm for training text classifiers. In: 17th ACM International Conference on Research and Development in Information Retrieval, pp. 3–12. Springer, Heidelberg (1994)

  2. Muslea, I., Minton, S., Knoblock, C.A.: Active learning with multiple views. Journal of Artificial Intelligence Research 27, 203–233 (2006)

  3. Campbell, C., Cristianini, N., Smola, A.: Query learning with large margin classifiers. In: Proc. 17th International Conf. on Machine Learning, Madison, pp. 111–118. Morgan Kaufmann, San Francisco (2000)

  4. Roy, N., McCallum, A.: Toward optimal active learning through sampling estimation of error reduction. In: Proc. 18th International Conf. on Machine Learning, pp. 441–448. Morgan Kaufmann, San Francisco (2001)

  5. Seung, H.S., Opper, M., Sompolinsky, H.: Query by committee. In: Proceedings of the Fifth Workshop on Computational Learning Theory, San Mateo, pp. 287–294. Morgan Kaufmann, San Francisco (1992)

  6. Freund, Y., Seung, H.S., Shamir, E., Tishby, N.: Selective sampling using the query by committee algorithm. Machine Learning 28, 133–168 (1997)

  7. Abe, N., Mamitsuka, H.: Query learning using boosting and bagging. In: Proc. 15th International Conf. on Machine Learning, Madison, pp. 1–10. Morgan Kaufmann, San Francisco (1998)

  8. Melville, P., Mooney, R.J.: Diverse ensembles for active learning. In: Proc. 21st International Conf. on Machine Learning, Banff, CA, pp. 584–591. Morgan Kaufmann, San Francisco (2004)

  9. Dietterich, T.G.: Machine-learning research: Four current directions. The AI Magazine 18(4), 97–136 (1998)

  10. Hansen, L., Salamon, P.: Neural network ensembles. IEEE Trans. Pattern Analysis and Machine Intell. 12, 993–1001 (1990)

  11. Melville, P., Mooney, R.J.: Constructing diverse classifier ensembles using artificial training examples. In: IJCAI, pp. 505–512 (2003)

  12. Newman, D., Hettich, S., Blake, C.L., Merz, C.J.: UCI repository of machine learning databases (1998), http://www.ics.uci.edu/~mlearn/mlrepository.html

Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Long, J., Yin, J., Zhu, E., Zhao, W. (2008). Active Learning with Misclassification Sampling Using Diverse Ensembles Enhanced by Unlabeled Instances. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science (LNAI), vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_98

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_98

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer Science, Computer Science (R0)
