Active Learning with Misclassification Sampling Using Diverse Ensembles Enhanced by Unlabeled Instances

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Abstract

Active learners can significantly reduce the number of labeled training instances needed to learn a classification function by actively selecting only the most informative instances for labeling. Most existing methods try to select instances that would halve the size of the version space after each sampling; in contrast, we try to shrink the volume of the version space by more than half. To this end, a sampling criterion based on misclassification is presented. Furthermore, in each iteration of active learning, a strong classifier is introduced to estimate the target function and to evaluate the misclassification degree of each candidate instance. As the strong classifier we use a modified version of the popular ensemble learning method DECORATE, enhanced with the unlabeled instances that the current base classifier labels with high certainty. Experiments show that the proposed method outperforms traditional sampling methods on most of the selected datasets.
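
To make the selection criterion in the abstract concrete, the sketch below shows one query-selection step in Python. This is not the authors' implementation: it assumes scikit-learn estimators, substitutes a RandomForestClassifier for the DECORATE ensemble, and uses a simple self-training step to enhance the strong classifier with unlabeled instances that the current base classifier labels with high certainty; the function name select_query and the certainty_threshold parameter are illustrative only.

# Minimal sketch of misclassification-based query selection (not the paper's
# code). Assumptions: scikit-learn stands in for the paper's learners, a
# random forest replaces DECORATE, and high-certainty self-training enhances
# the strong ensemble's training set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

def select_query(X_lab, y_lab, X_unlab, certainty_threshold=0.9):
    # Current base classifier, trained on the labeled pool only.
    base = DecisionTreeClassifier(random_state=0).fit(X_lab, y_lab)

    # "Strong" ensemble: augment its training data with unlabeled instances
    # that the base classifier labels with high certainty.
    proba = base.predict_proba(X_unlab)
    certain = proba.max(axis=1) >= certainty_threshold
    if certain.any():
        X_aug = np.vstack([X_lab, X_unlab[certain]])
        y_aug = np.concatenate([y_lab, base.predict(X_unlab[certain])])
    else:
        X_aug, y_aug = X_lab, y_lab
    strong = RandomForestClassifier(n_estimators=50, random_state=0)
    strong.fit(X_aug, y_aug)

    # Misclassification degree of each unlabeled instance: the probability
    # mass the strong ensemble assigns to labels other than the base
    # classifier's prediction.
    base_pred = base.predict(X_unlab)
    strong_proba = strong.predict_proba(X_unlab)
    cols = np.searchsorted(strong.classes_, base_pred)
    agreement = strong_proba[np.arange(len(X_unlab)), cols]
    # Query the instance the base classifier is most likely to misclassify.
    return int(np.argmax(1.0 - agreement))

In a full active-learning loop, the returned instance would be labeled by the oracle, moved from the unlabeled pool to the labeled pool, and the step repeated until the labeling budget is exhausted.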

References

  1. Lewis, D.D., Gale, W.A.: A sequential algorithm for training text classifiers. In: 17th ACM International Conference on Research and Development in Information Retrieval, pp. 3–12. Springer, Heidelberg (1994)

  2. Muslea, I., Minton, S., Knoblock, C.A.: Active learning with multiple views. Journal of Artificial Intelligence Research 27, 203–233 (2006)

  3. Campbell, C., Cristianini, N., Smola, A.: Query learning with large margin classifiers. In: Proc. 17th International Conf. on Machine Learning, Madison, pp. 111–118. Morgan Kaufmann, San Francisco (2000)

  4. Roy, N., McCallum, A.: Toward optimal active learning through sampling estimation of error reduction. In: Proc. 18th International Conf. on Machine Learning, pp. 441–448. Morgan Kaufmann, San Francisco (2001)

  5. Seung, H.S., Opper, M., Sompolinsky, H.: Query by committee. In: Proceedings of the Fifth Workshop on Computational Learning Theory, San Mateo, pp. 287–294. Morgan Kaufmann, San Francisco (1992)

  6. Freund, Y., Seung, H.S., Shamir, E., Tishby, N.: Selective sampling using the query by committee algorithm. Machine Learning 28, 133–168 (1997)

  7. Abe, N., Mamitsuka, H.: Query learning using boosting and bagging. In: Proc. 15th International Conf. on Machine Learning, Madison, pp. 1–10. Morgan Kaufmann, San Francisco (1998)

  8. Melville, P., Mooney, R.J.: Diverse ensembles for active learning. In: Proc. 21st International Conf. on Machine Learning, Banff, CA, pp. 584–591. Morgan Kaufmann, San Francisco (2004)

  9. Dietterich, T.G.: Machine-learning research: Four current directions. The AI Magazine 18(4), 97–136 (1998)

  10. Hansen, L., Salamon, P.: Neural network ensembles. IEEE Trans. Pattern Analysis and Machine Intell. 12, 993–1001 (1990)

  11. Melville, P., Mooney, R.J.: Constructing diverse classifier ensembles using artificial training examples. In: IJCAI, pp. 505–512 (2003)

  12. Newman, D., Hettich, S., Blake, C.L., Merz, C.J.: UCI repository of machine learning databases (1998), http://www.ics.uci.edu/~mlearn/mlrepository.html

Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Long, J., Yin, J., Zhu, E., Zhao, W. (2008). Active Learning with Misclassification Sampling Using Diverse Ensembles Enhanced by Unlabeled Instances. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science (LNAI), vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_98

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_98

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer Science, Computer Science (R0)
