Abstract
Considerable recent effort has gone into using unlabeled data together with labeled data in machine learning. Semi-supervised learning and active learning are two well-known techniques that exploit unlabeled data in the learning process. In this work, active learning is used to query the label of an unlabeled example on top of a semi-supervised classifier. The work focuses on the query selection criterion: the proposed criterion selects the example whose label change causes the largest perturbation in the labels of the other examples. Experimental results show that the proposed criterion is effective in comparison with existing techniques.
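The selection idea described in the abstract can be illustrated with a small sketch. It is not the paper's weighted-entropy criterion, whose exact form is not given here. It assumes binary labels, a simple iterative label propagation as the graph-based semi-supervised classifier, and a stand-in score: the sum of absolute changes in the other unlabeled nodes' predictions when a candidate is clamped to label 0 versus label 1. The function names `propagate` and `select_query` and the toy graph are hypothetical.

```python
def propagate(weights, labels, iters=200):
    """Iterative label propagation on a weighted graph (assumed classifier).

    weights: symmetric n x n list of lists of non-negative edge weights.
    labels:  dict mapping labeled node index -> 0 or 1.
    Returns soft labels f[i] in [0, 1]; labeled nodes stay clamped.
    """
    n = len(weights)
    f = [float(labels.get(i, 0.5)) for i in range(n)]
    for _ in range(iters):
        new_f = list(f)
        for i in range(n):
            if i in labels:
                continue  # keep known labels fixed
            total = sum(weights[i])
            if total > 0:
                # weighted average of the neighbours' current soft labels
                new_f[i] = sum(w * f[j] for j, w in enumerate(weights[i])) / total
        f = new_f
    return f


def select_query(weights, labels):
    """Return (node, score) for the unlabeled node whose label change
    perturbs the other unlabeled nodes' predictions the most.

    The score is a stand-in, not the paper's weighted entropy.
    """
    n = len(weights)
    best, best_score = None, -1.0
    for i in range(n):
        if i in labels:
            continue
        f0 = propagate(weights, {**labels, i: 0})  # candidate clamped to 0
        f1 = propagate(weights, {**labels, i: 1})  # candidate clamped to 1
        score = sum(abs(f1[j] - f0[j])
                    for j in range(n) if j != i and j not in labels)
        if score > best_score:
            best, best_score = i, score
    return best, best_score


# Toy graph: nodes 0 and 1 are labeled; node 2 is a hub joined to
# both labeled nodes and to the leaves 3, 4, 5.
W = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
known = {0: 0, 1: 1}
node, score = select_query(W, known)
print(node)  # the hub, node 2, is selected
```

In this toy graph, flipping the hub's label flips all three leaves, while flipping a single leaf only shifts the remaining predictions slightly, so the hub is the most informative query under this score. Re-running propagation twice per candidate is costly on large graphs; the sketch favors clarity over efficiency.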
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Muandet, K., Marukatat, S., Nattee, C. (2009). Query Selection via Weighted Entropy in Graph-Based Semi-supervised Classification. In: Zhou, ZH., Washio, T. (eds) Advances in Machine Learning. ACML 2009. Lecture Notes in Computer Science, vol 5828. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05224-8_22
DOI: https://doi.org/10.1007/978-3-642-05224-8_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05223-1
Online ISBN: 978-3-642-05224-8
eBook Packages: Computer Science (R0)