A Comparative Study on the Use of Labeled and Unlabeled Data for Large Margin Classifiers

  • Conference paper
Natural Language Processing – IJCNLP 2004 (IJCNLP 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3248)

Abstract

We propose to use both labeled and unlabeled data with the Expectation-Maximization (EM) algorithm to estimate a generative model, and to use this model to construct a Fisher kernel. Documents are modeled with the Naive Bayes generative probability. Through text categorization experiments, we empirically show that (a) the Fisher kernel with labeled and unlabeled data outperforms Naive Bayes classifiers with EM and other methods when a sufficient amount of labeled data is available, (b) the value of additional unlabeled data diminishes once the labeled data set is large enough to estimate a reliable model, (c) using the categories as latent variables is effective, and (d) larger unlabeled training data sets yield better results.
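
The construction can be sketched roughly as follows (a minimal illustration under simplifying assumptions, not the authors' implementation): a multinomial Naive Bayes model is fitted by EM over the pooled labeled and unlabeled documents, and the Fisher scores, i.e. the gradients of the document log-likelihood with respect to the model parameters, serve as features whose inner products define the Fisher kernel for a large margin classifier. The sketch below takes gradients only with respect to the log word parameters and approximates the Fisher information by the identity matrix; the function names em_naive_bayes and fisher_features are illustrative, not from the paper.

import numpy as np

def em_naive_bayes(X_lab, y_lab, X_unlab, n_classes, n_iter=20, alpha=1e-2):
    """Multinomial Naive Bayes estimated with EM on labeled + unlabeled data.

    X_lab, X_unlab: document-term count matrices (documents x vocabulary).
    y_lab: integer class labels of the labeled documents.
    Returns class priors pi (n_classes,) and word distributions theta
    (n_classes x vocabulary), both Laplace-smoothed by alpha.
    """
    resp_lab = np.eye(n_classes)[y_lab]          # labeled responsibilities stay fixed
    # Initialize the parameters from the labeled documents only.
    pi = (resp_lab.sum(axis=0) + alpha) / (resp_lab.sum() + n_classes * alpha)
    counts = resp_lab.T @ X_lab + alpha
    theta = counts / counts.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: posterior class responsibilities for the unlabeled documents.
        log_post = np.log(pi) + X_unlab @ np.log(theta).T
        log_post -= log_post.max(axis=1, keepdims=True)
        resp_unlab = np.exp(log_post)
        resp_unlab /= resp_unlab.sum(axis=1, keepdims=True)
        # M-step: re-estimate priors and word distributions from all documents.
        resp = np.vstack([resp_lab, resp_unlab])
        X = np.vstack([X_lab, X_unlab])
        pi = (resp.sum(axis=0) + alpha) / (resp.sum() + n_classes * alpha)
        counts = resp.T @ X + alpha
        theta = counts / counts.sum(axis=1, keepdims=True)
    return pi, theta

def fisher_features(X, pi, theta):
    """Fisher-score features, one row per document.

    With gradients taken w.r.t. the log word parameters and an identity
    approximation of the Fisher information, the score for (class c, word w)
    reduces to p(c | x) * count(x, w).
    """
    log_post = np.log(pi) + X @ np.log(theta).T
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)
    return np.einsum('dc,dw->dcw', post, X).reshape(X.shape[0], -1)

The Fisher kernel is then the inner product of these feature rows, so a linear large margin classifier (for example a linear SVM) trained on the labeled rows of the resulting feature matrix realizes the classifier sketched above.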

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Takamura, H., Okumura, M. (2005). A Comparative Study on the Use of Labeled and Unlabeled Data for Large Margin Classifiers. In: Su, K.-Y., Tsujii, J., Lee, J.-H., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2004. IJCNLP 2004. Lecture Notes in Computer Science (LNAI), vol 3248. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30211-7_48

  • DOI: https://doi.org/10.1007/978-3-540-30211-7_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24475-2

  • Online ISBN: 978-3-540-30211-7

  • eBook Packages: Computer Science, Computer Science (R0)
