Abstract
We study classification when most of the data is unlabeled and only a small fraction is labeled: the semi-supervised learning setting. Blum and Mitchell's co-training [1] is a popular semi-supervised algorithm for problems that offer multiple independent views of the entities to classify. Web-page classification is a standard multi-view example: one view describes a page by the words that occur on it, another by the words in the hyperlinks that point to it. In co-training, two learners each build a model from the labeled data and then incrementally label small subsets of the unlabeled data for each other; each learner then re-estimates its model from the labeled data together with the pseudo-labels provided by the other learner. Although some analysis of the algorithm's performance exists [1], the computation it performs is still not well understood. We argue that each view in co-training effectively performs incremental EM, as formulated by Neal and Hinton [3], combined with a Bayesian classifier. This analysis suggests improvements over the core co-training algorithm. We introduce variations that converge to the maximum attainable classification accuracy faster than the core algorithm does, thereby increasing learning efficiency. We verify our claim empirically on a number of data sets in the context of belief network learning.
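To make the loop concrete, the following is a minimal sketch of core co-training as the abstract describes it, using scikit-learn's GaussianNB as the per-view Bayesian classifier. The function name co_train, the per-round label count n_per_view, and the round limit n_rounds are illustrative assumptions for this sketch, not values taken from the paper.

# A minimal sketch of core co-training (after Blum and Mitchell [1]) as the
# abstract describes it: two views of the same examples, one Bayesian
# classifier per view, each round moving the most confidently pseudo-labeled
# examples into the shared labeled pool. Parameter values are illustrative
# assumptions, not values from the paper.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def co_train(X1, X2, y, labeled, n_per_view=5, n_rounds=30):
    """Co-train two per-view classifiers.

    X1, X2 : the two views of the same examples (n_samples x d1 / d2).
    y      : label array, valid only where `labeled` is True.
    labeled: boolean mask of the initially labeled examples.
    """
    labeled, y = labeled.copy(), y.copy()
    clf1, clf2 = GaussianNB(), GaussianNB()

    for _ in range(n_rounds):
        # Each learner re-estimates its model from all currently labeled
        # data: the original labels plus pseudo-labels from earlier rounds.
        clf1.fit(X1[labeled], y[labeled])
        clf2.fit(X2[labeled], y[labeled])
        # Each learner then pseudo-labels the unlabeled examples it is most
        # confident about, growing the shared labeled pool for both views.
        for clf, X in ((clf1, X1), (clf2, X2)):
            unlab = np.flatnonzero(~labeled)
            if unlab.size == 0:
                return clf1, clf2
            conf = clf.predict_proba(X[unlab]).max(axis=1)
            pick = unlab[np.argsort(conf)[-n_per_view:]]
            y[pick] = clf.predict(X[pick])
            labeled[pick] = True
    return clf1, clf2

Blum and Mitchell's original formulation additionally draws candidates from a small random pool of the unlabeled data and adds a fixed number of examples per class each round; the sketch omits those refinements. It is this per-view re-estimate/pseudo-label cycle that the paper identifies with an incremental E- and M-step in the sense of Neal and Hinton [3].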
References
Blum, A., Mitchell, T.: Combining Labeled and Unlabeled Data with Co-training. In: COLT (1998)
Davidson, I., Aminian, M.: Using the Central Limit Theorem for Belief Network Learning. In: Proceedings of the 8th International Symposium on Artificial Intelligence and Mathematics (2004)
Neal, R.M., Hinton, G.E.: A New View of the EM Algorithm that Justifies Incremental and Other Variants. Submitted to Biometrika (1993)
Muslea, I., Minton, S., Knoblock, C.: Active + Semi-Supervised Learning = Robust Multi-View Learning. In: Proceedings of the 19th International Conference on Machine Learning, pp. 435–442 (2002)
Nigam, K., Ghani, R.: Understanding the Behavior of Co-training. In: Proceedings of the Workshop on Text Mining at the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD (2000)
Seeger, M.: Learning with Labeled and Unlabeled Data. Technical Report (2000)
Bilmes, J.A.: A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models. Technical Report TR-97-021, International Computer Science Institute (April 1998)
Byrne, W., Gunawardana, A.: Comments on “Efficient Training Algorithm for HMM’s Using Incremental Estimation”. IEEE Transactions on Speech and Audio Processing 8(6) (November 2000)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Aminian, M. (2004). Co-training from an Incremental EM Perspective. In: Yang, Z.R., Yin, H., Everson, R.M. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2004. IDEAL 2004. Lecture Notes in Computer Science, vol 3177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28651-6_114
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22881-3
Online ISBN: 978-3-540-28651-6