
Towards making co-training suffer less from insufficient views

  • Research Article
  • Frontiers of Computer Science

Abstract

Co-training is a well-known semi-supervised learning algorithm that exploits unlabeled data to improve learning performance. It generally works in a two-view setting, where each input example is naturally described by two disjoint feature sets, under the assumption that each view is sufficient to predict the label. In real-world applications, however, feature corruption or feature noise may leave both views insufficient, and co-training suffers from such insufficient views. In this paper, we propose a novel algorithm named Weighted Co-training to deal with this problem. It identifies the newly labeled examples that are probably harmful to the other view and decreases their weights in the training set to avoid the risk. Experimental results show that Weighted Co-training outperforms state-of-the-art co-training algorithms on several benchmarks.
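To make the idea concrete, below is a minimal Python sketch of a weighted co-training loop in the spirit described above. It is not the paper's exact method: the abstract only states that newly labeled examples which are probably harmful to the other view are down-weighted, so the harmfulness heuristic used here (the receiving view disagrees with the pseudo-label) and the specific weight value `low_w` are illustrative assumptions.

```python
# A minimal sketch of weighted co-training with scikit-learn classifiers.
# This is an illustration, NOT the paper's exact algorithm: the "probably
# harmful" test (the receiving view disagrees with the pseudo-label) and
# the down-weighting value low_w are assumptions made for this sketch.
import numpy as np
from sklearn.naive_bayes import GaussianNB


def teach(teacher, student, U_t, U_s, k, low_w):
    """Teacher labels its k most confident pool examples; examples the
    student currently disagrees with are kept but down-weighted."""
    proba = teacher.predict_proba(U_t)
    pseudo = teacher.classes_[proba.argmax(axis=1)]
    idx = np.argsort(proba.max(axis=1))[-k:]           # most confident
    agree = student.predict(U_s[idx]) == pseudo[idx]
    return idx, pseudo[idx], np.where(agree, 1.0, low_w)


def weighted_co_training(X1, X2, y, U1, U2, rounds=10, k=5, low_w=0.2):
    """X1, X2: the two views of the labeled data; U1, U2: unlabeled pool."""
    h1, h2 = GaussianNB(), GaussianNB()
    y1, y2 = y.copy(), y.copy()
    w1, w2 = np.ones(len(y)), np.ones(len(y))
    L1, L2 = X1.copy(), X2.copy()
    for _ in range(rounds):
        if len(U1) == 0:
            break
        h1.fit(L1, y1, sample_weight=w1)
        h2.fit(L2, y2, sample_weight=w2)
        # View 1 teaches view 2, then view 2 teaches view 1.
        i, lab, wt = teach(h1, h2, U1, U2, k, low_w)
        L2, y2, w2 = np.vstack([L2, U2[i]]), np.r_[y2, lab], np.r_[w2, wt]
        U1, U2 = np.delete(U1, i, axis=0), np.delete(U2, i, axis=0)
        if len(U1) == 0:
            break
        j, lab, wt = teach(h2, h1, U2, U1, k, low_w)
        L1, y1, w1 = np.vstack([L1, U1[j]]), np.r_[y1, lab], np.r_[w1, wt]
        U1, U2 = np.delete(U1, j, axis=0), np.delete(U2, j, axis=0)
    h1.fit(L1, y1, sample_weight=w1)
    h2.fit(L2, y2, sample_weight=w2)
    return h1, h2   # at test time, e.g. average the two predict_proba outputs
```

The key difference from standard co-training is that suspect pseudo-labeled examples are retained with reduced sample weights rather than added at full weight or discarded outright.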



Acknowledgements

This work was supported by the NSFC (61673202, 61305067), the Fundamental Research Funds for the Central Universities, and the Collaborative Innovation Center of Novel Software Technology and Industrialization.

Author information


Corresponding author

Correspondence to Wei Wang.

Additional information

Xiangyu Guo received his BS degree in Electronic Engineering from Xidian University, China in 2014, where he received the National Scholarship in 2011. Currently he is a master's student at the Department of Computer Science and Technology, Nanjing University, China. His research interests include machine learning and data mining.

Wei Wang is an associate professor at the Department of Computer Science and Technology, Nanjing University, China. He received his PhD degree from the Department of Computer Science and Technology, Nanjing University, China in 2012. His research interests mainly include computational learning theory, especially semi-supervised learning and active learning.



About this article


Cite this article

Guo, X., Wang, W. Towards making co-training suffer less from insufficient views. Front. Comput. Sci. 13, 99–105 (2019). https://doi.org/10.1007/s11704-018-7138-5


