Abstract
In many real-world applications, only a small amount of labeled data is available while unlabeled data is abundant, so it is important to make use of the unlabeled data. Co-training is a popular semi-supervised learning technique that exploits a small set of labeled data together with a larger pool of unlabeled data to build more accurate classification models. A key requirement for successful co-training is splitting the features into two or more views. In this paper we propose new splitting criteria based on the confidence of the views and on the diversity of the views, and compare them to random and natural splits. We also examine a previously proposed artificial split that maximizes the independence between the views, and propose a mixed criterion that combines the confidence and the independence of the views. Genetic algorithms are used to search for splits that optimize the independence of the views given the class, the confidence of the views in their predictions, and the diversity of the views. We demonstrate that the proposed splitting criteria improve the performance of co-training.
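The abstract does not give the exact fitness functions, so the sketch below is only a minimal illustration, under stated assumptions, of how a genetic algorithm can search for a two-view feature split. It assumes a binary-mask encoding (one bit per feature), a confidence-style fitness that averages the top-class probability of a GaussianNB learner trained on each view of the labeled data, binary tournament selection, one-point crossover, and bit-flip mutation; the function names, hyperparameters, and base learner are all hypothetical, not the paper's formulation.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB  # assumes scikit-learn is installed

rng = np.random.default_rng(0)

def confidence_fitness(mask, X, y):
    # Illustrative stand-in for a confidence criterion: average top-class
    # probability of a classifier trained (and evaluated) on each view of
    # the labeled set. Degenerate splits with an empty view score zero.
    if mask.all() or not mask.any():
        return 0.0
    score = 0.0
    for view in (X[:, mask], X[:, ~mask]):
        clf = GaussianNB().fit(view, y)
        score += clf.predict_proba(view).max(axis=1).mean()
    return score / 2.0

def ga_feature_split(X, y, pop_size=30, generations=40, p_mut=0.05):
    # Each individual is a binary mask: True -> view 1, False -> view 2.
    # Assumes X has at least two features.
    n_features = X.shape[1]
    pop = rng.integers(0, 2, (pop_size, n_features)).astype(bool)
    for _ in range(generations):
        fitness = np.array([confidence_fitness(m, X, y) for m in pop])
        # Binary tournament selection.
        a, b = rng.integers(0, pop_size, (2, pop_size))
        winners = np.where(fitness[a] > fitness[b], a, b)
        parents = pop[winners]
        # One-point crossover on consecutive parent pairs.
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):
            cut = rng.integers(1, n_features)
            children[i, cut:], children[i + 1, cut:] = (
                parents[i + 1, cut:].copy(), parents[i, cut:].copy())
        # Bit-flip mutation, then elitism: the best selected parent
        # survives unchanged into the next generation.
        flip = rng.random(children.shape) < p_mut
        pop = children ^ flip
        pop[0] = parents[np.argmax(fitness[winners])]
    fitness = np.array([confidence_fitness(m, X, y) for m in pop])
    return pop[np.argmax(fitness)]
```

The returned mask partitions the feature set: `X[:, mask]` and `X[:, ~mask]` would become the two views handed to a co-training loop. Criteria such as view independence or diversity would slot in by replacing or combining terms in the fitness function.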
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Salaheldin, A., El Gayar, N. (2010). New Feature Splitting Criteria for Co-training Using Genetic Algorithm Optimization. In: El Gayar, N., Kittler, J., Roli, F. (eds.) Multiple Classifier Systems. MCS 2010. Lecture Notes in Computer Science, vol. 5997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12127-2_3
Print ISBN: 978-3-642-12126-5
Online ISBN: 978-3-642-12127-2