Abstract
In many practical settings, only a few labels are available for the data, and algorithms must exploit the unlabeled examples to learn effectively. This setting is known as semi-supervised learning (SSL). In this article, we propose a methodology suited to both the representation and the prediction of large datasets in that situation. To this end, groups of uncorrelated attributes are formed in order to overcome the problems raised by high-dimensional spaces. An ensemble is then built by training a self-organizing map (SOM) on each group. Besides supporting prediction, these maps provide a relevant representation of the data that can be exploited for semi-supervised learning. The final prediction is obtained by a vote over the maps. Experiments in both supervised and semi-supervised settings show the relevance of this approach.
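To make the pipeline concrete, here is a minimal, illustrative sketch in Python/NumPy of the three steps the abstract describes: split the attributes into groups, train one small SOM per group, and combine the maps by a plurality vote. All names (`group_features`, `TinySOM`, `ensemble_predict`) are hypothetical; the random round-robin grouping is only a stand-in for the paper's correlation-based attribute grouping, and the SOM update is the textbook online rule rather than the authors' implementation. Unlabeled points are assumed to carry the label -1.

```python
import numpy as np

def group_features(X, n_groups, seed=0):
    """Partition feature indices into n_groups groups.

    Placeholder heuristic: features are shuffled and dealt round-robin.
    (A stand-in for the paper's grouping of uncorrelated attributes.)
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[1])
    return [idx[g::n_groups] for g in range(n_groups)]

class TinySOM:
    """Minimal online self-organizing map, labelled for classification."""
    def __init__(self, grid=(5, 5), n_iter=1000, seed=0):
        self.grid, self.n_iter = grid, n_iter
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        h, w = self.grid
        n_units, d = h * w, X.shape[1]
        self.W = self.rng.normal(size=(n_units, d))
        # 2-D coordinates of each unit, used by the neighbourhood kernel
        coords = np.array([(i, j) for i in range(h) for j in range(w)], float)
        for t in range(self.n_iter):
            x = X[self.rng.integers(len(X))]
            bmu = np.argmin(((self.W - x) ** 2).sum(axis=1))   # best-matching unit
            lr = 0.5 * (1 - t / self.n_iter)                   # decaying learning rate
            sigma = max(1.0, (h / 2) * (1 - t / self.n_iter))  # shrinking neighbourhood
            dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            nbh = np.exp(-dist2 / (2 * sigma ** 2))
            self.W += lr * nbh[:, None] * (x - self.W)
        # label each unit by the majority class of the labeled points it wins
        wins = np.argmin(((X[:, None, :] - self.W[None]) ** 2).sum(-1), axis=1)
        self.labels_ = np.full(n_units, -1)
        for u in range(n_units):
            ys = y[(wins == u) & (y >= 0)]   # y == -1 marks unlabeled points
            if len(ys):
                self.labels_[u] = np.bincount(ys).argmax()

    def predict(self, X):
        bmu = np.argmin(((X[:, None, :] - self.W[None]) ** 2).sum(-1), axis=1)
        return self.labels_[bmu]

def ensemble_predict(X_train, y_train, X_test, n_groups=4):
    """Train one SOM per attribute group and combine them by plurality vote."""
    votes = []
    for g in group_features(X_train, n_groups):
        som = TinySOM()
        som.fit(X_train[:, g], y_train)
        votes.append(som.predict(X_test[:, g]))
    votes = np.array(votes)
    # plurality vote over the maps, ignoring units that received no label
    return np.array([np.bincount(col[col >= 0]).argmax() if (col >= 0).any() else -1
                     for col in votes.T])
```

Each map is labelled by the majority class of the labeled points falling on each of its units, which is what lets the ensemble make predictions even when most of the training set is unlabeled.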
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Prudhomme, E., Lallich, S. (2008). Maps Ensemble for Semi-Supervised Learning of Large High Dimensional Datasets. In: An, A., Matwin, S., Raś, Z.W., Ślęzak, D. (eds) Foundations of Intelligent Systems. ISMIS 2008. Lecture Notes in Computer Science, vol. 4994. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68123-6_11
DOI: https://doi.org/10.1007/978-3-540-68123-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68122-9
Online ISBN: 978-3-540-68123-6