Abstract
This study presents a novel hybrid intelligent system that combines unsupervised and supervised learning and can easily be adapted for use in either an individual or a collaborative setting. The system divides the classification problem into two stages: first, it partitions the input data space into regions according to the distribution of the data set; then, it trains several simple classifiers, each responsible for correctly classifying the samples contained in one of the previously determined regions. In this way the efficiency of each classifier increases, as it can specialize in classifying only related samples from a certain region of the input data space. This specialization enables the single classifiers to learn more specific patterns or characteristics of the data space, reducing the risk of obtaining a general algorithm that over-fits the data. The proposed hybrid system has been tested on both artificial and real data sets, and a comparative study of its results against those obtained by other common classification methods is also included in the present work.
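The two-stage architecture described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: it substitutes plain k-means with deterministic farthest-first initialization for the paper's topology-preserving clustering, and a local 1-NN rule for the "simple classifiers", purely to show how partitioning the input space lets each classifier specialize on one region. All names (`ClusteredEnsemble`, `kmeans`, the toy data) are hypothetical.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(pts):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(pts)
    return tuple(sum(p[i] for p in pts) / n for i in range(len(pts[0])))

def init_centroids(points, k):
    """Deterministic farthest-first seeding: spread initial centroids apart."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points,
                             key=lambda p: min(dist2(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=20):
    """Stage 1 stand-in: partition the input space into k regions."""
    centroids = init_centroids(points, k)
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            buckets[min(range(k), key=lambda j: dist2(p, centroids[j]))].append(p)
        # Keep the old centroid if a region happens to receive no samples.
        centroids = [mean(b) if b else centroids[j] for j, b in enumerate(buckets)]
    return centroids

class ClusteredEnsemble:
    """Stage 1 partitions the input space; stage 2 attaches one simple
    classifier per region, so each one only ever sees related samples."""

    def __init__(self, k=2):
        self.k = k

    def fit(self, X, y):
        self.centroids = kmeans(X, self.k)
        # Route each labeled training sample to its region's classifier.
        self.regions = [[] for _ in range(self.k)]
        for x, label in zip(X, y):
            self.regions[self._region(x)].append((x, label))
        return self

    def _region(self, x):
        return min(range(self.k), key=lambda j: dist2(x, self.centroids[j]))

    def predict(self, x):
        # Delegate to the specialized classifier of the matching region
        # (here a local 1-NN over that region's training samples only).
        samples = self.regions[self._region(x)]
        return min(samples, key=lambda s: dist2(x, s[0]))[1]

# Toy data: two well-separated groups with distinct labels.
X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
y = ['a', 'a', 'a', 'b', 'b', 'b']
model = ClusteredEnsemble(k=2).fit(X, y)
```

A query is first assigned to a region by nearest centroid and then classified by that region's own classifier, which is the selection scheme (rather than fusion) the abstract describes.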
Cite this article
Baruque, B., Porras, S. & Corchado, E. Hybrid Classification Ensemble Using Topology-preserving Clustering. New Gener. Comput. 29, 329–344 (2011). https://doi.org/10.1007/s00354-011-0306-x