Abstract
Feature subset selection for unsupervised learning is an important topic in artificial intelligence because it is key to saving computational resources. In this work we use the typical testors methodology to assign an importance index to each variable. This paper presents the general framework and the way two hybridized meta-heuristics tackle this NP-complete problem. The evolutionary mechanisms are based on the Univariate Marginal Distribution Algorithm (UMDA) and the Genetic Algorithm (GA). Both the GA and the UMDA, an Estimation of Distribution Algorithm (EDA), use a fast operator for finding typical testors on a very large dataset, and both algorithms include a local search mechanism that improves running time and fitness. Experiments show that the EDA is faster than the GA because of its better exploitation performance; nevertheless, the GA's solutions are more consistent.
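To make the UMDA-based mechanism concrete, the following is a minimal, illustrative sketch of a univariate marginal distribution algorithm for binary feature-subset selection. It is not the authors' implementation: the fitness function (which in the paper derives from typical testors and the per-variable importance index), the parameter values, and the helper names are placeholders chosen for the example.

```python
import numpy as np

def umda_feature_selection(fitness, n_features, pop_size=50, n_select=25,
                           generations=100, rng=None):
    """Minimal UMDA sketch for binary feature-subset selection.

    `fitness` is a user-supplied function scoring a 0/1 mask of selected
    features (in the paper this would come from the typical-testor-based
    importance index); higher scores are better.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = np.full(n_features, 0.5)          # univariate marginal model
    best_mask, best_fit = None, -np.inf
    for _ in range(generations):
        # Sample a population of feature masks from the marginal model.
        pop = (rng.random((pop_size, n_features)) < probs).astype(int)
        scores = np.array([fitness(ind) for ind in pop])
        # Keep the best individuals and re-estimate the marginals from them.
        elite = pop[np.argsort(scores)[-n_select:]]
        probs = elite.mean(axis=0).clip(0.05, 0.95)   # retain some exploration
        if scores.max() > best_fit:
            best_fit = scores.max()
            best_mask = pop[scores.argmax()].copy()
    return best_mask, best_fit

# Toy usage: reward masks that select roughly ten of thirty features.
if __name__ == "__main__":
    toy_fitness = lambda mask: -abs(mask.sum() - 10)
    mask, fit = umda_feature_selection(toy_fitness, n_features=30)
    print(mask, fit)
```

A GA variant of the same loop would replace the marginal-model sampling step with selection, crossover, and mutation over the same binary masks; the local search mentioned in the abstract could be applied to the elite masks before the model is re-estimated.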