Abstract
The main objective of this chapter is to compare a strategy of rule induction based on feature selection, exemplified by the LEM1 algorithm, with another strategy, not using feature selection, exemplified by the LEM2 algorithm. The LEM2 algorithm uses all possible attribute-value pairs as the search space. It is shown that, in terms of error rate, LEM2 significantly outperforms LEM1 (5 % significance level, two-tailed test). At the same time, the LEM2 algorithm induces smaller rule sets with a smaller total number of conditions. The time complexity of both algorithms is the same.
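To make the phrase "all possible attribute-value pairs as the search space" concrete, the following minimal sketch enumerates every attribute-value pair in a small decision table, together with the set of cases each pair covers. It is an illustration only, not the chapter's implementation: the toy data set and the names `cases`, `attribute_value_blocks`, `concept`, and `consistent_pairs` are assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical toy decision table (not taken from the chapter).
# Each case maps attribute names to values; "flu" plays the role of the decision.
cases = [
    {"temperature": "high", "headache": "yes", "flu": "yes"},
    {"temperature": "high", "headache": "no", "flu": "yes"},
    {"temperature": "normal", "headache": "yes", "flu": "no"},
    {"temperature": "normal", "headache": "no", "flu": "no"},
]


def attribute_value_blocks(cases, decision):
    """Map each (attribute, value) pair to the set of case indices it covers."""
    blocks = defaultdict(set)
    for index, case in enumerate(cases):
        for attribute, value in case.items():
            if attribute != decision:
                blocks[(attribute, value)].add(index)
    return dict(blocks)


# The search space: one block per attribute-value pair that occurs in the data.
blocks = attribute_value_blocks(cases, decision="flu")

# The concept to describe: all cases with flu = yes.
concept = {i for i, case in enumerate(cases) if case["flu"] == "yes"}

# Pairs whose block lies entirely inside the concept could each serve,
# on their own, as the condition part of a consistent rule.
consistent_pairs = sorted(pair for pair, block in blocks.items() if block <= concept)

for pair, block in sorted(blocks.items()):
    print(pair, sorted(block))
print("single-condition candidates:", consistent_pairs)
```

A strategy based on feature selection would first discard whole attributes and only then build rules, whereas a search over these pairs can pick different attributes for different rules.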
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Grzymała-Busse, J.W. (2015). A Comparison of Rule Induction Using Feature Selection and the LEM2 Algorithm. In: Stańczyk, U., Jain, L. (eds) Feature Selection for Data and Pattern Recognition. Studies in Computational Intelligence, vol 584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45620-0_8
DOI: https://doi.org/10.1007/978-3-662-45620-0_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45619-4
Online ISBN: 978-3-662-45620-0
eBook Packages: Engineering, Engineering (R0)