Abstract
Practical machine learning algorithms are known to degrade in performance when faced with many features that are not necessary for rule discovery. To cope with this problem, many methods have been proposed for selecting a subset of features that merits focused analysis. Two typical ones are the filter approach, which selects a feature subset in a preprocessing step, and the wrapper approach, which searches the space of possible feature subsets using the induction algorithm itself as part of the evaluation function. Although the filter approach is faster, it is somewhat blind: the performance of induction is not considered. The wrapper approach, on the other hand, can find an optimal feature subset, but its time and space complexity makes it hard to use. In this paper, we propose an algorithm that uses the rough set methodology with greedy heuristics for feature selection. Our approach selects features in a manner similar to the filter approach, but the performance of induction is taken into account in the evaluation criterion; that is, we select the features that damage the performance of induction as little as possible.
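The paper's algorithm itself is not reproduced on this page, but the abstract's idea of greedy, rough-set-guided selection can be sketched. The snippet below is a minimal illustration in the spirit of QuickReduct-style methods, not the authors' exact method: it greedily adds the attribute that most increases the rough-set dependency degree (the fraction of objects whose equivalence class under the chosen attributes is pure with respect to the decision label) until the selected subset discerns as well as the full attribute set. All function names here are illustrative.

```python
from collections import defaultdict

def dependency(rows, labels, attrs):
    """Rough-set dependency degree gamma(attrs -> label): the fraction of
    rows lying in the positive region, i.e. whose equivalence class under
    the chosen attributes contains a single decision label."""
    classes = defaultdict(list)
    for row, lab in zip(rows, labels):
        classes[tuple(row[a] for a in attrs)].append(lab)
    pos = sum(len(labs) for labs in classes.values() if len(set(labs)) == 1)
    return pos / len(rows)

def greedy_reduct(rows, labels):
    """Greedily add the attribute that most increases the dependency
    degree until it matches that of the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    selected = []
    while dependency(rows, labels, selected) < target:
        best = max((a for a in all_attrs if a not in selected),
                   key=lambda a: dependency(rows, labels, selected + [a]))
        selected.append(best)
    return selected
```

For example, with rows `[[0,0],[0,1],[1,0],[1,1]]` and labels `[0,0,1,1]`, the first attribute alone already separates the two decision classes, so the sketch returns `[0]`. This mirrors the filter-like character noted in the abstract: selection is driven by a data-side criterion rather than by repeatedly running the induction algorithm.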
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dong, J., Zhong, N., Ohsuga, S. (1999). Using Rough Sets with Heuristics for Feature Selection. In: Zhong, N., Skowron, A., Ohsuga, S. (eds) New Directions in Rough Sets, Data Mining, and Granular-Soft Computing. RSFDGrC 1999. Lecture Notes in Computer Science, vol 1711. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-48061-7_22
DOI: https://doi.org/10.1007/978-3-540-48061-7_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66645-5
Online ISBN: 978-3-540-48061-7