Abstract
This study investigates whether it is possible to detect which information in the training data and background knowledge is relevant to solving the learning problem, and whether irrelevant information can be eliminated in preprocessing, before the learning process starts. A case study of data preprocessing for a hybrid genetic algorithm shows that eliminating irrelevant features can substantially improve the efficiency of learning. In addition, cost-sensitive feature elimination can effectively reduce the cost of the induced hypotheses.
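The abstract does not spell out the elimination criterion, but the coverage-based notion of relevance used in this line of work can be sketched as follows: a boolean feature is irrelevant if some other feature distinguishes at least the same positive/negative example pairs at no greater cost. The sketch below is a minimal illustration of that idea, not the paper's exact algorithm; all names and data are hypothetical.

```python
from itertools import product

def covered_pairs(values, pos, neg):
    """Pairs (p, n) that a boolean feature distinguishes:
    true on the positive example, false on the negative one."""
    return {(p, n) for p, n in product(pos, neg)
            if values[p] and not values[n]}

def eliminate_irrelevant(features, pos, neg, cost):
    """Keep only features not dominated by another feature with
    equal-or-larger coverage at equal-or-lower cost (ties go to
    the feature listed first)."""
    names = list(features)
    cov = {f: covered_pairs(features[f], pos, neg) for f in names}
    kept = []
    for i, f in enumerate(names):
        dominated = any(
            cov[f] <= cov[g] and cost[g] <= cost[f]
            and (cov[f] < cov[g] or cost[g] < cost[f] or j < i)
            for j, g in enumerate(names) if g != f)
        if not dominated:
            kept.append(f)
    return kept

# Hypothetical toy data: two positive examples, one negative.
features = {
    'f1': {'p1': True, 'p2': True, 'n1': False},   # cost 1, covers both pairs
    'f2': {'p1': True, 'p2': False, 'n1': False},  # cost 2, strictly smaller coverage
    'f3': {'p1': True, 'p2': True, 'n1': False},   # cost 3, same coverage as f1
}
cost = {'f1': 1, 'f2': 2, 'f3': 3}
print(eliminate_irrelevant(features, ['p1', 'p2'], ['n1'], cost))  # -> ['f1']
```

Here `f2` is eliminated because `f1` covers a strict superset of its pairs at lower cost, and `f3` because `f1` achieves the same coverage more cheaply.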
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
Cite this paper
Lavrač, N., Gamberger, D., Turney, P. (1996). Cost-sensitive feature reduction applied to a hybrid genetic algorithm. In: Arikawa, S., Sharma, A.K. (eds) Algorithmic Learning Theory. ALT 1996. Lecture Notes in Computer Science, vol 1160. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61863-5_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61863-8
Online ISBN: 978-3-540-70719-6