Incremental Feature Selection

Abstract

Feature selection is the problem of finding relevant features. When a dataset has many features and a huge number of patterns, an effective feature selection method can help reduce dimensionality. An incremental probabilistic algorithm is designed and implemented as an alternative to exhaustive and heuristic approaches. Theoretical analysis supports the probabilistic algorithm's ability to find an optimal or near-optimal subset of features. Experimental results suggest that (1) the probabilistic algorithm is effective in obtaining optimal/suboptimal feature subsets, and (2) its incremental version further expedites feature selection when the number of patterns is large and scales up without sacrificing the quality of the selected features.
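
The abstract only outlines the approach; the full algorithm appears in the body of the paper. As a rough illustration, here is a minimal Python sketch assuming the probabilistic search behaves like Liu and Setiono's earlier Las Vegas filter: random feature subsets are sampled, each is scored by its inconsistency rate on the data, and the smallest subset whose rate stays within a threshold is kept. The incremental variant runs the same search on a growing sample of patterns. All names below (inconsistency_rate, probabilistic_select, incremental_select, gamma) are illustrative, not taken from the paper.

```python
import random
from collections import Counter, defaultdict

def inconsistency_rate(patterns, labels, subset):
    """Fraction of patterns whose class labels disagree although they agree
    on every feature in `subset` (the assumed filter criterion)."""
    groups = defaultdict(Counter)
    for row, y in zip(patterns, labels):
        groups[tuple(row[i] for i in subset)][y] += 1
    # Each group of matching patterns contributes its size minus the
    # count of its majority class.
    bad = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return bad / len(patterns)

def probabilistic_select(patterns, labels, n_features,
                         max_tries=1000, gamma=0.0, seed=0):
    """Las Vegas style search: sample random subsets and keep the smallest
    one whose inconsistency rate stays within `gamma`."""
    rng = random.Random(seed)
    best = list(range(n_features))        # start from the full feature set
    for _ in range(max_tries):
        size = rng.randint(1, len(best))  # never sample larger than best
        subset = rng.sample(range(n_features), size)
        if size < len(best) and \
                inconsistency_rate(patterns, labels, subset) <= gamma:
            best = sorted(subset)
    return best

def incremental_select(patterns, labels, n_features,
                       portion=0.1, gamma=0.0, seed=0):
    """Incremental variant: search on a small random sample first, accept
    the result only if it is also consistent enough on the full data,
    otherwise enlarge the sample and retry."""
    rng = random.Random(seed)
    order = list(range(len(patterns)))
    rng.shuffle(order)
    n = max(1, int(portion * len(order)))
    while True:
        sample = order[:n]
        subset = probabilistic_select([patterns[i] for i in sample],
                                      [labels[i] for i in sample],
                                      n_features, seed=rng.randrange(2**30))
        if n == len(order) or \
                inconsistency_rate(patterns, labels, subset) <= gamma:
            return subset
        n = min(len(order), 2 * n)        # grow the working sample
```

In this sketch the incremental version gains its speed from the fact that most iterations of the random search touch only the small sample; the full pattern set is consulted once per round to verify the candidate subset.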





Cite this article

Liu, H., Setiono, R. Incremental Feature Selection. Applied Intelligence 9, 217–230 (1998). https://doi.org/10.1023/A:1008363719778
