Core-Generating Discretization for Rough Set Feature Selection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (TRS, volume 6499)

Abstract

Rough set feature selection (RSFS) can be used to improve classifier performance. RSFS removes redundant attributes whilst keeping the important ones that preserve the classification power of the original dataset. The feature subsets selected by RSFS are called reducts, and the intersection of all reducts is called the core. However, RSFS handles discrete attributes only, so datasets with real-valued attributes must be discretized before RSFS is applied. The choice of discretization determines the core of the resulting discrete dataset, and the core may in turn critically affect the classification performance of the reducts. This paper defines core-generating discretization, a type of discretization method; analyzes the properties of core-generating discretization; models core-generating discretization using constraint satisfaction; defines core-generating approximate minimum entropy (C-GAME) discretization; models C-GAME using constraint satisfaction; and evaluates the performance of C-GAME as a pre-processor of RSFS using ten datasets from the UCI Machine Learning Repository.
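The relationship between discretization, reducts and core described in the abstract can be illustrated with a minimal sketch. It is not the paper's C-GAME method or its constraint satisfaction model: it is a brute-force enumeration on a toy table, assuming a consistent decision table and taking a reduct to be a minimal attribute subset under which objects with equal attribute values always share a class. The names `discretize`, `is_consistent`, `reducts` and `core`, the data and the cut points are all hypothetical.

```python
from bisect import bisect_right
from itertools import combinations

def discretize(rows, cuts):
    """Replace each real value by the index of the interval that the
    attribute's sorted cut points place it in (illustrative helper)."""
    return [tuple(bisect_right(cuts[j], v) for j, v in enumerate(r)) for r in rows]

def is_consistent(rows, labels, attrs):
    """True if objects that agree on `attrs` always have the same label."""
    seen = {}
    for r, y in zip(rows, labels):
        key = tuple(r[a] for a in attrs)
        if seen.setdefault(key, y) != y:
            return False
    return True

def reducts(rows, labels):
    """Enumerate all minimal consistent attribute subsets by brute force.
    Exponential in the number of attributes; only suitable for toy tables."""
    n = len(rows[0])
    found = []
    for k in range(n + 1):  # smaller subsets first, so supersets can be skipped
        for attrs in combinations(range(n), k):
            if any(set(r) <= set(attrs) for r in found):
                continue  # contains an already found reduct, hence not minimal
            if is_consistent(rows, labels, attrs):
                found.append(attrs)
    return found

def core(rows, labels):
    """Intersection of all reducts (empty set if there are no reducts)."""
    rs = reducts(rows, labels)
    return set.intersection(*(set(r) for r in rs)) if rs else set()

# Toy data (made up): four objects, two real attributes, binary class.
ROWS = [(1.0, 1.0), (2.0, 3.0), (3.0, 2.0), (4.0, 4.0)]
LABELS = [0, 0, 1, 1]

# Three hypothetical sets of cut points, one list per attribute.
CUTS_A = [[2.5], [2.5]]
CUTS_B = [[1.5, 3.5], [2.5]]
CUTS_C = [[2.5], [1.5, 2.5, 3.5]]

for cuts in (CUTS_A, CUTS_B, CUTS_C):
    table = discretize(ROWS, cuts)
    print(reducts(table, LABELS), sorted(core(table, LABELS)))
# [(0,)] [0]
# [(0, 1)] [0, 1]
# [(0,), (1,)] []
```

The same real-valued data yields a core of one attribute, a core of both attributes, or an empty core, depending only on the cut points chosen, which is the dependence of the core on discretization that the abstract refers to.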

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tian, D., Zeng, Xj., Keane, J. (2011). Core-Generating Discretization for Rough Set Feature Selection. In: Peters, J.F., Skowron, A., Chan, CC., Grzymala-Busse, J.W., Ziarko, W.P. (eds) Transactions on Rough Sets XIII. Lecture Notes in Computer Science, vol 6499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18302-7_9

  • DOI: https://doi.org/10.1007/978-3-642-18302-7_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-18301-0

  • Online ISBN: 978-3-642-18302-7

  • eBook Packages: Computer Science, Computer Science (R0)
