Mining Numerical Data—A Rough Set Approach

  • Conference paper
Rough Sets and Intelligent Systems Paradigms (RSEISP 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4585)

Abstract

We present an approach to mining numerical data based on rough set theory, using the calculus of attribute-value blocks. An algorithm implementing these ideas, called MLEM2, induces high-quality rules in terms of both simplicity (number of rules and total number of conditions) and accuracy. Additionally, MLEM2 induces rules not only from complete data sets but also from data with missing attribute values, with or without numerical attributes.
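The abstract does not spell out how attribute-value blocks are built, so the following is only a minimal sketch under stated assumptions, not the paper's MLEM2 implementation. It assumes a block is the set of cases that share an attribute-value pair, and that a numerical attribute is handled by placing cutpoints midway between consecutive distinct values, with one block on each side of every cutpoint. The function names, the dictionary-based case format, and the use of `None` for a missing value (simply skipped here) are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def symbolic_blocks(cases, attribute):
    """Map each (attribute, value) pair to the set of case indices with that value."""
    blocks = defaultdict(set)
    for i, case in enumerate(cases):
        value = case.get(attribute)
        if value is not None:  # assumption: None marks a missing value and is skipped
            blocks[(attribute, value)].add(i)
    return dict(blocks)


def numeric_blocks(cases, attribute):
    """Build interval blocks from cutpoints between consecutive distinct values."""
    # Keep only the cases whose value for this attribute is known.
    known = {
        i: case[attribute]
        for i, case in enumerate(cases)
        if case.get(attribute) is not None
    }
    values = sorted(set(known.values()))
    blocks = {}
    if not values:
        return blocks
    lo, hi = values[0], values[-1]
    for a, b in zip(values, values[1:]):
        cut = (a + b) / 2  # assumed cutpoint: midpoint of two neighbouring values
        blocks[(attribute, (lo, cut))] = {i for i, v in known.items() if v < cut}
        blocks[(attribute, (cut, hi))] = {i for i, v in known.items() if v > cut}
    return blocks


# Hypothetical toy data: four cases, one numerical and one symbolic attribute.
cases = [
    {"temp": 36.0, "headache": "no"},
    {"temp": 38.0, "headache": "yes"},
    {"temp": 40.0, "headache": "yes"},
    {"temp": None, "headache": "no"},
]
headache_blocks = symbolic_blocks(cases, "headache")
temp_blocks = numeric_blocks(cases, "temp")
```

In this sketch each interval is keyed by its lower and upper bounds, so a rule condition such as "temp in 37.0..40.0" corresponds directly to one block. How the paper treats missing values is not described in the abstract and is not reproduced here.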

Editor information

Marzena Kryszkiewicz, James F. Peters, Henryk Rybinski, Andrzej Skowron

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grzymala-Busse, J.W. (2007). Mining Numerical Data—A Rough Set Approach. In: Kryszkiewicz, M., Peters, J.F., Rybinski, H., Skowron, A. (eds) Rough Sets and Intelligent Systems Paradigms. RSEISP 2007. Lecture Notes in Computer Science, vol 4585. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73451-2_3

  • DOI: https://doi.org/10.1007/978-3-540-73451-2_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73450-5

  • Online ISBN: 978-3-540-73451-2

  • eBook Packages: Computer Science, Computer Science (R0)
