
An Empirical Comparison of Discretization Methods for Neural Classifier

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8284)

Abstract

Discretization leads to improved classification accuracy and generalizes the problem well for further knowledge extraction. As a result, researchers have developed various discretization methods for preprocessing data. This paper surveys existing discretization methods that preprocess data for data mining. It also evaluates the effectiveness of these methods, in terms of both the quality of the discretization scheme and the resulting classification accuracy, by comparing the performance of several traditional and recent discretization algorithms on six real datasets: Iris, Ionosphere, Waveform-5000, Wisconsin Breast Cancer, Hepatitis Domain, and Pima Indian Diabetes. A feedforward neural network trained with the conjugate gradient algorithm is used to compute the classification accuracy on the data discretized by each algorithm.
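As a rough illustration of what discretizing a continuous attribute involves, the sketch below implements equal-width binning, a simple unsupervised baseline. It is not claimed to be one of the algorithms compared in the paper, and the function names `equal_width_cuts` and `discretize` are hypothetical, chosen only for this example.

```python
from bisect import bisect_right


def equal_width_cuts(values, k):
    """Return k-1 cut points that split [min, max] into k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def discretize(values, cuts):
    """Map each value to the index of the interval it falls into.

    A value equal to a cut point is assigned to the interval on its right.
    """
    return [bisect_right(cuts, v) for v in values]


# Illustrative data, not taken from any of the datasets named in the abstract.
attribute = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
cuts = equal_width_cuts(attribute, 4)
codes = discretize(attribute, cuts)
```

The resulting interval indices could then be fed to a classifier in place of the raw values. Supervised methods would instead choose the cut points using the class labels, which this sketch does not attempt.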


Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Augasta, M.G., Kathirvalavakumar, T. (2013). An Empirical Comparison of Discretization Methods for Neural Classifier. In: Prasath, R., Kathirvalavakumar, T. (eds) Mining Intelligence and Knowledge Exploration. Lecture Notes in Computer Science, vol 8284. Springer, Cham. https://doi.org/10.1007/978-3-319-03844-5_5

  • DOI: https://doi.org/10.1007/978-3-319-03844-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03843-8

  • Online ISBN: 978-3-319-03844-5

  • eBook Packages: Computer Science, Computer Science (R0)
