An Empirical Comparison of Discretization Methods for Neural Classifier

Augasta, M. Gethsiyal; Kathirvalavakumar, Thangairulappan

doi:10.1007/978-3-319-03844-5_5

M. Gethsiyal Augasta and Thangairulappan Kathirvalavakumar

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8284)

Abstract

Discretization leads to improved classification accuracy and generalizes the problem well for further knowledge extraction. As a result, researchers have developed various discretization methods for preprocessing data. This paper surveys existing discretization methods that preprocess data for data mining. It also evaluates the effectiveness of these methods, in terms of both the quality of the discretization scheme and the resulting classification accuracy, by comparing the performance of some traditional and recent discretization algorithms on six real datasets: Iris, Ionosphere, Waveform-5000, Wisconsin Breast Cancer, Hepatitis Domain, and Pima Indian Diabetes. A feedforward neural network trained with the conjugate gradient algorithm is used to compute the classification accuracy on the data discretized by each algorithm.
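
To make the preprocessing step concrete, here is a minimal sketch of discretizing one continuous attribute with equal-width binning. Equal-width binning is chosen only because it is the simplest scheme to show; the abstract does not say which algorithms are compared, so this should not be read as one of the paper's methods. The function names and the sample values are hypothetical.

```python
from bisect import bisect_right


def equal_width_cut_points(values, n_bins):
    """Return the n_bins - 1 interior cut points that split [min, max] evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def discretize(values, cut_points):
    """Map each continuous value to the index of the interval it falls in."""
    # A value equal to a cut point is assigned to the interval on its right.
    return [bisect_right(cut_points, v) for v in values]


# Hypothetical values of a single continuous attribute (not taken from the paper).
attribute = [4.0, 5.0, 5.5, 6.0, 7.0, 8.0]

cuts = equal_width_cut_points(attribute, n_bins=4)
codes = discretize(attribute, cuts)

print(cuts)   # interior cut points
print(codes)  # interval index for each value
```

In an evaluation like the one described above, the interval indices would replace the raw attribute values before the classifier is trained; the neural network itself is not sketched here.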

Author information

Authors and Affiliations

Department of Computer Applications, Sarah Tucker College, Tirunelveli, 627007, TN, India
M. Gethsiyal Augasta
Department of Computer Science, V.H.N.S.N College, VirudhuNagar, 626001, TN, India
Thangairulappan Kathirvalavakumar

Editor information

Editors and Affiliations

Department of Business Information Systems, FSIC, National University of Ireland, University College Cork, O’Rahilly Buildings, Cork, Ireland
Rajendra Prasath
Research Centre in Computer Science, V.H.N.Senthikumara Nadar College (Autonomous), 626 001, Virudhunagar, Tamil Nadu, India
T. Kathirvalavakumar

About this paper

Cite this paper

Augasta, M.G., Kathirvalavakumar, T. (2013). An Empirical Comparison of Discretization Methods for Neural Classifier. In: Prasath, R., Kathirvalavakumar, T. (eds) Mining Intelligence and Knowledge Exploration. Lecture Notes in Computer Science, vol 8284. Springer, Cham. https://doi.org/10.1007/978-3-319-03844-5_5

DOI: https://doi.org/10.1007/978-3-319-03844-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03843-8
Online ISBN: 978-3-319-03844-5
eBook Packages: Computer Science; Computer Science (R0)
