Abstract
Discretization is an important preprocessing step in data mining. This paper proposes a novel discretization approach that takes the density distribution of each attribute into account. Its time complexity is O(m*n*log n), the same as that of EW and PKID, so it can scale to large datasets. We run experiments on datasets from the UCI repository and compare the results with those of several current discretization methods; the experimental results demonstrate that the method is effective and practical.
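For orientation, the two baselines named in the abstract can be sketched in a few lines. This is not the local density method of the paper, which the abstract does not describe in enough detail to reproduce. It is only a minimal illustration of equal-width discretization (EW) and proportional k-interval discretization (PKID), assuming that n is the number of instances and m the number of attributes, that EW uses a fixed bin count k (10 here, an arbitrary default), and that PKID uses about sqrt(n) equal-frequency bins. The function names are hypothetical.

```python
import math
from bisect import bisect_right


def equal_width_cuts(values, k=10):
    """EW sketch: k - 1 cut points that split [min, max] into k equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return []  # a constant attribute needs no cut points
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def pkid_cuts(values):
    """PKID sketch: equal-frequency cuts with about sqrt(n) bins of about sqrt(n) values."""
    ordered = sorted(values)          # O(n log n) for one attribute
    n = len(ordered)
    t = max(1, math.isqrt(n))         # number of bins (assumed floor(sqrt(n)))
    cuts = []
    for i in range(1, t):
        idx = i * n // t
        # Place the cut halfway between the two neighbouring sorted values.
        cut = (ordered[idx - 1] + ordered[idx]) / 2
        if not cuts or cut > cuts[-1]:  # skip duplicate cuts caused by ties
            cuts.append(cut)
    return cuts


def discretize(values, cuts):
    """Map each value to the index of the bin it falls into."""
    return [bisect_right(cuts, v) for v in values]


if __name__ == "__main__":
    data = [float(v) for v in range(1, 17)]
    print(pkid_cuts(data))
    print(discretize(data, pkid_cuts(data)))
```

Under these assumptions, a sort-based method such as PKID spends O(n log n) per attribute on sorting, so processing m attributes costs O(m*n*log n), which matches the bound quoted in the abstract.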
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiang, S., Yu, W. (2009). A Local Density Approach for Unsupervised Feature Discretization. In: Huang, R., Yang, Q., Pei, J., Gama, J., Meng, X., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2009. Lecture Notes in Computer Science(), vol 5678. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03348-3_51
DOI: https://doi.org/10.1007/978-3-642-03348-3_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03347-6
Online ISBN: 978-3-642-03348-3
eBook Packages: Computer Science, Computer Science (R0)