Abstract
Feature discretization (FD) techniques often yield adequate and compact representations of the data, suitable for machine learning and pattern recognition problems. Compared with the original features, these representations usually decrease training time and yield higher classification accuracy, while allowing humans to better understand and visualize the data. This paper proposes two new FD techniques. The first is based on the well-known Linde-Buzo-Gray quantization algorithm, coupled with a relevance criterion, and can perform unsupervised, supervised, or semi-supervised discretization. The second works in supervised mode and is based on maximizing the mutual information between each discrete feature and the class label. Our experimental results on standard benchmark datasets show that these techniques scale up to high-dimensional data, in many cases attaining better accuracy than existing unsupervised and supervised FD approaches while using fewer discretization intervals.
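To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' algorithms. It assumes a plain scalar Lloyd iteration as a stand-in for Linde-Buzo-Gray quantization of a single feature (the relevance criterion is omitted), and an empirical plug-in estimate of the mutual information between a discretized feature and the class label. The function names `lloyd_quantize` and `mutual_information`, the quantile-based initialisation, and the toy data are all illustrative assumptions.

```python
import numpy as np


def lloyd_quantize(x, n_levels, n_iter=50):
    """Scalar Lloyd (LBG-style) quantizer; returns (codebook, bin indices).

    Illustrative only: no relevance criterion, no stopping rule on the
    number of bits, unlike the technique described in the abstract.
    """
    x = np.asarray(x, dtype=float).ravel()
    # Assumed initialisation: evenly spaced quantiles of the data.
    codebook = np.quantile(x, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(n_iter):
        # Assign each sample to its nearest codeword.
        idx = np.argmin(np.abs(x[:, None] - codebook[None, :]), axis=1)
        # Move each codeword to the mean of its cell (keep it if the cell is empty).
        new = codebook.copy()
        for k in range(n_levels):
            if np.any(idx == k):
                new[k] = x[idx == k].mean()
        if np.allclose(new, codebook):
            break
        codebook = new
    idx = np.argmin(np.abs(x[:, None] - codebook[None, :]), axis=1)
    return codebook, idx


def mutual_information(d, y):
    """Empirical mutual information I(D; Y), in bits, of two discrete arrays."""
    _, d_idx = np.unique(np.asarray(d).ravel(), return_inverse=True)
    _, y_idx = np.unique(np.asarray(y).ravel(), return_inverse=True)
    # Joint relative frequencies of (discrete feature value, class label).
    joint = np.zeros((d_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (d_idx, y_idx), 1.0)
    joint /= joint.sum()
    p_d = joint.sum(axis=1, keepdims=True)
    p_y = joint.sum(axis=0, keepdims=True)
    nz = joint > 0  # skip empty cells, where p * log(p) is taken as 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (p_d @ p_y)[nz])))


# Toy data (hypothetical): one feature that separates two classes.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
y = np.array([0, 0, 0, 1, 1, 1])
codebook, bins = lloyd_quantize(x, n_levels=2)
mi = mutual_information(bins, y)
```

In a supervised setting, a quantity like `mi` could be used to compare candidate discretizations of the same feature, with larger values indicating that the discrete feature carries more information about the class label.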
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ferreira, A.J., Figueiredo, M.A.T. (2015). Feature Discretization with Relevance and Mutual Information Criteria. In: Fred, A., De Marsico, M. (eds) Pattern Recognition Applications and Methods. Advances in Intelligent Systems and Computing, vol 318. Springer, Cham. https://doi.org/10.1007/978-3-319-12610-4_7
DOI: https://doi.org/10.1007/978-3-319-12610-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12609-8
Online ISBN: 978-3-319-12610-4
eBook Packages: Engineering, Engineering (R0)