Abstract
Feature discretization (FD) is a necessary pre-processing step for many machine learning tasks. Its use often yields compact and robust data representations, leading to more accurate classifiers and lower training times. In this paper, we propose an incremental supervised FD technique based on recursive bit allocation. The proposed algorithm starts with a pool of bits and, while bits remain in the pool, allocates the next bit to the most promising feature, i.e., the one which, after discretization, has the highest mutual information with the class label. Since one or more features may receive no bits at all, this FD procedure has a built-in feature selection effect. The experimental evaluation on public-domain benchmark datasets shows that the proposed method achieves results similar to or better than those of other state-of-the-art supervised FD techniques, in terms of both classification accuracy and number of discretization intervals.
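The greedy allocation loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a simple uniform quantizer as a stand-in for the paper's per-bit discretizer, uses an empirical plug-in estimate of mutual information, and assumes each feature takes at least two distinct values. All function names (`discretize`, `mutual_information`, `allocate_bits`) are illustrative.

```python
import numpy as np

def discretize(x, n_bits):
    # Uniform quantization of feature x into 2**n_bits intervals.
    # (A stand-in quantizer; the paper's discretizer may differ.)
    n_levels = 2 ** n_bits
    edges = np.linspace(x.min(), x.max(), n_levels + 1)
    # Assumes x is not constant, so the bin edges are strictly increasing.
    return np.clip(np.digitize(x, edges[1:-1]), 0, n_levels - 1)

def mutual_information(q, y):
    # Empirical (plug-in) mutual information I(q; y) in bits,
    # for integer-coded q and integer class labels y.
    joint = np.zeros((q.max() + 1, y.max() + 1))
    for qi, yi in zip(q, y):
        joint[qi, yi] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of q
    py = joint.sum(axis=0, keepdims=True)   # marginal of y
    mask = joint > 0
    return float((joint[mask] * np.log2(joint[mask] / (px @ py)[mask])).sum())

def allocate_bits(X, y, bit_pool):
    # Incremental bit allocation: each bit from the pool goes to the
    # feature whose (re-)discretized version has the largest mutual
    # information with the class label y.
    n_features = X.shape[1]
    bits = np.zeros(n_features, dtype=int)
    for _ in range(bit_pool):
        gains = [mutual_information(discretize(X[:, j], bits[j] + 1), y)
                 for j in range(n_features)]
        bits[int(np.argmax(gains))] += 1
    # Features left with 0 bits are effectively discarded, which is the
    # built-in feature selection effect mentioned in the abstract.
    return bits
```

On a toy two-feature problem where only the first feature separates the classes, all bits in the pool end up assigned to that feature, leaving the uninformative feature with zero bits.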
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Ferreira, A., Figueiredo, M. (2013). An Incremental Bit Allocation Strategy for Supervised Feature Discretization. In: Sanches, J.M., Micó, L., Cardoso, J.S. (eds) Pattern Recognition and Image Analysis. IbPRIA 2013. Lecture Notes in Computer Science, vol 7887. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38628-2_62
Print ISBN: 978-3-642-38627-5
Online ISBN: 978-3-642-38628-2