
Feature ranking and best feature subset using mutual information

  • Original Article
  • Published in Neural Computing & Applications 13, 175–184 (2004)

Abstract

A new algorithm for ranking input features and obtaining the best feature subset is developed and illustrated in this paper. The feature selection algorithm is built on the asymptotic formula for mutual information and the expectation maximisation (EM) algorithm. We consider not only the dependence between the features and the class, but also the dependence among the features themselves. The algorithm also performs well on noisy data. An empirical study compares the proposed algorithm with existing algorithms, and the proposed algorithm is illustrated by application to a variety of problems.
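The abstract does not give the algorithm's details, but the general scheme it describes — scoring each feature by its mutual information with the class while penalising dependence among the features — can be sketched as follows. This is a minimal illustration in the spirit of MIFS-style greedy selection (Battiti, 1994), not the authors' asymptotic-formula/EM method; the histogram MI estimator, the bin count, and the redundancy weight beta are illustrative assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Estimate I(X; Y) in nats from a 2-D histogram of the samples."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal p(x), shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal p(y), shape (1, bins)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def rank_features(X, y, beta=0.5, bins=10):
    """Greedy ranking: relevance I(x_j; class) minus beta * redundancy
    with the features already selected (MIFS-style, an assumption here)."""
    relevance = [mutual_information(X[:, j], y, bins) for j in range(X.shape[1])]
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        scores = {j: relevance[j]
                     - beta * sum(mutual_information(X[:, j], X[:, k], bins)
                                  for k in selected)
                  for j in remaining}
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy check: feature 0 drives the class, feature 1 nearly duplicates it,
# and feature 2 is pure noise. The near-duplicate is typically demoted
# because its redundancy with feature 0 offsets its relevance.
rng = np.random.default_rng(0)
x0 = rng.normal(size=500)
X = np.column_stack([x0, x0 + 0.1 * rng.normal(size=500),
                     rng.normal(size=500)])
y = (x0 > 0).astype(float)
print(rank_features(X, y))
```

The redundancy penalty is what distinguishes this family of methods from ranking by relevance alone: a feature that merely duplicates an already-selected feature scores poorly even if its own mutual information with the class is high.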




Acknowledgements

We wish to thank Julia Sonander and Harri Howells of National Air-Traffic Services for the STCA data, and the Engineering and Physical Science Research Council of the UK for supporting this work (grant no. GR/M75143).

Author information

Corresponding author

Correspondence to Shuang Cang.


About this article

Cite this article

Cang, S., Partridge, D. Feature ranking and best feature subset using mutual information. Neural Comput & Applic 13, 175–184 (2004). https://doi.org/10.1007/s00521-004-0400-9


