Abstract
In pattern classification, input features usually contribute unequally to a given classification task, in accordance with their relevance. In a previous paper, we introduced the Energy Supervised Relevance Neural Gas (ESRNG) classifier, a kernel method that maximizes Onicescu's informational energy to compute the relevances of the input features. These relevances were used to improve classification accuracy. In the present work, we focus on the feature-ranking capability of this approach and compare our algorithm to standard feature-ranking methods.
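For concreteness, Onicescu's informational energy of a discrete distribution p is E(p) = Σᵢ pᵢ², ranging from 1/n for a uniform distribution over n outcomes to 1 for a degenerate one. The NumPy sketch below is only an illustration of this quantity, under our own assumptions: the function names, the histogram discretization, and the unsupervised marginal ranking are hypothetical choices for the example, not the ESRNG algorithm itself, which derives relevances from a supervised, kernel-based energy measure involving the class labels.

```python
import numpy as np

def informational_energy(p):
    """Onicescu's informational energy of a discrete distribution: E(p) = sum_i p_i**2.

    E ranges from 1/n (uniform over n outcomes) up to 1 (degenerate distribution).
    """
    p = np.asarray(p, dtype=float)
    return float(np.sum(p ** 2))

def feature_energy(x, bins=10):
    """Estimate E for a continuous feature by histogram discretization
    (an illustrative choice; ESRNG works with kernel-based estimates)."""
    counts, _ = np.histogram(x, bins=bins)
    probs = counts / counts.sum()
    return informational_energy(probs)

def rank_features(X, bins=10):
    """Toy ranking: order columns of X by estimated marginal energy,
    most concentrated first. A stand-in only -- ESRNG's relevances come
    from a supervised dependence between each feature and the class."""
    energies = np.array([feature_energy(X[:, j], bins) for j in range(X.shape[1])])
    return np.argsort(-energies)
```

For example, `informational_energy([0.5, 0.5])` gives 0.5, while a point-mass distribution gives 1.0, so higher energy means a more concentrated (less uncertain) distribution.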
Abbreviations
- ESRNG: Energy Supervised Relevance Neural Gas
Caţaron, A., Andonie, R. Energy Supervised Relevance Neural Gas for Feature Ranking. Neural Process Lett 32, 59–73 (2010). https://doi.org/10.1007/s11063-010-9143-z