Abstract
In pattern recognition, the mutual information (MI) between feature vectors and class labels is a suitable criterion for feature selection. Estimating MI in high-dimensional feature spaces, however, is problematic in terms of both computational load and accuracy. We propose an independent component analysis based MI estimation (ICA-MI) methodology for feature selection, which reduces the high-dimensional MI estimation problem to multiple one-dimensional MI estimation problems. The nonlinear ICA transformation is achieved by piecewise local linear approximation on partitions of the feature space, which exploits the additivity property of entropy and the simplicity of linear ICA algorithms. The number of partitions controls the tradeoff between a more accurate approximation of the nonlinear data topology and small-sample statistical variation in the estimates. We test the ICA-MI feature selection framework on synthetic, UCI repository, and EEG activity classification problems. The experiments demonstrate, as expected, that the choice of the number of partitions for local linear ICA is highly problem dependent and must be made through cross validation. When this is done properly, the proposed ICA-MI framework yields feature rankings comparable to those of the optimal probability-of-error based feature ranking and selection strategy, at a much lower computational load.
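To make the core idea concrete, below is a minimal numpy sketch of the per-dimension MI ranking step described in the abstract: after the (assumed, already applied) ICA rotation maps features to approximately independent components, each component's MI with the class label is estimated one dimension at a time and the components are ranked by that score. The function names (`mi_feature_class`, `rank_features`), the histogram-based MI estimator, and the bin count are illustrative assumptions, not the paper's exact estimator; the local linear partitioning is likewise omitted here for brevity.

```python
import numpy as np

def mi_feature_class(feature, labels, bins=10):
    """Histogram estimate of I(Y; C) for one scalar feature Y and class label C.

    This is a simple plug-in estimator used for illustration; the paper's
    estimator may differ.
    """
    n = len(feature)
    edges = np.histogram_bin_edges(feature, bins=bins)
    # Map each sample to a bin index in 0..bins-1 using the interior edges.
    y = np.clip(np.digitize(feature, edges[1:-1]), 0, bins - 1)
    p_y = np.bincount(y, minlength=bins) / n  # marginal P(y)
    mi = 0.0
    for c in np.unique(labels):
        mask = labels == c
        p_c = mask.mean()                               # marginal P(c)
        p_yc = np.bincount(y[mask], minlength=bins) / n  # joint P(y, c)
        nz = p_yc > 0
        mi += np.sum(p_yc[nz] * np.log(p_yc[nz] / (p_y[nz] * p_c)))
    return mi

def rank_features(Z, labels, bins=10):
    """Rank columns of Z (assumed ICA-rotated components) by estimated MI
    with the class labels, highest first."""
    scores = np.array([mi_feature_class(Z[:, j], labels, bins)
                       for j in range(Z.shape[1])])
    order = np.argsort(scores)[::-1]
    return order, scores
```

As a quick sanity check on synthetic data, a component whose mean is shifted by the class label should rank above a pure-noise component, since its one-dimensional MI with the label is substantially larger.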
Cite this article
Lan, T., Erdogmus, D. Maximally Informative Feature and Sensor Selection in Pattern Recognition Using Local and Global Independent Component Analysis. J VLSI Sign Process Syst Sign Im 48, 39–52 (2007). https://doi.org/10.1007/s11265-006-0026-5