Abstract
Choosing the most relevant features from a large feature set for a specific machine learning task remains a challenging problem. Feature selection offers an effective solution: it aims to choose the most relevant and least redundant features for data analysis. In this paper, we present a feature selection algorithm termed semi-supervised minimum redundancy maximum relevance. Relevance is measured by a semi-supervised filter score named the constraint compensated Laplacian score, which takes advantage of the local geometrical structure of unlabeled data and the constraint information from labeled data. Redundancy is measured by a semi-supervised Gaussian mixture model-based Bhattacharyya distance. The optimal feature subset is selected by maximizing feature relevance and minimizing feature redundancy simultaneously. We apply our algorithm to an audio classification task and compare it with other known feature selection methods. Experimental results show that our algorithm leads to promising improvements.
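The selection step described in the abstract (maximize relevance, minimize redundancy) can be illustrated with a generic greedy sketch. This is not the authors' implementation. It assumes that a per-feature relevance vector and a pairwise redundancy matrix have already been computed by some means, and it combines them with a simple difference criterion (relevance minus mean redundancy with the features already chosen), which is an assumption made here for illustration. The function name `greedy_mrmr` and the toy numbers are hypothetical.

```python
import numpy as np


def greedy_mrmr(relevance, redundancy, k):
    """Greedily pick k feature indices, trading relevance against redundancy.

    relevance  : length-n sequence, larger means more relevant (assumed precomputed).
    redundancy : n x n symmetric matrix, larger means more redundant (assumed precomputed).
    k          : number of features to select.
    """
    relevance = np.asarray(relevance, dtype=float)
    redundancy = np.asarray(redundancy, dtype=float)
    n = relevance.shape[0]
    if redundancy.shape != (n, n):
        raise ValueError("redundancy must be an n x n matrix")
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of features")

    # Start from the single most relevant feature.
    selected = [int(np.argmax(relevance))]
    remaining = set(range(n)) - set(selected)

    while len(selected) < k:
        candidates = sorted(remaining)
        # Mean redundancy of each candidate with the features chosen so far.
        mean_red = redundancy[np.ix_(candidates, selected)].mean(axis=1)
        # Difference criterion (an illustrative assumption, not the paper's formula).
        scores = relevance[candidates] - mean_red
        best = candidates[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: features 0 and 1 are both relevant but highly redundant with each other.
toy_relevance = [0.9, 0.85, 0.3]
toy_redundancy = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
print(greedy_mrmr(toy_relevance, toy_redundancy, k=2))
```

In the toy data, the second pick skips feature 1 despite its high relevance, because it overlaps heavily with feature 0. Any relevance score or redundancy measure could be plugged in, provided larger values mean "more relevant" and "more redundant" respectively.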
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (No. 61673395, No. 61403415, No. 61302107, and No. 61403224).
Cite this article
Yang, X.K., He, L., Qu, D. et al. Semi-supervised minimum redundancy maximum relevance feature selection for audio classification. Multimed Tools Appl 77, 713–739 (2018). https://doi.org/10.1007/s11042-016-4287-0
DOI: https://doi.org/10.1007/s11042-016-4287-0