Abstract
Extracting relevant information from, and reducing the dimensionality of, the original input features is an active research area in machine learning and data analysis. Logistic regression (LR) is a well-known classification method that has been used widely in many applications of data mining, machine learning, and bioinformatics. However, its performance suffers from multicollinearity among its predictors and from feature redundancy. ℓ1-regularization and feature extraction methods are commonly used to improve the performance of logistic regression under multicollinearity and overfitting, and to reduce computational complexity by discarding less relevant or redundant features. These methods include principal component analysis, kernel principal component analysis, and independent component analysis. Recently, ℓ1-regularized logistic regression has received much attention as a promising method for feature selection in classification tasks, so a systematic comparison with these existing methods is needed. In this paper, we assess the performance of the aforementioned feature extraction methods on LR, and of ℓ1-regularized logistic regression, using different statistical measures. A variety of performance metrics is used: accuracy, sensitivity, specificity, precision, the area under the receiver operating characteristic curve, and receiver operating characteristic analysis. This study is distinguished by its inclusion of a comprehensive statistical analysis.
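The comparison the abstract describes can be sketched as follows; this is a minimal illustration, not the paper's actual experimental setup. It assumes scikit-learn, a synthetic dataset, and arbitrary choices of component count, kernel, and regularization strength: dimensionality is reduced with PCA, kernel PCA, or ICA before fitting logistic regression, and the results are compared against ℓ1-regularized logistic regression on the raw features, scored by AUC.

```python
# Hedged sketch of the comparison pipeline (illustrative parameters only).
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, KernelPCA, FastICA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with deliberately redundant (collinear) features.
X, y = make_classification(n_samples=600, n_features=40, n_informative=8,
                           n_redundant=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Three feature-extraction front ends, each keeping 10 components.
reducers = {
    "PCA":  PCA(n_components=10),
    "KPCA": KernelPCA(n_components=10, kernel="rbf"),
    "ICA":  FastICA(n_components=10, random_state=0, max_iter=1000),
}

auc = {}
for name, reducer in reducers.items():
    pipe = make_pipeline(StandardScaler(), reducer,
                         LogisticRegression(max_iter=1000))
    pipe.fit(X_tr, y_tr)
    auc[name] = roc_auc_score(y_te, pipe.predict_proba(X_te)[:, 1])

# L1-regularized logistic regression performs feature selection implicitly
# by driving some coefficients to exactly zero.
l1 = make_pipeline(StandardScaler(),
                   LogisticRegression(penalty="l1", solver="liblinear", C=0.1))
l1.fit(X_tr, y_tr)
auc["L1-LR"] = roc_auc_score(y_te, l1.predict_proba(X_te)[:, 1])

for name, score in sorted(auc.items(), key=lambda kv: -kv[1]):
    print(f"{name:6s} AUC = {score:.3f}")
```

The same loop extends naturally to the other metrics the paper reports (accuracy, sensitivity, specificity, precision) by swapping the scoring function.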
Acknowledgments
This work was supported by a grant from Hebei University, Baoding, Hebei, P. R. China. I wish to thank the PhD students of the Departments of Computer Science and Mathematics for their encouragement, useful discussions, and interest. This work was completed at Hebei University during my PhD studies.
Cite this article
Musa, A.B. A comparison of ℓ1-regularizion, PCA, KPCA and ICA for dimensionality reduction in logistic regression. Int. J. Mach. Learn. & Cyber. 5, 861–873 (2014). https://doi.org/10.1007/s13042-013-0171-7