Abstract
Feature selection for mixed data is an active research area with many applications in practical problems where both numerical and non-numerical features describe the objects of study. This paper provides the first comprehensive and structured review of the supervised and unsupervised feature selection methods for mixed data reported in the literature. Additionally, we analyze the main characteristics, advantages, and disadvantages of the reviewed methods and discuss important open challenges and promising directions for future research in this field.
Notes
Also called heterogeneous or assorted data.
The label assigned to each object in the dataset can be a category, an ordered value, or a real value, depending on the specific task.
For the case of UFS methods, class labels are not used in this step.
A parameter given by the user in the range (0, 1) that specifies the average fraction of features per cluster.
Acknowledgements
The first author gratefully acknowledges the Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE) for the collaboration grant awarded for the completion of this survey.
Solorio-Fernández, S., Carrasco-Ochoa, J. & Martínez-Trinidad, J.F. A survey on feature selection methods for mixed data. Artif Intell Rev 55, 2821–2846 (2022). https://doi.org/10.1007/s10462-021-10072-6