Abstract
Vertical federated learning (VFL) is an emerging paradigm that allows cross-silo organizations to build more accurate machine learning (ML) models. In this setting, multiple organizations (i.e., parties) hold different features for the same set of samples. However, different parties may hold redundant or highly correlated features, which makes VFL model training inefficient and ineffective. Effective feature selection in VFL is therefore essential to mitigate this problem and to improve model effectiveness as well as computation and communication efficiency. To this end, we propose a federated feature selection framework, called FEAST, which leverages conditional mutual information (CMI) to select features that are highly informative yet have low redundancy. Furthermore, we design a communication-efficient method that reduces the information exchanged among the parties while protecting the parties' raw data. Extensive experiments on four real-world datasets demonstrate that the proposed framework achieves state-of-the-art performance in terms of accuracy as well as communication and computation costs.
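To make the role of CMI concrete, the following is a minimal, centralized sketch of greedy CMI-based feature selection on discrete features. It is not the FEAST protocol: it ignores the federated setting, the communication-efficient exchange, and any privacy protection, and it assumes a max-min (CMIM-style) scoring rule chosen here purely for illustration. The function names `entropy`, `cmi`, and `greedy_cmi_select` are hypothetical.

```python
from collections import Counter
from math import log


def entropy(*cols):
    """Empirical joint Shannon entropy (in nats) of one or more discrete columns."""
    n = len(cols[0])
    counts = Counter(zip(*cols))
    return -sum(c / n * log(c / n) for c in counts.values())


def cmi(x, y, z=None):
    """Empirical I(X; Y | Z); plain mutual information I(X; Y) when z is None."""
    if z is None:
        return entropy(x) + entropy(y) - entropy(x, y)
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def greedy_cmi_select(features, label, k):
    """Greedily pick up to k feature names.

    features maps a name to a list of discrete values (assumed layout).
    A candidate is scored by its smallest CMI with the label given any
    already-selected feature, so a near-duplicate of a selected feature
    scores close to zero (illustrative criterion, not the paper's).
    """
    selected = []
    remaining = list(features)

    def score(name):
        if not selected:
            return cmi(features[name], label)
        return min(cmi(features[name], label, features[s]) for s in selected)

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    a = [0, 0, 1, 1, 0, 0, 1, 1]
    b = [0, 1, 0, 1, 0, 1, 0, 1]
    toy = {"a": a, "a_copy": list(a), "b": b}
    y = [2 * ai + bi for ai, bi in zip(a, b)]
    print(greedy_cmi_select(toy, y, k=2))
```

In the toy data, `a_copy` duplicates `a`, so once either of them is selected the other carries no additional information about the label given the selected one, and the redundancy-aware score steers the second pick toward `b`.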
Index Terms
- FEAST: A Communication-efficient Federated Feature Selection Framework for Relational Data