Abstract
Feature selection, which aims to identify the most informative subset of features in the original data, plays a prominent role in high-dimensional data processing. To a certain extent, feature selection can also mitigate the poor interpretability of deep neural networks (DNNs). Despite recent advances in DNN-based feature selection, most methods overlook error control for the selected features and lack reproducibility. In this paper, we propose a new method, DeepTD, for error-controlled feature selection in DNNs: artificial decoy features are constructed and made to compete with the original features according to feature importance scores computed from the trained network, enabling p-value-free estimation of the local false discovery rate (FDR) of selected features. The merits of DeepTD include: a new DNN-derived measure of feature importance that combines the weights and gradients of the network; the first algorithm to estimate the local FDR from DNN-derived scores; confidence assessment of individual selected features; and better robustness to small numbers of important features and low FDR thresholds than existing competition-based FDR control methods, e.g., the knockoff filter. On multiple synthetic datasets, DeepTD accurately estimated the local FDR and empirically controlled the FDR with, on average, 10% higher power than the knockoff filter. At lower FDR thresholds, the power of our method reached two to three times that of other state-of-the-art methods. Applied to real datasets, DeepTD selected 31%–49% more features than alternatives, demonstrating its validity and utility.
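The competition principle described above can be illustrated with a minimal sketch. This is not the authors' DeepTD implementation: the importance scores, the function name `competition_fdr`, and the toy data are all assumptions for illustration. Given importance scores for original (target) features and for artificial decoy features drawn from an empirical null, the FDR among targets scoring at or above a threshold can be estimated by counting how many decoys exceed the same threshold:

```python
import numpy as np

def competition_fdr(target_scores, decoy_scores, threshold):
    """Estimate the FDR among target features with score >= threshold,
    using decoy features as an empirical null (illustrative sketch only)."""
    n_decoy = np.sum(np.asarray(decoy_scores) >= threshold)
    n_target = np.sum(np.asarray(target_scores) >= threshold)
    # Decoy count serves as an estimate of false positives among targets.
    return min(1.0, n_decoy / max(n_target, 1))

# Toy example: a few truly important targets score high,
# while decoys score like null features.
rng = np.random.default_rng(0)
targets = np.concatenate([rng.normal(3, 1, 20), rng.normal(0, 1, 80)])
decoys = rng.normal(0, 1, 100)
print(competition_fdr(targets, decoys, threshold=2.0))
```

DeepTD goes further than this global estimate by computing a local FDR, i.e., a confidence value for each individual selected feature rather than one rate for the whole selected set.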
Data Availability
The HIV data are openly available in HIVDB at http://hivdb.stanford.edu/pages/published_analysis/genophenoPNAS2006/ (Genotypic Predictors of Human Immunodeficiency Virus Type 1 Drug Resistance). The single-cell RNA-Seq data are available in the GEO repository at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE87544.
Code availability
The code of DeepTD is available at http://fugroup.amss.ac.cn/software/TDFDR/DeepTD.html.
Materials Availability
Not applicable.
Funding
This work was supported by the National Key R&D Program of China (Grants 2022YFA1004801 and 2022YFA1304603) and the National Natural Science Foundation of China (Grant 32070668).
Author information
Authors and Affiliations
Contributions
Yan Fu conceived and supervised the study. All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Zixuan Cao and Xiaoya Sun. The first draft of the manuscript was written by Zixuan Cao and all authors revised previous versions of the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Cao, Z., Sun, X. & Fu, Y. Deep neural network-based feature selection with local false discovery rate estimation. Appl Intell 55, 32 (2025). https://doi.org/10.1007/s10489-024-05944-7