
Deep neural network-based feature selection with local false discovery rate estimation

Published in: Applied Intelligence

Abstract

Feature selection, which aims to identify the most informative subset of features in the original data, plays a prominent role in high-dimensional data processing. To a certain extent, feature selection can also mitigate the poor interpretability of deep neural networks (DNNs). Despite recent advances in DNN-based feature selection, most methods overlook error control of the selected features and lack reproducibility. In this paper, we propose a new method, DeepTD, for error-controlled feature selection in DNNs: artificial decoy features are constructed and compete with the original features according to feature importance scores computed from the trained network, enabling p-value-free estimation of the local false discovery rate (FDR) of selected features. The merits of DeepTD include: a new DNN-derived measure of feature importance combining the weights and gradients of the network; the first algorithm that estimates the local FDR based on DNN-derived scores; confidence assessment of individual selected features; and better robustness to small numbers of important features and low FDR thresholds than competition-based FDR control methods such as the knockoff filter. On multiple synthetic datasets, DeepTD accurately estimated the local FDR and empirically controlled the FDR with, on average, 10% higher power than the knockoff filter. At lower FDR thresholds, the power of our method reached two to three times that of other state-of-the-art methods. DeepTD was also applied to real datasets, where it selected 31%-49% more features than alternatives, demonstrating its validity and utility.
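The competition-based FDR estimation described in the abstract can be sketched generically: decoy features are constructed so that they carry no real signal, both targets and decoys receive importance scores, and the number of decoys exceeding a score threshold serves as an estimate of the number of false target discoveries. The following is a minimal illustration of this idea only, not DeepTD's actual algorithm; the permutation-based decoy construction, the toy scores, and the threshold are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_decoys(X):
    """Build decoy features by independently permuting each column of X.
    This destroys any feature-response association while preserving each
    feature's marginal distribution (a common decoy construction; DeepTD's
    own construction may differ)."""
    return np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])

def estimate_fdr(target_scores, decoy_scores, threshold):
    """Competition-based FDR estimate at a given importance-score threshold:
    decoys exceeding the threshold stand in for false target discoveries."""
    n_decoy = np.sum(decoy_scores >= threshold)
    n_target = np.sum(target_scores >= threshold)
    return min(1.0, n_decoy / max(n_target, 1))

# Toy importance scores: 20 "important" targets with shifted scores,
# 80 null targets, and 100 decoys drawn from the null distribution.
target_scores = np.concatenate([rng.normal(3, 1, 20), rng.normal(0, 1, 80)])
decoy_scores = rng.normal(0, 1, 100)

fdr = estimate_fdr(target_scores, decoy_scores, threshold=2.0)
```

In this toy setup most targets passing the threshold are genuinely shifted, so the estimated FDR is small; lowering the threshold admits more null targets and decoys alike, raising the estimate.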


[Figures 1-11 appear in the full-text article.]


Data Availability

The HIV data (Genotypic Predictors of Human Immunodeficiency Virus Type 1 Drug Resistance) is openly available in HIVDB at http://hivdb.stanford.edu/pages/published_analysis/genophenoPNAS2006/. The single-cell RNA-Seq data is available in the GEO repository at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE87544.

Code Availability

The code of DeepTD is available at http://fugroup.amss.ac.cn/software/TDFDR/DeepTD.html.

Materials Availability

Not applicable.


Funding

This work was supported by the National Key R&D Program of China (Grants 2022YFA1004801 and 2022YFA1304603) and the National Natural Science Foundation of China (Grant 32070668).

Author information

Authors and Affiliations

Authors

Contributions

Yan Fu conceived and supervised the study. All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Zixuan Cao and Xiaoya Sun. The first draft of the manuscript was written by Zixuan Cao and all authors revised previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yan Fu.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Cao, Z., Sun, X. & Fu, Y. Deep neural network-based feature selection with local false discovery rate estimation. Appl Intell 55, 32 (2025). https://doi.org/10.1007/s10489-024-05944-7

