Screening rules and information criteria-based analysis of gene expression data
Pages 133 - 139
Abstract
High-dimensional data are becoming increasingly common, and with the rapid development of technology the biomedical field is no exception. Various methods exist for handling high-dimensional gene expression data, but each has shortcomings. In this paper, we address the theory and application of feature screening for ultra-high-dimensional discriminative classification data, with the aim of reducing the dimensionality to a size appropriate for the sample size while retaining all important variables. To this end, we propose a variable screening method that combines sure independence screening with the extended Bayesian information criterion (EBIC), which can effectively reduce data dimensionality while improving computational efficiency and helping to discover the variables most informative about the target. A random simulation sampling method was first used to select parameters and screen variables on randomly sampled data; the correct selection rate and correct fit rate in the simulations were higher than those of other approaches, which supports the reliability of the method. Finally, four sets of real gene expression data were used to further validate the effectiveness of the method in selecting gene expression features.
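The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of pairing a marginal screening step with an EBIC-based choice of model size. It is not the authors' implementation. Several things are assumptions made for illustration: the function names (`sis_rank`, `ebic_gaussian`, `sis_ebic_select`) are hypothetical, the response is treated as continuous with a Gaussian linear model rather than as a class label, features are ranked by absolute Pearson correlation, the screened set keeps `n / log(n)` features (a commonly used choice), and `gamma = 0.5` is an arbitrary EBIC setting.

```python
import math

import numpy as np


def sis_rank(X, y):
    """Rank features by absolute marginal correlation with y, largest first."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    # Constant columns get a correlation of 0 instead of a division by zero.
    corr = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, np.inf)
    return np.argsort(-corr)


def ebic_gaussian(X_sub, y, p, gamma=0.5):
    """EBIC of a least-squares fit (assumed Gaussian errors):
    n*log(RSS/n) + k*log(n) + 2*gamma*log C(p, k)."""
    n, k = X_sub.shape
    design = np.column_stack([np.ones(n), X_sub])  # intercept + k features
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    # log of the binomial coefficient C(p, k) via log-gamma
    log_binom = math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)
    return n * math.log(max(rss, 1e-12) / n) + k * math.log(n) + 2 * gamma * log_binom


def sis_ebic_select(X, y, gamma=0.5):
    """Screen down to d = n/log(n) features, then pick the nested model
    (top-1, top-2, ..., top-d in screening order) with the smallest EBIC."""
    n, p = X.shape
    d = min(p, max(1, int(n / math.log(n))))  # assumed screening size
    order = sis_rank(X, y)[:d]
    scores = [ebic_gaussian(X[:, order[:k]], y, p, gamma) for k in range(1, d + 1)]
    best_k = int(np.argmin(scores)) + 1
    return order[:best_k]
```

Calling `sis_ebic_select(X, y)` on an `n × p` NumPy array `X` and a length-`n` response `y` returns the column indices of the retained features. For the classification setting the abstract describes, the correlation ranking and the Gaussian likelihood would have to be replaced by a screening statistic and likelihood suited to class labels.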
Published In

July 2023
199 pages
ISBN:9798400707605
DOI:10.1145/3611450
Copyright © 2023 ACM.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Published: 20 August 2023
Qualifiers
- Research-article
- Research
- Refereed limited
Conference
AI2A '23: 2023 3rd International Conference on Artificial Intelligence, Automation and Algorithms
July 21 - 23, 2023
Beijing, China