Abstract
Hierarchical Feature Selection (HFS) is an under-explored subarea of machine learning/data mining. Unlike conventional (flat) feature selection algorithms, HFS algorithms exploit hierarchical (generalization-specialization) relationships between features in order to improve the predictive accuracy of classifiers. The basic idea is to remove hierarchical redundancy between features, which arises because the presence of a feature in an instance implies the presence of all ancestors of that feature in that instance. An HFS algorithm selects a feature subset in which this hierarchical redundancy is eliminated or reduced; giving only the selected subset to a classification algorithm can then improve its predictive accuracy. In terms of applications, this thesis focuses on datasets of aging-related genes. Such datasets are an interesting application for machine learning/data mining methods for two reasons: aging experiments with humans are technically difficult and raise ethical issues, and research on the biology of aging is strategically important, since old age is the greatest risk factor for a number of diseases yet remains a poorly understood biological process.
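To make the notion of hierarchical redundancy concrete, the following is a minimal illustrative sketch, not one of the algorithms proposed in the thesis. It assumes a hypothetical toy hierarchy (features `A` to `D`, stored as a child-to-parents map) and binary feature values. For a single instance, it keeps only the most specific features that are present, since every ancestor of a present feature is implied by it. The names `PARENTS`, `ancestors` and `most_specific_present` are introduced here purely for illustration.

```python
from typing import Dict, Set

# Hypothetical toy hierarchy (an assumption for illustration only):
# each feature maps to its direct parents, forming a DAG.
PARENTS: Dict[str, Set[str]] = {
    "A": set(),        # root (most general feature)
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},   # most specific feature, with two parents
}


def ancestors(feature: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return all ancestors (direct and indirect) of a feature."""
    seen: Set[str] = set()
    stack = list(parents[feature])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def most_specific_present(instance: Dict[str, int],
                          parents: Dict[str, Set[str]]) -> Set[str]:
    """Keep the present features that are not implied by another present feature.

    A present feature is treated as hierarchically redundant if it is an
    ancestor of some other present feature in the same instance.
    """
    present = {f for f, value in instance.items() if value == 1}
    implied: Set[str] = set()
    for f in present:
        implied |= ancestors(f, parents)
    return present - implied


# An instance annotated with B: its ancestor A is present too, but redundant.
instance = {"A": 1, "B": 1, "C": 0, "D": 0}
selected = most_specific_present(instance, PARENTS)
print(selected)
```

In this sketch the selected subset depends on the instance at hand; that is a simplification chosen to keep the example short and should not be read as a description of how any particular HFS algorithm operates.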
Index Terms
- Novel hierarchical feature selection algorithms for predicting genes' aging-related function