Abstract
For a microarray dataset with attached phenotype information (expression levels of various genes plus a phenotype classification for each of a set of samples), an important problem is to find informative genes. These genes have high information content as attributes for classification: they minimize the expected number of tests needed to identify a phenotype. This study investigates a heuristic method for finding complete sets of informative genes (sets that are sufficient for constructing a maximally discriminating classifier) that are as small as possible. Such minimal sets of informative genes can be very useful in developing an appreciation for the data. Our method uses branch-and-bound depth-first search. Experimental results suggest that the method is effective in finding minimal gene sets and that the resulting classifiers achieve good classification accuracy.
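To make the idea of a branch-and-bound depth-first search for a small complete gene set concrete, here is a minimal Python sketch. It is not the authors' algorithm, whose heuristics the abstract does not describe. It rests on several assumptions made only for illustration: expression values are already discretized, a gene set counts as "complete" when no two samples with different phenotypes agree on every selected gene, and the function names `is_complete` and `minimal_gene_set` are hypothetical.

```python
from typing import Hashable, List, Optional, Sequence


def is_complete(data: Sequence[Sequence[Hashable]],
                labels: Sequence[Hashable],
                genes: Sequence[int]) -> bool:
    """Return True if the selected genes separate all phenotypes.

    Assumption for this sketch: a gene set is complete when no two samples
    with different labels share the same values on every selected gene.
    """
    seen = {}
    for row, label in zip(data, labels):
        key = tuple(row[g] for g in genes)
        # setdefault stores the first label seen for this value pattern.
        if seen.setdefault(key, label) != label:
            return False
    return True


def minimal_gene_set(data: Sequence[Sequence[Hashable]],
                     labels: Sequence[Hashable]) -> Optional[List[int]]:
    """Depth-first search with a size bound for a smallest complete gene set.

    Returns gene indices, or None if even the full gene set is not complete.
    Worst-case running time is exponential in the number of genes.
    """
    n_genes = len(data[0]) if data else 0
    best: Optional[List[int]] = None

    def dfs(start: int, chosen: List[int]) -> None:
        nonlocal best
        # Bound: this branch cannot beat the best complete set found so far.
        if best is not None and len(chosen) >= len(best):
            return
        if is_complete(data, labels, chosen):
            best = list(chosen)
            return
        # Branch: extend the current set with each later gene in turn.
        for g in range(start, n_genes):
            chosen.append(g)
            dfs(g + 1, chosen)
            chosen.pop()

    dfs(0, [])
    return best


# Toy data: rows are samples, columns are discretized gene levels.
toy_data = [[0, 1, 0], [0, 0, 1], [1, 1, 1], [1, 0, 0]]
toy_labels = ["A", "A", "B", "B"]
toy_result = minimal_gene_set(toy_data, toy_labels)
```

In this toy example gene 0 alone separates the two phenotypes, so the search returns a one-gene set and prunes every other branch of size one or more. A practical version would need a stronger bound and a gene ordering heuristic to cope with thousands of genes.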
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chang, KH., Kwon, Y.K., Parker, D.S. (2007). Finding Minimal Sets of Informative Genes in Microarray Data. In: Măndoiu, I., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2007. Lecture Notes in Computer Science, vol 4463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72031-7_21
DOI: https://doi.org/10.1007/978-3-540-72031-7_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72030-0
Online ISBN: 978-3-540-72031-7
eBook Packages: Computer Science (R0)