Abstract
An open problem in gene expression data analysis is how to evaluate the performance of gene selection methods used to discover biologically relevant sets of genes. The problem is difficult because the full set of genes involved in a specific biological process is usually unknown or only partially known, which makes a fair comparison between different gene selection methods infeasible. A natural solution is to develop an artificial model that generates gene expression data, so that the set of biologically relevant genes is known in advance. The models proposed in the literature, although useful for a preliminary evaluation of gene selection methods, do not explicitly consider the biological characteristics of gene expression data. The main aim of this work is to identify the main biological characteristics that must be considered when designing a model for validating gene selection methods based on the analysis of DNA microarray data.
Keywords
- Gene Expression Data
- Expression Signature
- Independent Component Analysis
- Gene Expression Signature
- Gene Expression Data Analysis
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ruffino, F., Muselli, M., Valentini, G. (2006). Biological Specifications for a Synthetic Gene Expression Data Generation Model. In: Bloch, I., Petrosino, A., Tettamanzi, A.G.B. (eds) Fuzzy Logic and Applications. WILF 2005. Lecture Notes in Computer Science, vol 3849. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11676935_34
DOI: https://doi.org/10.1007/11676935_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32529-1
Online ISBN: 978-3-540-32530-7
eBook Packages: Computer Science, Computer Science (R0)