Abstract
When applied to genomic data, popular unsupervised exploratory data analysis tools based on clustering algorithms often fail, because such data sets have small cardinality (few samples) and high dimensionality (many genes). In this paper we propose a wrapper method for gene selection based on simulated annealing and unsupervised clustering. The proposed approach, although computationally intensive, selects the most relevant features (genes) and ranks them by relevance, which makes it possible to improve the results of clustering algorithms.
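The abstract gives no algorithmic detail, so the following is only a minimal sketch of what a simulated-annealing wrapper around a clustering algorithm can look like, not the authors' method. Everything specific in it is an assumption made for illustration: plain k-means as the clustering step, the ratio of within-cluster scatter to total scatter as the cost, a fixed subset size with swap moves, a geometric cooling schedule, and the hypothetical names `make_toy_data`, `kmeans_cost` and `anneal_select`.

```python
import math
import random


def make_toy_data(n=30, n_noise=6, seed=1):
    """Toy data (assumption): features 0 and 1 separate two groups, the rest are noise."""
    rng = random.Random(seed)
    return [[rng.gauss(6 * (i % 2), 1), rng.gauss(6 * (i % 2), 1)]
            + [rng.gauss(0, 1) for _ in range(n_noise)] for i in range(n)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_cost(rows, k, iters=10):
    """Run a small k-means; return within-cluster scatter / total scatter (lower is better)."""
    centers = random.Random(0).sample(rows, k)  # fixed seed: cost is deterministic per subset
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            groups[min(range(k), key=lambda c: sq_dist(r, centers[c]))].append(r)
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
    within = sum(min(sq_dist(r, c) for c in centers) for r in rows)
    mean = [sum(col) / len(rows) for col in zip(*rows)]
    total = sum(sq_dist(r, mean) for r in rows)
    return within / total if total > 0 else 1.0


def anneal_select(data, k=2, n_keep=2, steps=300, t0=0.5, alpha=0.98, seed=0):
    """Search feature subsets of size n_keep with simulated annealing."""
    rng = random.Random(seed)
    d = len(data[0])

    def cost(subset):
        cols = sorted(subset)
        return kmeans_cost([[row[j] for j in cols] for row in data], k)

    current = set(rng.sample(range(d), n_keep))
    cur_cost = cost(current)
    best, best_cost = set(current), cur_cost
    t = t0
    for _ in range(steps):
        # Swap move: drop one selected feature, add one unselected feature.
        out = rng.choice(sorted(current))
        inn = rng.choice([j for j in range(d) if j not in current])
        cand = (current - {out}) | {inn}
        cand_cost = cost(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse subsets.
        if cand_cost <= cur_cost or rng.random() < math.exp((cur_cost - cand_cost) / t):
            current, cur_cost = cand, cand_cost
            if cur_cost < best_cost:
                best, best_cost = set(current), cur_cost
        t *= alpha  # geometric cooling (assumption)
    return sorted(best), best_cost


if __name__ == "__main__":
    selected, score = anneal_select(make_toy_data())
    print("selected features:", selected, "cost:", round(score, 3))
```

The subset size is held fixed here so that the scatter ratio cannot be lowered simply by discarding features. Ranking genes by relevance, as the abstract mentions, is not covered by this sketch.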
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Filippone, M., Masulli, F., Rovetta, S. (2006). Unsupervised Gene Selection and Clustering Using Simulated Annealing. In: Bloch, I., Petrosino, A., Tettamanzi, A.G.B. (eds) Fuzzy Logic and Applications. WILF 2005. Lecture Notes in Computer Science, vol 3849. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11676935_28
DOI: https://doi.org/10.1007/11676935_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32529-1
Online ISBN: 978-3-540-32530-7
eBook Packages: Computer Science, Computer Science (R0)