Abstract
We introduce a new clustering method for DNA microarray data that is based on space-filling curves and wavelet denoising. The proposed method is much faster than the established fuzzy c-means clustering for two reasons: clustering takes place in one dimension, and the method clusters the cells that contain the data rather than the data points themselves. Moreover, preliminary evaluation on data sets from Small Round Blue-Cell tumor, leukemia and lung cancer microarray experiments shows that it can be as accurate as, or more accurate than, fuzzy c-means clustering or a Gaussian mixture model.
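The general idea of clustering grid cells along a space-filling curve can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which curve, wavelet or thresholding rule the paper uses, so the Z-order (Morton) curve, the single-level unnormalized Haar transform with soft thresholding, the two-dimensional unit-square input, and all function and parameter names below are assumptions chosen for brevity.

```python
def morton_index(ix, iy, bits):
    """Interleave the bits of two grid coordinates (Z-order curve, an assumed choice)."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (2 * b)
        code |= ((iy >> b) & 1) << (2 * b + 1)
    return code


def cell_counts(points, bits):
    """Count points per grid cell and lay the counts out along the curve.

    Points are assumed to lie in the unit square [0, 1] x [0, 1].
    """
    side = 1 << bits
    counts = [0] * (side * side)
    for x, y in points:
        ix = min(int(x * side), side - 1)
        iy = min(int(y * side), side - 1)
        counts[morton_index(ix, iy, bits)] += 1
    return counts


def soft_threshold(value, threshold):
    """Shrink a coefficient towards zero by `threshold`."""
    magnitude = max(abs(value) - threshold, 0.0)
    return magnitude if value >= 0 else -magnitude


def haar_denoise(signal, threshold):
    """One-level Haar transform (average/difference form) with soft-thresholded details.

    The signal length must be even, which holds for a 2^bits x 2^bits grid.
    """
    out = []
    for i in range(0, len(signal), 2):
        approx = (signal[i] + signal[i + 1]) / 2
        detail = soft_threshold((signal[i] - signal[i + 1]) / 2, threshold)
        out.extend([approx + detail, approx - detail])
    return out


def dense_segments(density, min_density):
    """Return (start, end) index runs along the curve where density exceeds min_density."""
    runs, start = [], None
    for i, value in enumerate(density):
        if value > min_density and start is None:
            start = i
        elif value <= min_density and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(density) - 1))
    return runs


def cluster_cells(points, bits=2, threshold=1.0, min_density=0.5):
    """Cluster occupied cells in one dimension: each dense run of cells is one cluster."""
    density = haar_denoise(cell_counts(points, bits), threshold)
    return dense_segments(density, min_density)
```

Because the work is done on a histogram with one entry per cell, the cost of the one-dimensional steps depends on the number of cells rather than on the number of data points, which is the source of the speed advantage the abstract describes.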
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vogiatzis, D., Tsapatsoulis, N. (2007). Clustering Microarray Data with Space Filling Curves. In: Masulli, F., Mitra, S., Pasi, G. (eds) Applications of Fuzzy Sets Theory. WILF 2007. Lecture Notes in Computer Science, vol 4578. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73400-0_67
DOI: https://doi.org/10.1007/978-3-540-73400-0_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73399-7
Online ISBN: 978-3-540-73400-0
eBook Packages: Computer Science, Computer Science (R0)