Clustering Microarray Data with Space Filling Curves

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4578)

Abstract

We introduce a new clustering method for DNA microarray data that is based on space filling curves and wavelet denoising. The proposed method is much faster than the established fuzzy c-means clustering for two reasons: clustering takes place in one dimension, and it clusters the cells that contain data rather than the data points themselves. Moreover, preliminary evaluation results on data sets from small round blue-cell tumor, leukemia and lung cancer microarray experiments show that it can be as accurate as, or more accurate than, fuzzy c-means clustering or a Gaussian mixture model.
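
The abstract names three ingredients: a grid of cells, a space filling curve that orders those cells in one dimension, and wavelet denoising of the resulting signal. The sketch below is one possible reading of that pipeline, not the authors' implementation. It assumes a Z-order (Morton) curve, an unnormalized Haar transform with hard thresholding, and a fixed density cutoff; the function names and the parameters `bits`, `threshold` and `level` are all hypothetical, since the abstract specifies none of these choices. The dense density array also limits it to low-dimensional toy data.

```python
def morton_index(cell, bits):
    """Interleave the bits of integer grid coordinates (Z-order curve)."""
    index = 0
    dims = len(cell)
    for bit in range(bits):
        for d, coord in enumerate(cell):
            index |= ((coord >> bit) & 1) << (bit * dims + d)
    return index


def quantize(point, bits):
    """Map a point in [0, 1)^d to the integer coordinates of its grid cell."""
    side = 1 << bits
    return tuple(min(int(x * side), side - 1) for x in point)


def haar_denoise(signal, threshold):
    """Haar decomposition, zero small detail coefficients, reconstruct.

    The signal length must be a power of two.
    """
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        avg = [(approx[i] + approx[i + 1]) / 2 for i in pairs]
        dif = [(approx[i] - approx[i + 1]) / 2 for i in pairs]
        details.append([x if abs(x) > threshold else 0.0 for x in dif])
        approx = avg
    for dif in reversed(details):
        rebuilt = []
        for a, d in zip(approx, dif):
            rebuilt.extend([a + d, a - d])
        approx = rebuilt
    return approx


def cluster_cells(points, bits=3, threshold=0.2, level=0.25):
    """Label points by the contiguous run of dense cells they fall in.

    Assumed steps (hypothetical): count points per cell, lay the counts
    out along the Z-order curve, denoise, then treat each maximal run of
    cells whose smoothed density exceeds `level` as one cluster.
    Points in cells below the cutoff get the label -1.
    """
    dims = len(points[0])
    density = [0.0] * (1 << (bits * dims))
    keys = [morton_index(quantize(p, bits), bits) for p in points]
    for k in keys:
        density[k] += 1.0
    smooth = haar_denoise(density, threshold)

    cell_label = [-1] * len(smooth)
    label, inside = -1, False
    for i, value in enumerate(smooth):
        if value > level:
            if not inside:
                label, inside = label + 1, True
            cell_label[i] = label
        else:
            inside = False
    return [cell_label[k] for k in keys]


points = [(0.05, 0.05), (0.1, 0.2), (0.2, 0.1), (0.2, 0.2),
          (0.8, 0.8), (0.9, 0.8), (0.8, 0.9), (0.95, 0.95)]
labels = cluster_cells(points)
```

The sketch labels cells, and each point inherits the label of its cell, so the labeling pass is linear in the number of cells on the curve instead of iterating over point-to-centroid distances as fuzzy c-means does. A Z-order curve can place neighboring cells far apart on the curve, so a single spatial cluster may be split into several runs; a curve with better locality would reduce that effect.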

Editor information

Francesco Masulli, Sushmita Mitra, Gabriella Pasi

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vogiatzis, D., Tsapatsoulis, N. (2007). Clustering Microarray Data with Space Filling Curves. In: Masulli, F., Mitra, S., Pasi, G. (eds) Applications of Fuzzy Sets Theory. WILF 2007. Lecture Notes in Computer Science, vol 4578. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73400-0_67

  • DOI: https://doi.org/10.1007/978-3-540-73400-0_67

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73399-7

  • Online ISBN: 978-3-540-73400-0

  • eBook Packages: Computer Science, Computer Science (R0)
