
A Multi-objective Sequential Ensemble for Cluster Structure Analysis and Visualization and Application to Gene Expression

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5997)

Abstract

For very large, high-dimensional datasets, it is important to investigate and visualize how patterns are connected within large clusters of arbitrary shape. Density-based or distance-relatedness-based clustering algorithms efficiently discover clusters of arbitrary shapes and densities, whereas classical (though less efficient) clustering algorithms can be used to analyze and visualize the internal structure of those clusters. This work proposes a sequential ensemble in which an efficient distance-relatedness-based clustering algorithm, “Mitosis”, is followed by the centre-based K-means algorithm. K-means segments the clusters obtained by Mitosis into a number of subclusters. Applied to gene expression data sets, the ensemble reveals the gradual change of patterns.
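
The two-stage idea in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: Mitosis itself is not reproduced here, so stage 1 uses a simple stand-in (connected components of a distance-threshold graph) that merely plays the role of an arbitrary-shape clustering step. The function names, `eps`, `k`, and the toy data are all hypothetical and do not come from the paper.

```python
import math
import random


def threshold_components(points, eps):
    """Stage 1 stand-in (NOT Mitosis): label points by connected components
    of the graph that links any two points closer than eps."""
    labels = [-1] * len(points)
    current = 0
    for start in range(len(points)):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] == -1 and math.dist(points[i], points[j]) < eps:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style K-means; returns one subcluster index per point."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centre.
        assign = [min(range(k), key=lambda c: math.dist(p, centres[c]))
                  for p in points]
        # Move each non-empty centre to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centres[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign


def sequential_ensemble(points, eps, k):
    """Run stage 1, then split every stage-1 cluster into at most k
    subclusters with K-means. Returns (cluster, subcluster) per point."""
    stage1 = threshold_components(points, eps)
    result = [None] * len(points)
    for cluster in set(stage1):
        idx = [i for i, label in enumerate(stage1) if label == cluster]
        members = [points[i] for i in idx]
        sub = kmeans(members, min(k, len(members)))
        for i, s in zip(idx, sub):
            result[i] = (cluster, s)
    return result


# Toy data: two elongated chains of points, far apart from each other.
points = [(0.5 * x, 0.0) for x in range(10)] + [(0.5 * x, 10.0) for x in range(10)]
labels = sequential_ensemble(points, eps=0.6, k=3)
```

On the toy data, each chain is recovered as one stage-1 cluster and then cut into segments along its length. Ordering the subclusters of a chain by their centres is one plausible way to inspect how patterns change gradually from one end of a cluster to the other.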




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yousri, N.A. (2010). A Multi-objective Sequential Ensemble for Cluster Structure Analysis and Visualization and Application to Gene Expression. In: El Gayar, N., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2010. Lecture Notes in Computer Science, vol 5997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12127-2_28


  • DOI: https://doi.org/10.1007/978-3-642-12127-2_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12126-5

  • Online ISBN: 978-3-642-12127-2

  • eBook Packages: Computer Science, Computer Science (R0)
