Consensus Clustering in Gene Expression

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8623)

Abstract

In data analysis, clustering is the process of finding groups in unlabelled data according to the similarities among data items, so that items belonging to the same group are more similar to each other than to items in different groups. Consensus clustering is a methodology for combining different clustering solutions obtained from the same data set into a new clustering, in order to obtain a more accurate and stable solution. In this work we compare different consensus approaches in combination with different clustering algorithms and run several experiments on gene expression data sets. We show that consensus techniques lead to an improvement in clustering accuracy and give evidence of the stability of the solutions obtained with these methods.
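As a rough illustration of what "combining different clustering solutions" can mean, the Python sketch below builds a co-association matrix (the fraction of base clusterings in which two items share a cluster) and links items that are co-clustered in a majority of them. This is only one simple consensus scheme and is not claimed to be any of the approaches compared in the paper; the function names `co_association` and `consensus_partition`, the 0.5 threshold, and the toy labels are all assumptions made for the example.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of base partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_partition(partitions, threshold=0.5):
    """Link items co-clustered in more than `threshold` of the partitions and
    return the connected components as consensus cluster labels."""
    co = co_association(partitions)
    n = len(co)
    parent = list(range(n))  # union-find forest over the items

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy input (hypothetical): three base clusterings of six items that
# disagree on items 2 and 5 and use different label names.
base_partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 2],
]
consensus = consensus_partition(base_partitions)  # [0, 0, 0, 1, 1, 1]
```

Because the co-association matrix only records whether two items share a cluster, the scheme is insensitive to how each base clustering names its clusters, which is why the third partition can use different label values without affecting the result.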

Author information

Correspondence to Paola Galdi.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Galdi, P., Napolitano, F., Tagliaferri, R. (2015). Consensus Clustering in Gene Expression. In: Di Serio, C., Liò, P., Nonis, A., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2014. Lecture Notes in Computer Science, vol 8623. Springer, Cham. https://doi.org/10.1007/978-3-319-24462-4_5

  • DOI: https://doi.org/10.1007/978-3-319-24462-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24461-7

  • Online ISBN: 978-3-319-24462-4

  • eBook Packages: Computer Science, Computer Science (R0)
