Consensus Clustering in Gene Expression

Galdi, Paola; Napolitano, Francesco; Tagliaferri, Roberto

doi:10.1007/978-3-319-24462-4_5

Conference paper
First Online: 18 November 2015

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8623)

Abstract

In data analysis, clustering is the process of finding groups in unlabelled data according to the similarities among items, so that items belonging to the same group are more similar to each other than to items in different groups. Consensus clustering is a methodology for combining different clustering solutions obtained from the same data set into a new clustering, in order to obtain a more accurate and stable solution. In this work we compared different consensus approaches in combination with different clustering algorithms and ran several experiments on gene expression data sets. We show that consensus techniques lead to an improvement in clustering accuracy and give evidence of the stability of the solutions obtained with these methods.
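
As a rough illustration of what combining clustering solutions can mean in practice, the sketch below builds a co-association matrix (the fraction of input partitions in which two items share a cluster) and links items that co-occur in more than half of the partitions. This is a generic textbook-style construction, not necessarily one of the consensus approaches compared in the paper; the function names, the 0.5 threshold, and the toy partitions are assumptions made for the example.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    m = len(partitions)
    matrix = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += 1.0 / m
    return matrix


def consensus_partition(partitions, threshold=0.5):
    """Link items co-clustered in more than `threshold` of the partitions
    and return the connected components as consensus cluster labels.
    (Hypothetical helper: the threshold rule is an assumption.)"""
    matrix = co_association(partitions)
    n = len(matrix)
    parent = list(range(n))  # union-find forest over the items

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


# Three toy clusterings of six items. The second uses swapped label names,
# the third places item 2 in the "wrong" group.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
consensus = consensus_partition(partitions)
print(consensus)
```

Because the co-association matrix only records whether two items share a label, the arbitrary label names used by each input clustering do not matter, and the single disagreement about item 2 is outvoted by the other two partitions.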

Author information

Authors and Affiliations

NeuRoNe Lab, Department of Informatics, University of Salerno, via Giovanni Paolo II 132, 84084, Fisciano, SA, Italy
Paola Galdi & Roberto Tagliaferri
Systems and Synthetic Biology Lab, Telethon Institute of Genetics and Medicine (TIGEM), via Pietro Castellino 111, 80131, Naples, Italy
Francesco Napolitano

Corresponding author

Correspondence to Paola Galdi.

Editor information

Editors and Affiliations

CUSSB, University "Vita-Salute" San Raffaele, Milano, Italy
Clelia Di Serio
The Computer Laboratory, University of Cambridge, Cambridge, United Kingdom
Pietro Liò
CUSSB, Università Vita-Salute San Raffaele, Milano, Italy
Alessandro Nonis
Dipartimento di Informatica, Università degli Studi di Salerno, Fisciano, Salerno, Italy
Roberto Tagliaferri

About this paper

Cite this paper

Galdi, P., Napolitano, F., Tagliaferri, R. (2015). Consensus Clustering in Gene Expression. In: Di Serio, C., Liò, P., Nonis, A., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2014. Lecture Notes in Computer Science, vol 8623. Springer, Cham. https://doi.org/10.1007/978-3-319-24462-4_5

DOI: https://doi.org/10.1007/978-3-319-24462-4_5
Published: 18 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24461-7
Online ISBN: 978-3-319-24462-4
eBook Packages: Computer Science, Computer Science (R0)
