An Algorithm to Assess the Reliability of Hierarchical Clusters in Gene Expression Data

Avogadri, Roberto; Brioschi, Matteo; Ruffino, Francesca; Ferrazzi, Fulvia; Beghini, Alessandro; Valentini, Giorgio

doi:10.1007/978-3-540-85567-5_95

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5179)

Included in the following conference series:

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems

Abstract

The validation of clusters discovered in bio-molecular data is a central issue in bioinformatics. Recently, stability-based methods have been successfully applied to the analysis of the reliability of clusterings characterized by a relatively low number of examples and clusters. Nevertheless, several problems in functional genomics are characterized by a very large number of examples and clusters. We present a stability-based algorithm to discover significant clusters in hierarchical clusterings with a large number of examples and clusters. Preliminary results on gene expression data of patients affected by Human Myeloid Leukemia show how to apply the proposed method when thousands of gene clusters are involved.
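
The abstract does not spell out the algorithm itself, so the following is only a generic sketch of the stability idea it refers to, not the authors' method. It assumes a perturbation by random subsampling of the examples, a naive average-linkage clustering cut at a fixed number of clusters k, and a per-cluster score equal to the mean best Jaccard overlap between each reference cluster and the clusters found on the perturbed data. All function names and parameters (`average_linkage_clusters`, `cluster_stability`, `n_runs`, `fraction`) are hypothetical and chosen for illustration.

```python
import random
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage_clusters(points, k):
    """Naive agglomerative clustering (average linkage), stopped at k clusters.

    points: dict mapping an example id to its feature vector.
    Returns a list of frozensets of ids. Far too slow for thousands of
    genes; it is meant only to keep the sketch self-contained.
    """
    clusters = [[i] for i in points]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = sum(euclidean(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            d /= len(clusters[a]) * len(clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best  # a < b, so deleting index b leaves index a valid
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [frozenset(c) for c in clusters]


def jaccard(s, t):
    """Jaccard similarity of two sets (0.0 if both are empty)."""
    union = len(s | t)
    return len(s & t) / union if union else 0.0


def cluster_stability(points, k, n_runs=20, fraction=0.8, seed=0):
    """Score each reference cluster by its reproducibility under subsampling.

    Assumption: perturbation = random subsample of the examples. Other
    perturbations (bootstrap, noise injection, random projections) could be
    substituted. Returns a list of (cluster, score) pairs, score in [0, 1].
    """
    rng = random.Random(seed)
    reference = average_linkage_clusters(points, k)
    ids = sorted(points)
    totals = [0.0] * len(reference)
    counts = [0] * len(reference)
    for _ in range(n_runs):
        sample = rng.sample(ids, max(k, int(fraction * len(ids))))
        kept = set(sample)
        perturbed = average_linkage_clusters({i: points[i] for i in sample}, k)
        for idx, ref in enumerate(reference):
            restricted = ref & kept  # compare only on examples that were sampled
            if not restricted:
                continue
            totals[idx] += max(jaccard(restricted, p) for p in perturbed)
            counts[idx] += 1
    return [(ref, totals[i] / counts[i] if counts[i] else float("nan"))
            for i, ref in enumerate(reference)]
```

A score close to 1 suggests that a cluster reappears almost unchanged across perturbed clusterings, while a low score suggests it may be an artifact of the particular sample. For data sets with thousands of examples, an optimized hierarchical clustering routine would replace the naive loop used here.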

Author information

Authors and Affiliations

DSI - Dip. Scienze dell'Informazione, Università degli Studi di Milano, Italy
Roberto Avogadri, Francesca Ruffino & Giorgio Valentini
DBioGen - Dip. Biologia e Genetica per le Scienze Mediche, Università degli Studi di Milano, Italy
Matteo Brioschi & Alessandro Beghini
Dip. Informatica e Sistemistica, Università degli Studi di Pavia, Italy
Fulvia Ferrazzi

Editor information

Ignac Lovrek, Robert J. Howlett, Lakhmi C. Jain

Cite this paper

Avogadri, R., Brioschi, M., Ruffino, F., Ferrazzi, F., Beghini, A., Valentini, G. (2008). An Algorithm to Assess the Reliability of Hierarchical Clusters in Gene Expression Data. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2008. Lecture Notes in Computer Science, vol 5179. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85567-5_95

DOI: https://doi.org/10.1007/978-3-540-85567-5_95
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85566-8
Online ISBN: 978-3-540-85567-5
eBook Packages: Computer Science, Computer Science (R0)
