Clustering Genes Using Heterogeneous Data Sources

Erliang Zeng, Chengyong Yang, Tao Li, Giri Narasimhan
Copyright: © 2010 | Volume: 1 | Issue: 2 | Pages: 17
ISSN: 1947-9115 | EISSN: 1947-9123 | EISBN13: 9781609604479 | DOI: 10.4018/jkdb.2010040102
Cite Article

MLA

Zeng, Erliang, et al. "Clustering Genes Using Heterogeneous Data Sources." IJKDB vol.1, no.2 2010: pp.12-28. http://doi.org/10.4018/jkdb.2010040102

APA

Zeng, E., Yang, C., Li, T., & Narasimhan, G. (2010). Clustering Genes Using Heterogeneous Data Sources. International Journal of Knowledge Discovery in Bioinformatics (IJKDB), 1(2), 12-28. http://doi.org/10.4018/jkdb.2010040102

Chicago

Zeng, Erliang, et al. "Clustering Genes Using Heterogeneous Data Sources," International Journal of Knowledge Discovery in Bioinformatics (IJKDB) 1, no.2: 12-28. http://doi.org/10.4018/jkdb.2010040102


Abstract

Clustering of gene expression data is a standard exploratory technique used to identify closely related genes. Many other sources of data are also likely to be of great assistance in the analysis of gene expression data. Such data provide a means to begin elucidating the large-scale modular organization of the cell. The authors consider the challenging task of developing exploratory analytical techniques that deal with multiple complete and incomplete information sources. The Multi-Source Clustering (MSC) algorithm they developed performs clustering with multiple, but complete, sources of data. To deal with incomplete data sources, the authors adopted the MPCK-means clustering algorithm to perform exploratory analysis on one complete source together with other, potentially incomplete, sources provided in the form of constraints. This paper makes three contributions: it presents a new clustering algorithm, MSC, for exploratory analysis using two or more diverse but complete data sources; it studies the effectiveness of constraint sets and the robustness of the constrained clustering algorithm using multiple sources of incomplete biological data; and it incorporates such incomplete data into the constrained clustering algorithm in the form of constraint sets.
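
The abstract does not say how an incomplete source is turned into constraints, so the following is only an illustrative sketch under stated assumptions, not the authors' procedure. It assumes a hypothetical annotation source that covers only some genes, and it assumes that two covered genes sharing a label yield a must-link pair while two covered genes sharing no label yield a cannot-link pair. Pairs involving an uncovered gene yield no constraint, which is how the incompleteness shows up. The function name and the toy data are invented for illustration.

```python
from itertools import combinations


def constraints_from_annotations(genes, annotations):
    """Build pairwise constraints from a possibly incomplete annotation source.

    genes: list of gene ids present in the complete source (e.g. expression data).
    annotations: dict mapping a gene id to a set of labels; genes missing from
    the dict are not covered by this source (hypothetical data layout).
    """
    must_link, cannot_link = [], []
    for a, b in combinations(genes, 2):
        # Incomplete source: without labels for both genes there is no evidence.
        if a not in annotations or b not in annotations:
            continue
        if annotations[a] & annotations[b]:
            must_link.append((a, b))    # assumed rule: shared label
        else:
            cannot_link.append((a, b))  # assumed rule: no shared label
    return must_link, cannot_link


# Toy data, invented for illustration: g4 has no annotation.
genes = ["g1", "g2", "g3", "g4"]
annotations = {
    "g1": {"ribosome"},
    "g2": {"ribosome", "nucleus"},
    "g3": {"membrane"},
}
must_link, cannot_link = constraints_from_annotations(genes, annotations)
print(must_link)
print(cannot_link)
```

Constraint sets of this shape could then be passed to a constrained clustering routine alongside the complete source. The routine itself is outside the scope of this sketch.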
