research-article

F-statistics algorithm for gene clustering evaluation

Authors:
Mohamad Qayoom, University of New Orleans, New Orleans, LA
Qi Zhang, University of New Orleans, New Orleans, LA
Christopher Taylor, University of New Orleans, New Orleans, LA

BCB '10: Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology, August 2010, Pages 490–492
https://doi.org/10.1145/1854776.1854867

Published: 02 August 2010

ABSTRACT

An enormous amount of microarray data has been generated and archived for a large variety of biological studies, such as studies of gene expression. Many clustering algorithms have been proposed for analyzing gene expression data, but very few techniques have been developed to evaluate those algorithms. A clustering evaluation method measures how similar the members of the same cluster are to one another and how different they are from the members of other clusters. We propose a new clustering evaluation technique, the F-Statistics Algorithm for Clustering Evaluation (FACE), which uses both inter-cluster and intra-cluster distances and can be used to improve the performance of clustering methods. We describe and evaluate FACE in the context of bioinformatics clustering by comparing it with existing evaluation measures on a set of yeast data. The results show that FACE is more stable than the existing measures and leads to better conclusions.
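
The abstract does not give the FACE formula, so the following is only an illustrative sketch of the general idea of scoring a clustering from both inter-cluster and intra-cluster distances. It assumes the classical one-way ANOVA F-ratio (between-cluster mean square divided by within-cluster mean square, using squared Euclidean distances); the function names `sq_dist` and `f_ratio` and the toy expression profiles are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def f_ratio(points, labels):
    """ANOVA-style F-ratio of a clustering (an assumption, not the paper's FACE).

    Larger values mean clusters are far apart (inter-cluster) relative to
    their internal spread (intra-cluster).
    """
    n = len(points)
    if n != len(labels):
        raise ValueError("points and labels must have the same length")

    # Group the points by their cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")

    dim = len(points[0])
    grand_mean = [sum(p[d] for p in points) / n for d in range(dim)]

    between = 0.0  # inter-cluster sum of squares
    within = 0.0   # intra-cluster sum of squares
    for members in clusters.values():
        size = len(members)
        centroid = [sum(p[d] for p in members) / size for d in range(dim)]
        between += size * sq_dist(centroid, grand_mean)
        within += sum(sq_dist(p, centroid) for p in members)

    if within == 0.0:
        return float("inf")  # every point coincides with its centroid
    return (between / (k - 1)) / (within / (n - k))


# Toy "expression profiles": two tight, well-separated groups.
profiles = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(f_ratio(profiles, [0, 0, 1, 1]))  # grouping that matches the structure
print(f_ratio(profiles, [0, 1, 0, 1]))  # grouping that mixes the two groups
```

Under these assumptions, a labeling that matches the structure of the data scores higher than one that mixes the groups, which is the behavior an evaluation measure of this kind is meant to expose.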


Index Terms

Computing methodologies > Machine learning > Learning paradigms > Unsupervised learning > Cluster analysis


Published in
BCB '10: Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology
August 2010
705 pages
ISBN: 9781450304382
DOI: 10.1145/1854776
General Chairs: Aidong Zhang (SUNY at Buffalo), Mark Borodovsky (Georgia Tech)
Program Chairs: Gultekin Ozsoyoglu (Case Western Reserve University), Armin Mikler (University of North Texas)
Copyright © 2010 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
clustering
evaluation
gene expression
microarray
Acceptance Rates
Overall Acceptance Rate: 254 of 885 submissions, 29%
