ABSTRACT
A large amount of microarray data has been generated and archived for a wide variety of biological studies, such as gene expression analysis. Many clustering algorithms have been proposed for analyzing gene expression data, but very few techniques have been developed to evaluate those algorithms. A clustering evaluation method measures the degree of similarity between members of the same cluster and between members of different clusters. We propose a new clustering evaluation technique, the F-Statistics Algorithm for Clustering Evaluation (FACE), which uses both inter-cluster and intra-cluster distances and can be used to improve the performance of clustering methods. We describe FACE in the context of bioinformatics clustering and evaluate it by comparison with existing evaluation measures on a set of yeast data. The results show that FACE is more stable and supports better conclusions than the existing measures.
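The abstract does not give the FACE formula, so the following is only a minimal sketch of the general idea of an F-statistic-style score that combines inter-cluster and intra-cluster distances. It assumes the classical ANOVA-style pseudo-F ratio (between-cluster dispersion over within-cluster dispersion, each divided by its degrees of freedom) and squared Euclidean distance. The function name `pseudo_f` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def pseudo_f(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Generic pseudo-F ratio for a clustering (assumption: not the FACE formula).

    Larger values mean clusters are far apart relative to their internal spread.
    """
    n = len(points)
    dim = len(points[0])

    # Group the points by cluster label.
    clusters: Dict[int, List[Sequence[float]]] = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")

    def mean(rows: Sequence[Sequence[float]]) -> List[float]:
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]

    def sqdist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    grand_mean = mean(points)
    between = 0.0  # inter-cluster dispersion: centroids vs. grand mean
    within = 0.0   # intra-cluster dispersion: members vs. their own centroid
    for rows in clusters.values():
        centroid = mean(rows)
        between += len(rows) * sqdist(centroid, grand_mean)
        within += sum(sqdist(r, centroid) for r in rows)

    if within == 0.0:
        return float("inf")  # every cluster is a single repeated point
    return (between / (k - 1)) / (within / (n - k))


# Toy expression profiles (hypothetical): two well-separated groups.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
good_labels = [0, 0, 1, 1]  # matches the visible grouping
bad_labels = [0, 1, 0, 1]   # mixes the two groups

print(pseudo_f(points, good_labels), pseudo_f(points, bad_labels))
```

Under this score, the labeling that matches the visible grouping gets a much higher value than the mixed labeling, which is the kind of comparison an evaluation measure is meant to support when ranking clustering results.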
Index Terms
- F-statistics algorithm for gene clustering evaluation