Abstract
A similarity measure for gene expression data should capture the shape of the expression patterns and should be robust to outliers. In this paper, we present such a similarity measure for clustering gene expression time series data. Our measure, PWCTM, uses the pairwise changing tendency of every pair of conditions. We compare PWCTM with several proximity measures using the k-means clustering algorithm, in terms of Silhouette index, z-score, and p-value. Our experimental results indicate that the gene clusters obtained with PWCTM as the similarity measure are biologically significant, as shown by their low p-values and high z-scores.
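The abstract does not give the formula for PWCTM, so the following is only a minimal sketch of the general idea of comparing changing tendencies over every pair of conditions. The function names (`tendency`, `tendency_profile`, `tendency_similarity`), the tolerance parameter `tol`, and the agreement-fraction score are assumptions made for illustration and are not the authors' definition.

```python
from itertools import combinations


def tendency(a, b, tol=0.0):
    """Direction of change from a to b: +1 (up), -1 (down), 0 (flat)."""
    diff = b - a
    if diff > tol:
        return 1
    if diff < -tol:
        return -1
    return 0


def tendency_profile(expr, tol=0.0):
    """Direction of change for every pair of conditions (i < j)."""
    pairs = combinations(range(len(expr)), 2)
    return [tendency(expr[i], expr[j], tol) for i, j in pairs]


def tendency_similarity(g1, g2, tol=0.0):
    """Hypothetical score: fraction of condition pairs on which the
    two genes change in the same direction (assumption, not PWCTM)."""
    if len(g1) != len(g2):
        raise ValueError("expression vectors must have equal length")
    p1 = tendency_profile(g1, tol)
    p2 = tendency_profile(g2, tol)
    if not p1:
        raise ValueError("need at least two conditions")
    agree = sum(1 for a, b in zip(p1, p2) if a == b)
    return agree / len(p1)


# Toy expression vectors (made up for illustration only).
gene_a = [1.0, 2.0, 3.0, 2.5]
gene_b = [10.0, 20.0, 90.0, 25.0]  # same shape, larger scale, one extreme value
gene_c = [3.0, 2.0, 1.0, 1.5]      # mirrored shape

print(tendency_similarity(gene_a, gene_b))
print(tendency_similarity(gene_a, gene_c))
```

Because only the direction of each pairwise change is used, a score of this kind depends on the shape of the profile rather than on magnitudes, which is one way a measure can be less sensitive to a single extreme value.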
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Baishya, R.C., Sarmah, R., Bhattacharyya, D.K., Dutta, M.A. (2014). A Similarity Measure for Clustering Gene Expression Data. In: Gupta, P., Zaroliagis, C. (eds) Applied Algorithms. ICAA 2014. Lecture Notes in Computer Science, vol 8321. Springer, Cham. https://doi.org/10.1007/978-3-319-04126-1_21
DOI: https://doi.org/10.1007/978-3-319-04126-1_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04125-4
Online ISBN: 978-3-319-04126-1
eBook Packages: Computer Science (R0)