A Similarity Measure for Clustering Gene Expression Data

Baishya, Ram Charan; Sarmah, Rosy; Bhattacharyya, Dhruba Kumar; Dutta, Malay Ananda

doi:10.1007/978-3-319-04126-1_21

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8321)

Included in the following conference series:

International Conference on Applied Algorithms

Abstract

A similarity measure for gene expression data should capture the shapes of the expression patterns and should be less susceptible to outliers. In this paper, we present a similarity measure for clustering gene expression time series data. Our similarity measure, PWCTM, uses the pairwise changing tendency of every pair of conditions. We compared PWCTM with several proximity measures using the k-means clustering algorithm, in terms of Silhouette index, z-score and p-value. Our experimental results indicate that the gene clusters obtained with PWCTM as the similarity measure are biologically significant, as shown by their low p-values and high z-scores.
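
The abstract does not give the PWCTM formula, so the following is only a minimal sketch of one plausible reading of a "pairwise changing tendency" comparison, not the authors' definition. It assumes that the tendency between two conditions is the sign of the expression change (up, down, or flat) and that similarity is the fraction of condition pairs on which two genes agree. The function names and the `tol` parameter are hypothetical.

```python
from itertools import combinations


def tendency_signs(profile, tol=0.0):
    """Return +1, 0 or -1 for the change between every pair of conditions i < j.

    ASSUMPTION: "changing tendency" is taken to mean the sign of the
    difference; changes with magnitude <= tol count as flat (0).
    """
    signs = []
    for i, j in combinations(range(len(profile)), 2):
        diff = profile[j] - profile[i]
        if diff > tol:
            signs.append(1)
        elif diff < -tol:
            signs.append(-1)
        else:
            signs.append(0)
    return signs


def pairwise_tendency_similarity(a, b, tol=0.0):
    """Fraction of condition pairs on which two profiles change the same way.

    Illustrative stand-in only; this is not the published PWCTM formula.
    """
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of conditions")
    if len(a) < 2:
        raise ValueError("need at least two conditions")
    signs_a = tendency_signs(a, tol)
    signs_b = tendency_signs(b, tol)
    agreements = sum(1 for x, y in zip(signs_a, signs_b) if x == y)
    return agreements / len(signs_a)


# Made-up profiles over four conditions (not data from the paper).
gene_a = [1.0, 2.0, 3.0, 2.5]
gene_scaled = [10.0, 20.0, 30.0, 25.0]   # same shape, different magnitude
gene_mirror = [3.0, 2.0, 1.0, 1.5]       # opposite shape
gene_outlier = [1.0, 2.0, 100.0, 2.5]    # gene_a with one extreme value

sim_scaled = pairwise_tendency_similarity(gene_a, gene_scaled)
sim_mirror = pairwise_tendency_similarity(gene_a, gene_mirror)
sim_outlier = pairwise_tendency_similarity(gene_a, gene_outlier)
print(sim_scaled, sim_mirror, sim_outlier)
```

Under these assumptions the score depends only on the direction of change, so a rescaled profile or one with a single extreme value scores the same as the original, which is the kind of shape sensitivity and outlier tolerance the abstract asks of a similarity measure.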

Author information

Authors and Affiliations

Dept. of CSE, Tezpur University, Tezpur, Assam, India
Ram Charan Baishya, Rosy Sarmah, Dhruba Kumar Bhattacharyya & Malay Ananda Dutta

Editor information

Editors and Affiliations

Computer Science and Engineering, Heritage Institute of Technology, Chowbaga Road, Anandapur, 700107, Kolkata, India
Prosenjit Gupta
Department of Computer Engineering and Informatics, University of Patras, 26500, Patras, Greece
Christos Zaroliagis

About this paper

Cite this paper

Baishya, R.C., Sarmah, R., Bhattacharyya, D.K., Dutta, M.A. (2014). A Similarity Measure for Clustering Gene Expression Data. In: Gupta, P., Zaroliagis, C. (eds) Applied Algorithms. ICAA 2014. Lecture Notes in Computer Science, vol 8321. Springer, Cham. https://doi.org/10.1007/978-3-319-04126-1_21

DOI: https://doi.org/10.1007/978-3-319-04126-1_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04125-4
Online ISBN: 978-3-319-04126-1
eBook Packages: Computer Science, Computer Science (R0)
