Abstract
This paper introduces an enhanced version of Pearson’s correlation coefficient (PCC) for biclustering-based co-expression analysis. The modified measure, called the local Pearson correlation measure (LPCM), helps detect shifting, scaling, and shifting-and-scaling correlation patterns in gene expression data effectively, even in the presence of outliers. An LPCM-based biclustering technique, called the local correlation-based biclustering technique (LCBT), is also proposed to identify biclusters of high biological significance. The biclustering results are validated both statistically and biologically on benchmark gene expression data.
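To make the three pattern types concrete, the following minimal sketch computes the plain PCC (not LPCM, whose definition is not given in the abstract) on a hypothetical gene profile. The profile values, the shift and scale factors, and the helper name `pearson` are all illustrative assumptions. The sketch shows that PCC is invariant to shifting and positive scaling, and that a single outlier can sharply reduce it, which is the kind of sensitivity a locally computed measure would presumably aim to reduce.

```python
import numpy as np


def pearson(x, y):
    """Plain Pearson correlation coefficient between two 1-D profiles."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()  # centre each profile on its own mean
    yc = y - y.mean()
    return float(np.dot(xc, yc) / np.sqrt(np.dot(xc, xc) * np.dot(yc, yc)))


# Hypothetical expression profile of one gene across six conditions.
base = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])

shifted = base + 2.0               # shifting pattern: constant offset
scaled = 3.0 * base                # scaling pattern: constant positive factor
shifted_scaled = 3.0 * base + 2.0  # shifting-and-scaling pattern

r_shift = pearson(base, shifted)
r_scale = pearson(base, scaled)
r_shift_scale = pearson(base, shifted_scaled)

# Corrupt a single condition with an outlier (assumed value) and recompute.
noisy = shifted_scaled.copy()
noisy[0] = 60.0
r_outlier = pearson(base, noisy)

print(r_shift, r_scale, r_shift_scale, r_outlier)
```

The first three coefficients equal 1 up to floating-point error, while the outlier-corrupted profile scores far lower even though five of its six conditions still follow the shifting-and-scaling pattern exactly.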
Cite this article
Patowary, P., Sarmah, R. & Bhattacharyya, D.K. Developing an effective biclustering technique using an enhanced proximity measure. Netw Model Anal Health Inform Bioinforma 9, 6 (2020). https://doi.org/10.1007/s13721-019-0211-7
DOI: https://doi.org/10.1007/s13721-019-0211-7