Abstract
Effective biomarkers aid the early diagnosis and monitoring of breast cancer and thus play an important role in the treatment of patients with the disease. Growing evidence indicates that altered miRNA expression levels are one of the principal causes of cancer. We analyze breast cancer miRNA data to discover biclusters and breast cancer miRNA biomarkers that can help to better understand this disease and to inform clinical decisions on treatment and diagnosis. In this paper, we propose a pattern-based parallel biclustering algorithm termed Rank-Preserving Biclustering (RPBic). The key strategy is to identify rows whose ranks are preserved under a subset of columns, based on a modified version of the all-substrings common subsequence (ALCS) framework. To illustrate the effectiveness of RPBic, we consider synthetic datasets and show that it outperforms relevant biclustering algorithms in terms of relevance and recovery. For the breast cancer data, we identify 68 biclusters and establish that they have strong clinical characteristics among the samples. The differentially co-expressed miRNAs are found to be involved in KEGG cancer-related pathways. Moreover, we identify frequency-based biomarkers (hsa-miR-410, hsa-miR-483-5p) and network-based biomarkers (hsa-miR-454, hsa-miR-137), which we validate to have strong connectivity with breast cancer. The source code and the datasets used can be found at http://agnigarh.tezu.ernet.in/~rosy8/Bioinformatics_RPBic_Data.rar.
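The rank-preservation idea can be illustrated with a minimal sketch. This is not the RPBic algorithm and does not implement the modified ALCS framework; it swaps in a plain longest common subsequence (LCS) computation on two rows only, and the function names (`column_order`, `longest_common_subsequence`, `shared_rank_columns`) are hypothetical. Each row is converted to the sequence of column indices sorted by increasing expression value; a common subsequence of two such sequences is a set of columns under which both rows have the same ordering.

```python
from typing import List, Sequence


def column_order(row: Sequence[float]) -> List[int]:
    """Column indices sorted by increasing expression value.

    Assumption: ties are broken by column index (Python's sort is stable).
    """
    return sorted(range(len(row)), key=lambda j: row[j])


def longest_common_subsequence(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Classic O(len(a) * len(b)) dynamic programme; returns one LCS."""
    n, m = len(a), len(b)
    # dp[i][j] = length of the LCS of a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table forwards to recover one optimal subsequence.
    out: List[int] = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def shared_rank_columns(row_a: Sequence[float], row_b: Sequence[float]) -> List[int]:
    """Columns (in rank order) under which both rows are ordered identically."""
    return longest_common_subsequence(column_order(row_a), column_order(row_b))


# Toy expression values for two hypothetical miRNAs across four samples.
row_a = [1.0, 5.0, 3.0, 9.0]
row_b = [2.0, 8.0, 1.0, 10.0]
cols = shared_rank_columns(row_a, row_b)
```

For these toy rows, `cols` is `[2, 1, 3]`: both rows increase when read in the order column 2, column 1, column 3, so the two rows are rank-preserved under that column subset. A full biclustering method would have to extend this pairwise comparison to many rows at once, which is the part the sketch leaves out.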
Cite this article
Mandal, K., Sarmah, R., Bhattacharyya, D.K. et al. Rank-preserving biclustering algorithm: a case study on miRNA breast cancer. Med Biol Eng Comput 59, 989–1004 (2021). https://doi.org/10.1007/s11517-020-02271-0
DOI: https://doi.org/10.1007/s11517-020-02271-0