Abstract
Extracting knowledge from large biological datasets is one of the main challenges of bioinformatics. Several data mining techniques have been proposed for this purpose; in this work, we focus on biclustering, whose use has grown considerably in recent years. Biclustering aims to extract a set of genes that behave similarly under a set of conditions. In this paper, we propose an evolutionary biclustering algorithm and analyze its performance by varying its genetic components, which yields several versions of the algorithm. We then conduct an experimental study on two real microarray datasets and compare the results with those of other state-of-the-art biclustering algorithms. This study allows us to retain the best combination of operators among the choices tested.
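To make the notion of "genes with similar behavior under a condition set" concrete, the following is a minimal sketch, not the algorithm proposed in this paper. It assumes a bicluster is given as a list of row (gene) indices and a list of column (condition) indices, and it scores coherence with the mean squared residue, a measure commonly used in the biclustering literature; the abstract does not state which evaluation function the proposed algorithm uses. The function name and the toy matrix are hypothetical.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols.

    Lower values mean the selected genes vary more coherently across
    the selected conditions; 0 means a perfect additive pattern.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy expression matrix (made-up values): rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 1.0, 5.0, 2.0],
]

# Genes 0-2 follow the same trend under conditions 0-2 (shifted by a constant).
coherent = mean_squared_residue(expression, rows=[0, 1, 2], cols=[0, 1, 2])
# The full matrix has no such shared trend.
whole = mean_squared_residue(expression, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
print(coherent, whole)
```

In this toy case the three-gene, three-condition submatrix scores far lower than the full matrix. An evolutionary approach would search over such row and column subsets, with a measure of this kind as one possible ingredient of the fitness function.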
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Cite this article
Maâtouk, O., Ayadi, W., Bouziri, H. et al. Evolutionary biclustering algorithms: an experimental study on microarray data. Soft Comput 23, 7671–7697 (2019). https://doi.org/10.1007/s00500-018-3394-4
DOI: https://doi.org/10.1007/s00500-018-3394-4