BicFinder: a biclustering algorithm for microarray data analysis

  • Regular Paper
Knowledge and Information Systems

Abstract

In microarray data analysis, biclustering simultaneously identifies a maximum group of genes that show highly correlated expression patterns across a maximum group of experimental conditions (samples). This paper introduces BicFinder, a heuristic algorithm for extracting biclusters from microarray data. The BicFinder software is available at http://www.info.univ-angers.fr/pub/hao/BicFinder.html. BicFinder relies on a new evaluation function, the Average Correspondence Similarity Index (ACSI), to assess the coherence of a given bicluster, and uses a directed acyclic graph to construct its biclusters. Its performance is evaluated on synthetic data and on three DNA microarray datasets. We assess biological significance with a gene annotation web tool, showing that the algorithm produces biologically relevant biclusters. Experimental results show that BicFinder identifies coherent and overlapping biclusters.
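To make the notion of a bicluster concrete, the sketch below treats it as a subset of genes (rows) paired with a subset of conditions (columns) and scores how well the selected genes correlate over the selected conditions. The score used here, the mean pairwise Pearson correlation, is only an illustrative stand-in and is not the paper's ACSI, whose definition is not given in this abstract. The function names and the toy expression matrix are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_pairwise_correlation(matrix, genes, conditions):
    """Average Pearson correlation over all gene pairs, restricted to the chosen conditions.

    Illustrative coherence score only; this is NOT the ACSI function from the paper.
    """
    rows = [[matrix[g][c] for c in conditions] for g in genes]
    pairs = list(combinations(rows, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Toy expression matrix (made-up values): rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 4.0, 6.0, 1.0],
    [0.5, 1.5, 2.5, 4.0],
    [5.0, 1.0, 4.0, 2.0],
]

# Genes 0-2 rise together over conditions 0-2 but diverge on condition 3.
bicluster_score = mean_pairwise_correlation(expression, genes=[0, 1, 2], conditions=[0, 1, 2])
full_score = mean_pairwise_correlation(expression, genes=[0, 1, 2], conditions=[0, 1, 2, 3])
print(bicluster_score, full_score)
```

In this toy example the three genes are coherent only on the first three conditions, so restricting the columns gives a higher score than using all four. This is the sense in which a bicluster selects genes and conditions at the same time.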

Author information

Corresponding author

Correspondence to Wassim Ayadi.

About this article

Cite this article

Ayadi, W., Elloumi, M. & Hao, JK. BicFinder: a biclustering algorithm for microarray data analysis. Knowl Inf Syst 30, 341–358 (2012). https://doi.org/10.1007/s10115-011-0383-7

  • DOI: https://doi.org/10.1007/s10115-011-0383-7
