Ensemble biclustering gene expression data based on the spectral clustering

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Many biclustering algorithms and bicluster criteria have been proposed for analyzing gene expression data. However, there is no clear guidance on choosing a specific biclustering algorithm, which has drawn much attention to ensemble biclustering as a way to aggregate the advantages of various biclustering algorithms. Although the co-association consensus (COAC) method is a landmark in ensemble biclustering, its effectiveness and efficiency are the worst among state-of-the-art methods. In this paper, to improve COAC, we propose spectral ensemble biclustering (SEB), which introduces a novel method for generating a set of basic biclusters with better quality and higher diversity, and adopts a new consensus method for combining these basic biclusters. In SEB, spectral clustering is applied directly to the co-association matrix and equivalently transformed into weighted k-means. Experiments on six gene expression datasets demonstrate that SEB achieves the best effectiveness, efficiency and scalability among existing ensemble methods in terms of biological significance and runtime.
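
To make the co-association and weighted k-means idea concrete, the following is a minimal sketch and not the SEB implementation from the paper. It assumes a simplified setting in which each basic bicluster is reduced to its set of gene indices (conditions are ignored), and it assumes the weighted k-means form commonly used in spectral ensemble clustering: each gene is represented by its binary membership row scaled by its co-association degree, and that degree serves as the weight. The function names, the toy data and the fixed initial centers are all hypothetical, and every gene is assumed to belong to at least one basic bicluster so that no degree is zero.

```python
def membership_matrix(n_genes, basic_biclusters):
    """Binary matrix B: B[g][b] = 1 if gene g belongs to basic bicluster b."""
    return [[1.0 if g in genes else 0.0 for genes in basic_biclusters]
            for g in range(n_genes)]


def co_association(B):
    """S[i][j] = number of basic biclusters that contain both gene i and gene j."""
    n = len(B)
    return [[sum(a * b for a, b in zip(B[i], B[j])) for j in range(n)]
            for i in range(n)]


def weighted_kmeans(points, weights, init_centers, n_iter=20):
    """Lloyd-style k-means in which each point contributes with its weight."""
    centers = [list(c) for c in init_centers]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(len(centers)),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Update step: weighted mean of the points assigned to each center.
        for c in range(len(centers)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            total = sum(weights[i] for i in members)
            if total > 0:
                centers[c] = [
                    sum(weights[i] * points[i][d] for i in members) / total
                    for d in range(len(points[0]))
                ]
    return labels


# Toy ensemble (hypothetical): six genes, six basic biclusters given as gene sets.
basic_biclusters = [{0, 1, 2}, {0, 1}, {1, 2}, {3, 4, 5}, {3, 4}, {4, 5}]
B = membership_matrix(6, basic_biclusters)
S = co_association(B)

# Degree of each gene in the co-association graph (assumed non-zero here).
degrees = [sum(row) for row in S]

# Scaled membership rows are the points; degrees are the weights.
points = [[v / d for v in row] for row, d in zip(B, degrees)]
labels = weighted_kmeans(points, degrees, init_centers=[points[0], points[3]])
print(labels)
```

Working on the membership rows avoids an eigendecomposition of the full co-association matrix, which is the kind of saving the abstract's efficiency claim points to. How SEB generates its basic biclusters and handles conditions is described in the full article and is not reproduced here.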

Acknowledgements

This research was supported in part by the National Natural Science Foundation of China (NSFC) under grant 60903074 and the National High Technology Research and Development Program of China (863 Program) under grant 2008AA01Z119.

Author information

Corresponding author

Correspondence to Yongguo Liu.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

About this article

Cite this article

Yin, L., Liu, Y. Ensemble biclustering gene expression data based on the spectral clustering. Neural Comput & Applic 30, 2403–2416 (2018). https://doi.org/10.1007/s00521-016-2819-1

  • DOI: https://doi.org/10.1007/s00521-016-2819-1
