A new FCA-based method for identifying biclusters in gene expression data

Houari, Amina; Ayadi, Wassim; Ben Yahia, Sadok

doi:10.1007/s13042-018-0794-9

Original Article
Published: 07 March 2018

Volume 9, pages 1879–1893, (2018)
International Journal of Machine Learning and Cybernetics


Abstract

Biclustering has proven highly relevant to gene expression data analysis. Its main strength lies in its ability to identify groups of genes that behave in the same way under a subset of samples (conditions). However, the pioneering algorithms in the literature have shown limits in terms of the quality of the biclusters they unveil. In this paper, we introduce a new algorithm, called BiFCA+, for biclustering microarray data. BiFCA+ relies heavily on the mathematical background of formal concept analysis to extract the set of biclusters. In addition, the Bond correlation measure is used to filter out overlapping biclusters. Extensive experiments, carried out on real-life datasets, shed light on BiFCA+’s ability to identify statistically and biologically significant biclusters.
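
To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the BiFCA+ implementation. It assumes a toy binary context (genes by conditions) and shows a brute-force enumeration of formal concepts together with the bond measure as it is commonly defined, i.e. conjunctive support divided by disjunctive support. The data, the function names, and the way the measure is applied to a set of conditions are illustrative assumptions; the abstract does not specify how BiFCA+ discretizes the expression matrix or applies the measure to pairs of biclusters.

```python
from itertools import combinations

# Hypothetical toy context: each gene maps to the conditions under which
# it is considered "active" after some (unspecified) discretization.
CONTEXT = {
    "g1": {"c1", "c2"},
    "g2": {"c1", "c2", "c3"},
    "g3": {"c2", "c3"},
}


def extent(conditions, context):
    """Return the genes that have every condition in `conditions`."""
    return {g for g, cs in context.items() if conditions <= cs}


def intent(genes, context):
    """Return the conditions shared by every gene in `genes`."""
    if not genes:
        # By convention, the empty set of genes shares all conditions.
        return set().union(*context.values())
    return set.intersection(*(context[g] for g in genes))


def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every condition subset.

    Exponential in the number of conditions; suitable for toy data only.
    """
    all_conditions = sorted(set().union(*context.values()))
    concepts = set()
    for size in range(len(all_conditions) + 1):
        for subset in combinations(all_conditions, size):
            ext = extent(set(subset), context)
            concepts.add((frozenset(ext), frozenset(intent(ext, context))))
    return concepts


def bond(conditions, context):
    """Bond of a condition set: conjunctive support / disjunctive support."""
    conjunctive = sum(1 for cs in context.values() if conditions <= cs)
    disjunctive = sum(1 for cs in context.values() if conditions & cs)
    return conjunctive / disjunctive if disjunctive else 0.0


if __name__ == "__main__":
    for ext, itt in sorted(formal_concepts(CONTEXT), key=lambda c: sorted(c[1])):
        print(sorted(ext), sorted(itt), round(bond(set(itt), CONTEXT), 3))
```

Each formal concept pairs a maximal set of genes with the maximal set of conditions they share, which is what makes it a natural candidate bicluster on binary data. A bond value close to 1 indicates that the conditions almost always occur together across genes; a threshold on such a score is one plausible way to discard weak or redundant candidates.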

Notes

We use a separator-free abbreviated form for the sets; e.g., $\{I_{1}I_{2}I_{3}\}$ stands for the set of items $\{I_{1}, I_{2}, I_{3}\}$.
This may be either monotone increasing, monotone decreasing, up–down or down–up, etc.
Available at https://github.com/mehdi-kaytoue/trimax.
Available at http://arep.med.harvard.edu/biclustering/.
Available at http://www.tik.ethz.ch/sop/bimax/.
Available at http://arep.med.harvard.edu/biclustering/.
The human B-cell lymphoma dataset version that we have does not contain the names of genes to perform other tests.
Available at http://llama.mshri.on.ca/funcassociate/
http://geneontology.org/
Available at http://db.yeastgenome.org/cgi-bin/GO/goTermFinder
The adjusted significance scores assess genes in each bicluster, which indicates how well they match with the different GO categories.
http://geneontology.org/

Author information

Authors and Affiliations

Faculty of Sciences of Tunis, University of Tunis El Manar, LIPAH-LR11ES14, 2092, Tunis, Tunisia
Amina Houari & Sadok Ben Yahia
National Higher Engineering School of Tunis, University of Tunis, LaTICE-LR11ES04, 1008, Tunis, Tunisia
Wassim Ayadi

Corresponding author

Correspondence to Wassim Ayadi.

About this article

Cite this article

Houari, A., Ayadi, W. & Ben Yahia, S. A new FCA-based method for identifying biclusters in gene expression data. Int. J. Mach. Learn. & Cyber. 9, 1879–1893 (2018). https://doi.org/10.1007/s13042-018-0794-9

Received: 13 July 2015
Accepted: 26 February 2018
Published: 07 March 2018
Issue Date: November 2018
DOI: https://doi.org/10.1007/s13042-018-0794-9
