An improved combinatorial biclustering algorithm

Nosova, Ekaterina; Napolitano, Francesco; Amato, Roberto; Cocozza, Sergio; Miele, Gennaro; Raiconi, Giancarlo; Tagliaferri, Roberto

doi:10.1007/s00521-012-0902-9

Original Article
Published: 10 March 2012

Volume 22, pages 293–302, (2013)

Neural Computing and Applications

Abstract

DNA microarray analysis is an important technology in genetic research for exploring and recognizing possible genomic features of many diseases. Because it is a high-throughput technology, it requires advanced tools for dimensionality reduction in massive data sets. Clustering is among the most appropriate tools for mining these data, although it suffers from the following problems: instability of the results, a large number of genes compared with the number of samples, a high noise level, complexity of initialization, and the difficulty of grouping genes and samples simultaneously. Almost all of these problems can be addressed by newer techniques such as biclustering. In this paper, a new biclustering algorithm, hereafter denoted the combinatorial biclustering algorithm (CBA), is proposed to address the problems listed above. The algorithm analyzes the data to find biclusters of a desired size and allowable error. The performance of CBA is compared with that of other biclustering algorithms by running each method on a synthetic data set and discussing the outputs. CBA seems to perform better, and for this reason it has also been applied to a real data set. In particular, CBA has been used to analyze the transcriptional profiles of 38 gastric cancer tissues with microsatellite instability (MSI) and without it (microsatellite stable, MSS). The results clearly show much more coherent gene expression behavior in normal tissues than in tumoral ones. The high level of gene misregulation in tumoral tissues affects any further bicluster analysis, and it is only partially smoothed in the MSI/MSS study, even when a much higher initial admissible error is allowed.
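
The abstract does not say how CBA scores a candidate bicluster or how it searches for one. Purely to illustrate the problem setting (a fixed bicluster size and a bound on the error), the sketch below assumes the mean squared residue introduced by Cheng and Church for biclustering expression data, together with exhaustive enumeration of row and column subsets. Both choices, the toy matrix, and the function names `mean_squared_residue` and `brute_force_biclusters` are assumptions made for this example and should not be read as the authors' method.

```python
from itertools import combinations


def mean_squared_residue(matrix, rows, cols):
    """Cheng-Church mean squared residue of the submatrix rows x cols.

    The score is 0 for a perfectly additive pattern (each row equals
    another row plus a constant) and grows as the pattern degrades.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    total_mean = sum(row_means) / n_rows
    residue = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            diff = sub[i][j] - row_means[i] - col_means[j] + total_mean
            residue += diff ** 2
    return residue / (n_rows * n_cols)


def brute_force_biclusters(matrix, n_rows, n_cols, max_error):
    """Return every n_rows x n_cols submatrix with residue <= max_error.

    Exhaustive search: only feasible for very small matrices.
    """
    found = []
    for rows in combinations(range(len(matrix)), n_rows):
        for cols in combinations(range(len(matrix[0])), n_cols):
            if mean_squared_residue(matrix, rows, cols) <= max_error:
                found.append((rows, cols))
    return found


# Toy expression matrix (rows: genes, columns: samples), invented for
# illustration. Rows 0-2 follow an additive pattern on columns 0-2.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [5.0, 6.0, 7.0, 0.0],
    [7.0, 1.0, 8.0, 4.0],
]

biclusters = brute_force_biclusters(data, n_rows=3, n_cols=3, max_error=1e-9)
print(biclusters)
```

The number of candidate submatrices grows combinatorially with the matrix size, so an exhaustive search like this one is useful only as a reference on toy data.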

Acknowledgments

This work is supported by Istituto Nazionale di Alta Matematica Francesco Severi (INdAM) with the scholarship N U 2007/000458 07/09/2007.

Author information

Authors and Affiliations

Dipartimento di Informatica, Università degli Studi di Salerno, Fisciano, Salerno, Italy
Ekaterina Nosova, Francesco Napolitano, Giancarlo Raiconi & Roberto Tagliaferri
Dipartimento di Scienze Fisiche, Università degli Studi di Napoli “Federico II”, Naples, Italy
Roberto Amato & Gennaro Miele
Dipartimento di Biologia e Patologia Cellulare e Molecolare “L. Califano”, Università degli Studi di Napoli “Federico II”, Naples, Italy
Sergio Cocozza

Corresponding author

Correspondence to Francesco Napolitano.

About this article

Cite this article

Nosova, E., Napolitano, F., Amato, R. et al. An improved combinatorial biclustering algorithm. Neural Comput & Applic 22 (Suppl 1), 293–302 (2013). https://doi.org/10.1007/s00521-012-0902-9

Received: 05 July 2011
Accepted: 25 February 2012
Published: 10 March 2012
Issue Date: May 2013
DOI: https://doi.org/10.1007/s00521-012-0902-9
