Pattern-Based Biclustering with Constraints for Gene Expression Data Analysis

  • Conference paper
Progress in Artificial Intelligence (EPIA 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9273)

Abstract

Biclustering has been widely applied to gene expression data analysis. In recent years, a clearer understanding of the synergies between pattern mining and biclustering has given rise to a new class of biclustering algorithms, referred to as pattern-based biclustering. These algorithms are able to discover exhaustive structures of biclusters with flexible coherency and quality. Background knowledge has also been increasingly applied in biological data analysis to guarantee relevant results. In this context, despite numerous contributions from domain-driven pattern mining, there is not yet a solid view on whether and how background knowledge can be applied to guide pattern-based biclustering tasks.
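
To make the link between pattern mining and biclustering concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a two-level discretization and a brute-force search over column subsets, which is only workable on toy data. The names `discretize`, `constant_biclusters`, `cut`, `min_rows` and `min_cols` are hypothetical. Each row (gene) is read as a transaction of discretized values, and rows that share the same symbols on a subset of columns (conditions) form a bicluster.

```python
from itertools import combinations

def discretize(matrix, cut=0.0):
    """Map each value to 1 if above `cut`, else 0 (an assumed two-symbol alphabet)."""
    return [[1 if v > cut else 0 for v in row] for row in matrix]

def constant_biclusters(matrix, min_rows=2, min_cols=2, cut=0.0):
    """Group rows that share the same symbols on a subset of columns.

    Brute force over column subsets: acceptable for a toy matrix only.
    Returns a list of (rows, columns, pattern) triples.
    """
    symbols = discretize(matrix, cut)
    n_cols = len(symbols[0])
    found = []
    for size in range(min_cols, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            groups = {}
            for r, row in enumerate(symbols):
                pattern = tuple(row[c] for c in cols)
                groups.setdefault(pattern, []).append(r)
            for pattern, rows in groups.items():
                if len(rows) >= min_rows:  # support threshold on genes
                    found.append((rows, list(cols), pattern))
    return found

# Toy expression matrix: 3 genes (rows) x 3 conditions (columns).
expr = [
    [1.2, 0.8, -0.5],
    [1.0, 0.9, 0.7],
    [-0.3, 1.1, -0.9],
]
biclusters = constant_biclusters(expr)
for rows, cols, pattern in biclusters:
    print(rows, cols, pattern)
```

Real pattern-based biclustering algorithms replace the exhaustive loop with frequent pattern miners and support richer coherency assumptions; the sketch only shows how a pattern and its supporting rows define a bicluster.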

In this work, we extend pattern-based biclustering algorithms to effectively seize efficiency gains in the presence of constraints. Furthermore, we illustrate how constraints with succinct, (anti-)monotone and convertible properties can be derived from knowledge repositories and user expectations. Experimental results show the importance of incorporating background knowledge within pattern-based biclustering to foster efficiency and guarantee non-trivial yet biologically relevant solutions.
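
As an illustration of why constraint properties matter for efficiency, here is a hedged sketch of a level-wise itemset search that pushes an anti-monotone constraint into the search. It is not the method of the paper; the function `mine`, the `weight` table and the budget of 3 are assumptions made for the example. An anti-monotone constraint is one where, once an itemset violates it, every superset violates it too, so the search can stop extending that itemset.

```python
from itertools import combinations

def mine(transactions, min_support, anti_monotone=lambda items: True):
    """Level-wise frequent itemset search with anti-monotone pruning.

    An itemset that fails `anti_monotone` is never extended, because
    none of its supersets can satisfy the constraint either.
    """
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    result = {}
    while level:
        survivors = []
        for cand in level:
            if not anti_monotone(cand):
                continue  # pruned together with all of its supersets
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:  # minimum support is itself anti-monotone
                result[cand] = support
                survivors.append(cand)
        # Join survivors that differ in exactly one item to build the next level.
        level = list({a | b for a, b in combinations(survivors, 2)
                      if len(a | b) == len(a) + 1})
    return result

# Toy data: each transaction lists the conditions where a gene is over-expressed.
tx = [{"c1", "c2", "c3"}, {"c1", "c2"}, {"c1", "c3"}, {"c2", "c3"}]

# Hypothetical per-condition weights; "total weight <= 3" is anti-monotone
# because the weights are non-negative.
weight = {"c1": 1, "c2": 2, "c3": 3}
unconstrained = mine(tx, min_support=2)
constrained = mine(tx, min_support=2,
                   anti_monotone=lambda s: sum(weight[i] for i in s) <= 3)
print(len(unconstrained), len(constrained))
```

A succinct constraint, such as restricting patterns to a known set of conditions, could be handled even earlier by filtering the items before the search starts, whereas the anti-monotone check above prunes during candidate evaluation.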

Author information

Corresponding author

Correspondence to Rui Henriques.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Henriques, R., Madeira, S.C. (2015). Pattern-Based Biclustering with Constraints for Gene Expression Data Analysis. In: Pereira, F., Machado, P., Costa, E., Cardoso, A. (eds) Progress in Artificial Intelligence. EPIA 2015. Lecture Notes in Computer Science (LNAI), vol 9273. Springer, Cham. https://doi.org/10.1007/978-3-319-23485-4_34

  • DOI: https://doi.org/10.1007/978-3-319-23485-4_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23484-7

  • Online ISBN: 978-3-319-23485-4

  • eBook Packages: Computer Science, Computer Science (R0)
