
Theoretical backgrounds of Boolean reasoning-based binary n-clustering

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

Biclustering is a two-dimensional data analysis technique that searches for submatrices of a given data matrix. Its extension to three-dimensional data is called triclustering. This paper presents a new, generalized view of n-clustering for n-dimensional binary data. The search is performed within the Boolean reasoning paradigm: the original problem (the data) is encoded as a Boolean formula whose prime implicants correspond to the solutions of the original problem. The correctness of this approach (the n-clusters found contain only 0s or only 1s) and its maximality (an n-cluster cannot be expanded in any dimension without violating the correctness requirement) have strong mathematical foundations. The paper also shows the application of Boolean reasoning-based n-clustering to small three- and four-dimensional artificial data as well as to some biomedical data.
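The link between prime implicants and maximal n-clusters can be illustrated on a tiny example. The sketch below is not taken from the paper, whose body is not part of this preview; it assumes one plausible encoding in which every index of every dimension becomes a Boolean variable and every 0-cell contributes one clause, the disjunction of the variables of its coordinates. For such a negation-free conjunctive formula, the prime implicants are exactly the minimal sets of variables that intersect every clause, and discarding the indices named by a prime implicant leaves an inclusion-maximal n-cluster of 1s. The function name `maximal_one_clusters` and its parameters are hypothetical, and the search is plain brute force, suitable only for very small tensors.

```python
from itertools import combinations


def maximal_one_clusters(shape, zeros):
    """Enumerate inclusion-maximal all-ones n-clusters of a small binary tensor.

    shape -- tuple with the size of each dimension
    zeros -- set of index tuples whose cell holds 0 (every other cell holds 1)
    """
    # Assumed encoding: one Boolean variable per (dimension, index) pair.
    variables = [(d, i) for d, size in enumerate(shape) for i in range(size)]
    # One clause per 0-cell: the set of variables of that cell's coordinates.
    clauses = [frozenset(enumerate(cell)) for cell in zeros]

    # Prime implicants of a negation-free CNF are the minimal variable sets
    # that intersect every clause. Candidates are tried by increasing size,
    # so any non-minimal set already contains a stored implicant.
    implicants = []
    for k in range(len(variables) + 1):
        for cand in map(frozenset, combinations(variables, k)):
            if any(p <= cand for p in implicants):
                continue  # contains a smaller implicant, hence not prime
            if all(cand & clause for clause in clauses):
                implicants.append(cand)

    # The indices NOT named by a prime implicant form the n-cluster.
    clusters = []
    for imp in implicants:
        kept = tuple(
            tuple(i for i in range(size) if (d, i) not in imp)
            for d, size in enumerate(shape)
        )
        if all(kept):  # drop degenerate results with an empty dimension
            clusters.append(kept)
    return clusters


# A 2 x 2 x 2 tensor of 1s with a single 0 at position (0, 0, 0).
result = maximal_one_clusters((2, 2, 2), {(0, 0, 0)})
for cluster in result:
    print(cluster)
```

In this example the only clause has three variables, so there are three prime implicants, each removing one index in one dimension. Each resulting cluster is a 1 x 2 x 2 slab (in some orientation) that avoids the 0-cell and cannot be enlarged without including it, which matches the correctness and maximality properties stated in the abstract.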

Author information

Corresponding author

Correspondence to Marcin Michalak.

About this article

Cite this article

Michalak, M. Theoretical backgrounds of Boolean reasoning-based binary n-clustering. Knowl Inf Syst 64, 2171–2188 (2022). https://doi.org/10.1007/s10115-022-01708-2

  • DOI: https://doi.org/10.1007/s10115-022-01708-2
