Abstract
Boolean matrix factorization (BMF) is a well-established and widely used tool for preprocessing and analyzing Boolean (binary, yes-no) data. In many situations, the set of factors has already been computed, but the data change afterwards, e.g., new entries are added to the input data. Recomputing the factors from scratch after each small change in the data is inefficient. In this paper, we propose an incremental algorithm for (from-below) BMF which adjusts the already computed factorization according to the changes in the data. Moreover, we provide a comparison of the incremental and non-incremental algorithms on real-world data.
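To make the setting concrete, the following minimal sketch illustrates what "from-below" means and why a small change in the data can invalidate an existing factorization. It is not the incremental algorithm proposed in the paper. The toy matrices and the helper names (`boolean_product`, `is_from_below`, `uncovered`) are assumptions introduced here purely for illustration.

```python
import numpy as np

def boolean_product(A, B):
    """Boolean matrix product: entry (i, j) is 1 iff A[i, k] = B[k, j] = 1 for some k."""
    return (A.astype(int) @ B.astype(int)) > 0

def is_from_below(I, A, B):
    """A factorization is from-below if the product never has a 1 where I has a 0."""
    return not np.any(boolean_product(A, B) & ~I)

def uncovered(I, A, B):
    """Number of 1s in I that the product of A and B does not cover."""
    return int(np.sum(I & ~boolean_product(A, B)))

# Toy data (hypothetical): 3 objects x 3 attributes, exactly covered by 2 factors.
I = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=bool)
A = np.array([[1, 0],
              [1, 1],
              [0, 1]], dtype=bool)   # object-factor matrix
B = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=bool)  # factor-attribute matrix

# Change 1: a new 1 is added to the data. The old factors remain
# from-below, but one entry is no longer covered.
I_added = I.copy()
I_added[0, 2] = True

# Change 2: a 1 is removed from the data. The old factors now put a 1
# where the data has a 0, so the factorization is no longer from-below.
I_removed = I.copy()
I_removed[1, 1] = False

print(is_from_below(I, A, B), uncovered(I, A, B))
print(is_from_below(I_added, A, B), uncovered(I_added, A, B))
print(is_from_below(I_removed, A, B))
```

In both cases only a small part of the factorization is affected by the change, which is the situation in which adjusting the existing factors can be expected to be cheaper than recomputing them from scratch.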
Acknowledgment
The paper was supported by the grant JG 2020 of Palacký University Olomouc, No. JG_2020_003. Support by Grants No. IGA_PrF_2020_019 and No. IGA_PrF_2021_022 of the IGA of Palacký University is also acknowledged. The authors would like to thank Jan Outrata for providing an efficient implementation of the non-incremental algorithm.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Trnecka, M., Trneckova, M. (2021). An Incremental Recomputation of From-Below Boolean Matrix Factorization. In: Braud, A., Buzmakov, A., Hanika, T., Le Ber, F. (eds) Formal Concept Analysis. ICFCA 2021. Lecture Notes in Computer Science, vol 12733. Springer, Cham. https://doi.org/10.1007/978-3-030-77867-5_8
DOI: https://doi.org/10.1007/978-3-030-77867-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-77866-8
Online ISBN: 978-3-030-77867-5
eBook Packages: Computer Science (R0)