Abstract
Boolean matrix factorization (BMF) is a well-established data analysis method whose goal is to decompose a single large matrix into two, preferably smaller, matrices that carry the same or similar information as the original matrix. In essence, it can be used to reduce data dimensionality and to provide fundamental insight into data. Existing algorithms are often negatively affected by noise in the data, which is common in real-world datasets. We present an initial study of an algorithm for approximate BMF that uses association rules in a novel way to identify possible noise. This allows us to suppress the impact of noise and improve the quality of results. Moreover, we show that association rules provide a suitable framework for handling noise in BMF in a justified way.
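To make the terms in the abstract concrete, the following is a minimal sketch of the two basic notions involved: the Boolean product of two factor matrices, compared entry by entry with the input matrix, and the confidence of an association rule between attributes. It is not the algorithm of the paper. The function names, the toy matrices, and the reading of the third row as a possibly noisy entry are assumptions made purely for illustration.

```python
# Illustrative sketch only; names and data are hypothetical, not from the paper.

def boolean_product(A, B):
    """Boolean matrix product: (A o B)[i][j] = OR over l of (A[i][l] AND B[l][j])."""
    k = len(B)          # number of factors
    m = len(B[0])       # number of attributes
    return [[int(any(A[i][l] and B[l][j] for l in range(k)))
             for j in range(m)]
            for i in range(len(A))]

def reconstruction_error(I, A, B):
    """Number of entries in which the input I differs from A o B."""
    P = boolean_product(A, B)
    return sum(I[i][j] != P[i][j]
               for i in range(len(I))
               for j in range(len(I[0])))

def rule_confidence(I, antecedent, consequent):
    """Confidence of the rule antecedent => consequent over the rows of I.

    antecedent and consequent are lists of column indices.
    """
    covers = [row for row in I if all(row[j] for j in antecedent)]
    if not covers:
        return 0.0
    both = [row for row in covers if all(row[j] for j in consequent)]
    return len(both) / len(covers)

# Toy input: 5 objects x 4 attributes. Row 2 resembles rows 0 and 1
# except for one missing 1 (assumed here to be a candidate noisy entry).
I = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]

# Two hypothetical factors: object-factor matrix A, factor-attribute matrix B.
A = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]]
B = [[1, 1, 0, 0],
     [0, 0, 1, 1]]

error = reconstruction_error(I, A, B)
confidence = rule_confidence(I, [0], [1])
print(error, confidence)
```

In this toy case the two factors reproduce the input except for a single entry, and the rule from attribute 0 to attribute 1 holds in two of the three rows that contain attribute 0. A rule with high but imperfect confidence is the kind of signal that could mark the deviating entry as possible noise. How the paper actually uses such rules is not specified in the abstract.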
P. Krajča was supported by the grant JG 2019 of Palacký University Olomouc, No. JG_2019_008. Martin Trnecka was supported by the grant JG 2020 of Palacký University Olomouc, No. JG_2020_003. Support by Grant No. IGA_PrF_2020_019 of IGA of Palacký University is also acknowledged.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Krajča, P., Trnecka, M. (2021). Reducing Negative Impact of Noise in Boolean Matrix Factorization with Association Rules. In: Abreu, P.H., Rodrigues, P.P., Fernández, A., Gama, J. (eds) Advances in Intelligent Data Analysis XIX. IDA 2021. Lecture Notes in Computer Science, vol 12695. Springer, Cham. https://doi.org/10.1007/978-3-030-74251-5_29
DOI: https://doi.org/10.1007/978-3-030-74251-5_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-74250-8
Online ISBN: 978-3-030-74251-5
eBook Packages: Computer Science, Computer Science (R0)