DOI: 10.1145/3132847.3133054
short-paper

Boolean Matrix Decomposition by Formal Concept Sampling

Published: 06 November 2017

ABSTRACT

Finding interesting patterns is a classical problem in data mining. Boolean matrix decomposition is now a standard tool for finding a set of patterns (also called factors) in Boolean data that explain the data well. We describe and experimentally evaluate a probabilistic algorithm for the Boolean matrix decomposition problem. The algorithm is derived from the GreCon algorithm, which uses formal concepts (maximal rectangles, or tiles) as factors in order to find a decomposition. We change the core of GreCon by substituting a sampling procedure for the deterministic computation of suitable formal concepts. This alleviates the greedy nature of GreCon and makes it possible to bypass some of its pitfalls while preserving its features, e.g. the ability to explain the entire data.
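To make the idea concrete, the following is a minimal sketch of a greedy cover that draws candidate formal concepts at random instead of computing them deterministically. It is not the sampling procedure of the paper, which the abstract does not specify. The seeding scheme (close a random subset of the columns of a row that contains an uncovered 1) and the names `sample_concept`, `decompose`, `samples`, and `seed` are assumptions made for illustration, and the input is assumed to be a non-empty list of equal-length 0/1 rows.

```python
import random


def extent(matrix, cols):
    """Rows that have a 1 in every column of `cols`."""
    return frozenset(i for i, row in enumerate(matrix) if all(row[j] for j in cols))


def intent(matrix, rows):
    """Columns that have a 1 in every row of `rows`."""
    n_cols = len(matrix[0])
    return frozenset(j for j in range(n_cols) if all(matrix[i][j] for i in rows))


def sample_concept(matrix, cell, rng):
    """Draw a formal concept whose rectangle contains the 1-entry `cell`.

    Hypothetical sampling scheme: keep column j, add each other column of
    row i with probability 1/2, then close the column set.
    """
    i, j = cell
    others = [c for c in range(len(matrix[0])) if matrix[i][c] and c != j]
    cols = {j} | {c for c in others if rng.random() < 0.5}
    rows = extent(matrix, cols)          # always contains row i
    return rows, intent(matrix, rows)    # closure of cols, contains j


def decompose(matrix, samples=20, seed=0):
    """Cover all 1s of `matrix` with sampled formal concepts (from-below)."""
    rng = random.Random(seed)
    uncovered = {(i, j) for i, row in enumerate(matrix) for j, v in enumerate(row) if v}
    factors = []
    while uncovered:
        best, best_gain = None, 0
        for _ in range(samples):
            cell = rng.choice(sorted(uncovered))
            rows, cols = sample_concept(matrix, cell, rng)
            # Number of still-uncovered 1s that this concept would explain.
            gain = sum((r, c) in uncovered for r in rows for c in cols)
            if gain > best_gain:
                best, best_gain = (rows, cols), gain
        factors.append(best)
        uncovered -= {(r, c) for r in best[0] for c in best[1]}
    return factors


def boolean_product(factors, n_rows, n_cols):
    """Boolean product of the factor matrices, built rectangle by rectangle."""
    out = [[0] * n_cols for _ in range(n_rows)]
    for rows, cols in factors:
        for r in rows:
            for c in cols:
                out[r][c] = 1
    return out
```

Every sampled concept contains its seed entry, so each round covers at least one new 1 and the loop terminates. Because formal concepts are rectangles full of 1s, the Boolean product of the returned factors never exceeds the input and equals it once all entries are covered, which mirrors the "explain the entire data" property mentioned in the abstract.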


        • Published in

          CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
          November 2017
          2604 pages
          ISBN:9781450349185
          DOI:10.1145/3132847

          Copyright © 2017 ACM


          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          CIKM '17 paper acceptance rate: 171 of 855 submissions, 20%. Overall acceptance rate: 1,861 of 8,427 submissions, 22%.
