SATMargin: Practical Maximal Frequent Subgraph Mining via Margin Space Sampling

Published: 25 April 2022


Maximal Frequent Subgraph (MFS) mining asks for the maximal subgraphs that appear frequently across a set of graphs, and it has proven valuable in many applications in social science, biology, and other domains. Previous studies focused on reducing the search space of MFSs and discovered the theoretically smallest search space. Despite this theoretical success, no practical algorithm can search the space exhaustively, because it is huge even for small graphs with only tens of nodes and hundreds of edges. Moreover, deciding whether a subgraph is an MFS requires solving subgraph monomorphism (SM), an NP-complete problem that introduces further challenges. Here, we propose SATMargin, a practical MFS mining algorithm that targets large MFSs. SATMargin performs a random walk in the search space to search efficiently and uses a customized conflict-learning Boolean Satisfiability (SAT) algorithm to accelerate SM queries. We design a mechanism that reuses SAT solutions to combine the random walk and the SAT solver effectively. We evaluate SATMargin on synthetic graphs and 6 real-world graph datasets. SATMargin outperforms the baselines, finding more and larger MFSs. We further demonstrate the effectiveness of SATMargin in a case study of RNA graphs. The frequent subgraph identified by SATMargin closely matches the functional core structure of RNAs previously detected in biological experiments. Our software can be found at
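To make the subgraph monomorphism (SM) query and the notion of a frequent subgraph concrete, here is a minimal brute-force sketch. It is not the SATMargin algorithm: it enumerates injective node mappings instead of using a SAT solver, and it assumes undirected, unlabeled graphs without self-loops. The names `is_monomorphic`, `is_frequent`, and `min_support` are hypothetical, chosen for illustration, and the support threshold is an assumed way of defining "frequent".

```python
from itertools import permutations


def is_monomorphic(pattern_nodes, pattern_edges, graph_nodes, graph_edges):
    """Return True if the pattern can be mapped injectively into the graph
    so that every pattern edge lands on a graph edge.

    Assumption: undirected, unlabeled graphs without self-loops.
    """
    graph_edge_set = {frozenset(edge) for edge in graph_edges}
    pattern_nodes = list(pattern_nodes)
    # Try every injective assignment of pattern nodes to graph nodes.
    # This is exponential in the worst case, which is why a practical
    # system needs something smarter than exhaustive enumeration.
    for image in permutations(graph_nodes, len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, image))
        if all(frozenset((mapping[u], mapping[v])) in graph_edge_set
               for u, v in pattern_edges):
            return True
    return False


def is_frequent(pattern, graphs, min_support):
    """Return True if the pattern occurs in at least `min_support` graphs.

    `pattern` and each entry of `graphs` are (nodes, edges) pairs.
    `min_support` is an assumed threshold, not a value from the paper.
    """
    nodes, edges = pattern
    support = sum(
        is_monomorphic(nodes, edges, graph_nodes, graph_edges)
        for graph_nodes, graph_edges in graphs
    )
    return support >= min_support


# Three small example graphs (made up for illustration).
triangle_with_tail = ([0, 1, 2, 3], [(0, 1), (1, 2), (0, 2), (2, 3)])
square = ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)])
complete_four = ([0, 1, 2, 3],
                 [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
graphs = [triangle_with_tail, square, complete_four]

triangle = (["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
path = (["a", "b", "c"], [("a", "b"), ("b", "c")])
```

With these inputs, the three-node path occurs in all three graphs, while the triangle occurs in only two of them (the square contains no triangle). Under this simplified definition, a frequent pattern would be maximal if no frequent pattern strictly contains it.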


      WWW '22: Proceedings of the ACM Web Conference 2022
      April 2022
      3764 pages


      Author Tags

      1. Boolean Satisfiability
      2. Maximal Frequent Subgraph Mining


      WWW '22: The ACM Web Conference 2022
      April 25 - 29, 2022
      Virtual Event, Lyon, France

