Decomposition Based SAT Encodings for Itemset Mining Problems

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9078)

Abstract

Recently, several encodings based on constraint programming (CP) and propositional satisfiability (SAT) have been proposed for various data mining problems, including itemset and sequence mining. This line of research makes it possible to model data mining problems declaratively while exploiting efficient, generic solving techniques. In practice, however, large datasets usually lead to constraint networks or Boolean formulas of huge size. Space complexity is clearly identified as the main bottleneck limiting the competitiveness of these new declarative and flexible models with respect to specialized data mining approaches. In this paper, we address this issue for SAT-based encodings of itemset mining problems: we propose a new encoding framework based on partitioning the transaction database. Experimental results on several well-known datasets show significant improvements, up to several orders of magnitude.
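
To make the size issue concrete, here is a minimal sketch of a commonly used cover encoding for itemset mining, not the decomposition-based encoding proposed in the paper. It assumes one Boolean variable per item (`p`) and one per transaction (`q`), with `q_i` true exactly when no selected item is missing from transaction `i`. The function names, the toy database, and the `min_support` value are all hypothetical, and the frequency constraint is checked by counting rather than encoded in CNF.

```python
from itertools import combinations


def encode_cover(transactions, items):
    """Build CNF clauses for q_i <-> AND(not p_a for each item a not in T_i).

    Variables are DIMACS-style positive integers: p[a] is true when item a
    belongs to the candidate itemset, q[i] is true when transaction i
    contains the whole candidate itemset.
    """
    p = {a: k + 1 for k, a in enumerate(items)}
    q = {i: len(items) + i + 1 for i in range(len(transactions))}
    clauses = []
    for i, t in enumerate(transactions):
        missing = [a for a in items if a not in t]
        # q_i implies that no item missing from T_i is selected.
        for a in missing:
            clauses.append([-q[i], -p[a]])
        # If no missing item is selected, q_i must be true.
        clauses.append([q[i]] + [p[a] for a in missing])
    return p, q, clauses


def model_for(itemset, transactions, p, q):
    """Set of true variables induced by a candidate itemset."""
    true_vars = {p[a] for a in itemset}
    true_vars |= {q[i] for i, t in enumerate(transactions) if itemset <= t}
    return true_vars


def satisfies(clauses, true_vars):
    """Check that every clause contains at least one true literal."""
    return all(
        any((lit > 0 and lit in true_vars) or (lit < 0 and -lit not in true_vars)
            for lit in clause)
        for clause in clauses
    )


# Toy transaction database (hypothetical data, for illustration only).
items = ["a", "b", "c"]
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
min_support = 2  # hypothetical frequency threshold

p, q, clauses = encode_cover(transactions, items)

frequent = []
for r in range(1, len(items) + 1):
    for combo in combinations(items, r):
        itemset = set(combo)
        model = model_for(itemset, transactions, p, q)
        # Every itemset, with its induced cover, is a model of the clauses.
        assert satisfies(clauses, model)
        # Support is the number of transaction variables set to true.
        support = sum(1 for i in q if q[i] in model)
        if support >= min_support:
            frequent.append(frozenset(itemset))

print(len(clauses), "clauses;", len(frequent), "frequent itemsets")
```

For this toy database the script reports 8 clauses and 5 frequent itemsets. Each transaction contributes one clause per item it lacks plus one more, so the formula grows with the product of the number of transactions and the number of items, which illustrates the kind of blow-up the abstract refers to on large datasets.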

Author information

Corresponding author

Correspondence to Lakhdar Sais.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Jabbour, S., Sais, L., Salhi, Y. (2015). Decomposition Based SAT Encodings for Itemset Mining Problems. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_52

  • DOI: https://doi.org/10.1007/978-3-319-18032-8_52

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18031-1

  • Online ISBN: 978-3-319-18032-8

  • eBook Packages: Computer Science, Computer Science (R0)
