Abstract
Enumerating interesting patterns from data is an important data mining task. Among the possible relevant patterns, maximal frequent patterns are a well-known condensed representation that limits, at least to some extent, the size of the output. Recently, a new declarative mining framework based on constraint programming (CP) and propositional satisfiability (SAT) has been designed to deal with several pattern mining tasks. For instance, the itemset mining problem has been modeled as a constraint network or propositional formula whose models correspond to the patterns to be mined. In this framework, closedness, maximality and frequency properties can be handled by additional constraints or formulas. In this paper, we propose a new propositional satisfiability based approach for mining maximal frequent itemsets that extends the one proposed in [13]. We show that, instead of adding constraints to the initial SAT-based itemset mining encoding, the maximal itemsets can be obtained by performing clause learning during search. Our approach leads to a more compact encoding. Experimental results on several datasets show the feasibility of our approach.
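The idea of obtaining maximal itemsets through clauses added during search, rather than through extra constraints in the encoding, can be illustrated with a small sketch. This is not the encoding or the algorithm of the paper: a brute-force candidate enumeration stands in for the SAT solver, and the function names (support, extend_to_maximal, maximal_frequent_itemsets) are hypothetical. Each time a frequent itemset is found, it is extended to a maximal one M, and a clause requiring at least one item outside M is recorded, so that no subset of M is reported again.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def extend_to_maximal(itemset, items, transactions, min_support):
    """Add items one by one while the itemset stays frequent.

    One pass is enough: an item rejected once stays rejected, because
    support can only decrease as the itemset grows.
    """
    current = set(itemset)
    for item in sorted(items - current):
        if support(current | {item}, transactions) >= min_support:
            current.add(item)
    return frozenset(current)


def maximal_frequent_itemsets(transactions, min_support):
    """Enumerate maximal frequent non-empty itemsets (illustrative only)."""
    transactions = [frozenset(t) for t in transactions]
    items = set().union(*transactions)
    # Each learned clause is a set of items; a candidate satisfies it
    # when it contains at least one of them.
    blocking_clauses = []
    maximal = []
    # Assumption: plain enumeration replaces the solver's search here.
    for size in range(1, len(items) + 1):
        for candidate in combinations(sorted(items), size):
            candidate = frozenset(candidate)
            if any(not (clause & candidate) for clause in blocking_clauses):
                continue  # candidate is a subset of a known maximal itemset
            if support(candidate, transactions) < min_support:
                continue
            found = extend_to_maximal(candidate, items, transactions, min_support)
            maximal.append(found)
            # Forbid every subset of the maximal itemset just found.
            blocking_clauses.append(frozenset(items - found))
    return maximal


if __name__ == "__main__":
    data = ["abc", "ab", "ac", "bc", "abc"]
    for itemset in maximal_frequent_itemsets(data, min_support=3):
        print(sorted(itemset))
```

In a solver-based setting the recorded clauses would be handed to the solver's clause database instead of being checked in a Python loop; the sketch only shows why one clause per maximal itemset is enough to avoid reporting its subsets.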
Notes
- 1. FIMI: http://fimi.ua.ac.be/data/.
References
Agarwal, R.C., Aggarwal, C.C., Prasad, V.V.V.: Depth first generation of long patterns. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2000, pp. 108–118 (2000)
Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, SIGMOD 1993, pp. 207–216. ACM, New York (1993)
Biere, A., Heule, M.J.H., van Maaren, H., Walsh, T. (eds.): Handbook of Satisfiability. Frontiers in AI and Applications, vol. 185. IOS Press, Amsterdam (2009)
Burdick, D., Calimlim, M., Gehrke, J.: MAFIA: a maximal frequent itemset algorithm for transactional databases. In: ICDE, pp. 443–452 (2001)
Coquery, E., Jabbour, S., Saïs, L., Salhi, Y.: A SAT-based approach for discovering frequent, closed and maximal patterns in a sequence. In: Proceedings of the 20th European Conference on Artificial Intelligence (ECAI 2012), pp. 258–263 (2012)
Davis, M., Logemann, G., Loveland, D.W.: A machine program for theorem-proving. Commun. ACM 5(7), 394–397 (1962)
Gebser, M., Guyet, T., Quiniou, R., Romero, J., Schaub, T.: Knowledge-based sequence mining with ASP. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9–15 July 2016 (2016)
Gouda, K., Zaki, M.J.: Efficiently mining maximal frequent itemsets. In: Proceedings of the 2001 IEEE International Conference on Data Mining, San Jose, California, USA, 29 November–2 December 2001, pp. 163–170 (2001)
Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. SIGMOD Rec. 29, 1–12 (2000)
Jabbour, S., Lonlac, J., Sais, L., Salhi, Y.: Extending modern SAT solvers for models enumeration. In: Proceedings of the 15th IEEE International Conference on Information Reuse and Integration, IRI 2014, Redwood City, CA, USA, 13–15 August 2014, pp. 803–810 (2014)
Jabbour, S., Sais, L., Salhi, Y.: Boolean satisfiability for sequence mining. In: 22nd ACM International Conference on Information and Knowledge Management (CIKM 2013), pp. 649–658. ACM (2013)
Jabbour, S., Sais, L., Salhi, Y.: A pigeon-hole based encoding of cardinality constraints. TPLP 13(4–5-Online-Suppl.) (2013)
Jabbour, S., Sais, L., Salhi, Y.: The top-k frequent closed itemset mining using top-k SAT problem. In: European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD 2013), pp. 403–418 (2013)
Jabbour, S., Sais, L., Salhi, Y.: On SAT models enumeration in itemset mining. CoRR abs/1506.02561 (2015)
Bayardo Jr., R.J.: Efficiently mining long patterns from databases. In: SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, Seattle, Washington, USA, 2–4 June 1998, pp. 85–93 (1998)
Lin, D.I., Kedem, Z.M.: Pincer-search: a new algorithm for discovering the maximum frequent set, pp. 103–119 (1998)
Marques-Silva, J.P., Sakallah, K.A.: GRASP - a new search algorithm for satisfiability. In: Proceedings of IEEE/ACM CAD, pp. 220–227 (1996)
Pei, J., Han, J., Lu, H., Nishio, S., Tang, S., Yang, D.: H-mine: hyper-structure mining of frequent patterns in large databases. In: Proceedings IEEE International Conference on Data Mining, ICDM 2001, pp. 441–448 (2001)
Raedt, L.D., Guns, T., Nijssen, S.: Constraint programming for itemset mining. In: ACM SIGKDD, pp. 204–212 (2008)
Raedt, L.D., Guns, T., Nijssen, S.: Constraint programming for itemset mining. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, 24–27 August 2008, pp. 204–212 (2008)
Tiwari, A., Gupta, R., Agrawal, D.: A survey on frequent pattern mining: current status and challenging issues. Inf. Technol. J. 9, 1278–1293 (2010)
Uno, T., Kiyomi, M., Arimura, H.: LCM ver. 2: efficient mining algorithms for frequent/closed/maximal itemsets. In: FIMI 2004, Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations, Brighton, UK, 1 November 2004 (2004)
Zhang, L., Madigan, C.F., Moskewicz, M.W., Malik, S.: Efficient conflict driven learning in Boolean satisfiability solver. In: IEEE/ACM CAD 2001, pp. 279–285 (2001)
Zou, Q., Chu, W.W., Lu, B.: SmartMiner: a depth first algorithm guided by tail information for mining maximal frequent itemsets. In: Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), Maebashi City, Japan, 9–12 December 2002, pp. 570–577 (2002)
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Jabbour, S., Mana, F.Z., Sais, L. (2018). On Maximal Frequent Itemsets Enumeration. In: Abraham, A., Haqiq, A., Muda, A., Gandhi, N. (eds) Proceedings of the Ninth International Conference on Soft Computing and Pattern Recognition (SoCPaR 2017). SoCPaR 2017. Advances in Intelligent Systems and Computing, vol 737. Springer, Cham. https://doi.org/10.1007/978-3-319-76357-6_15
DOI: https://doi.org/10.1007/978-3-319-76357-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76356-9
Online ISBN: 978-3-319-76357-6
eBook Packages: Engineering, Engineering (R0)