Integer Linear Programming for Pattern Set Mining; with an Application to Tiling

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2017)

Abstract

Pattern set mining is an important part of a number of data mining tasks such as classification, clustering, database tiling, or pattern summarization. Efficiently mining pattern sets is a highly challenging task and most approaches use heuristic strategies. In this paper, we formulate the pattern set mining problem as an optimization task, ensuring that the produced solution is the best one from the entire search space. We propose a method based on integer linear programming (ILP) that is exhaustive, declarative and optimal. ILP solvers can exploit different constraint types to restrict the search space, and can use any pattern set measure (or combination thereof) as an objective function, allowing the user to focus on the optimal result. We illustrate and show the efficiency of our method by applying it to the tiling problem.
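To make the idea of tiling as an optimization task concrete, the following is a minimal sketch of how a maximum k-tiling could be written as an integer linear program. It is a generic maximum-coverage encoding and not necessarily the model used in the paper. All symbols are assumptions introduced for illustration: $D$ is the binary database, $P$ is a set of candidate tiles (for instance, closed itemsets together with their covers), $x_p$ indicates whether tile $p$ is selected, $y_{ti}$ indicates whether the cell in transaction $t$ and item $i$ is covered, and $k$ is the number of tiles to select.

```latex
\begin{align*}
\text{maximize}\quad
  & \sum_{(t,i)\,:\,D_{ti}=1} y_{ti}
  && \text{(area covered by the tiling)} \\
\text{subject to}\quad
  & y_{ti} \le \sum_{p \in P \,:\, (t,i) \in p} x_p
  && \forall (t,i) \text{ with } D_{ti}=1 \\
  & \sum_{p \in P} x_p = k \\
  & x_p \in \{0,1\} \quad \forall p \in P, \qquad
    y_{ti} \in \{0,1\} \quad \forall (t,i) \text{ with } D_{ti}=1
\end{align*}
```

Under these assumptions, the first constraint lets a cell count toward the objective only if at least one selected tile contains it, so cells shared by overlapping tiles are counted once. Further constraints on the pattern set, such as bounds on overlap, could be added as extra linear inequalities over the same variables.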

Notes

  1. We use the implementation available at https://people.mmci.uni-saarland.de/~jilles/prj/tiling/.

  2. The k-LTM implementation is Windows-only, and run times therefore only roughly indicate its behavior.

Author information

Corresponding author

Correspondence to Abdelkader Ouali.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Ouali, A. et al. (2017). Integer Linear Programming for Pattern Set Mining; with an Application to Tiling. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10235. Springer, Cham. https://doi.org/10.1007/978-3-319-57529-2_23

  • DOI: https://doi.org/10.1007/978-3-319-57529-2_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57528-5

  • Online ISBN: 978-3-319-57529-2

  • eBook Packages: Computer Science, Computer Science (R0)
