A Relational Query Primitive for Constraint-Based Pattern Mining

  • Conference paper
Constraint-Based Mining and Inductive Databases

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3848)

Abstract

As a step towards the design of an Inductive Database System, in this paper we present a primitive for constraint-based frequent pattern mining, which represents a careful trade-off between expressiveness and efficiency. The primitive is a simple mechanism that takes a relational table as input and extracts from it all frequent patterns that satisfy a given set of user-defined constraints. Despite its simplicity, the proposed primitive is expressive enough to deal with a broad range of interesting constraint-based frequent pattern queries, using a comprehensive repertoire of constraints defined over SQL aggregates. Thanks to its simplicity, the proposed primitive can be smoothly embedded in a variety of data mining query languages and executed efficiently by state-of-the-art optimization techniques that push the various forms of constraints by means of data reduction.
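To make the kind of query described above concrete, the following is a minimal Python sketch of a constraint-based frequent pattern query over a small relational table. It is an illustration only, not the primitive or the optimizations presented in the paper: it enumerates candidates by brute force and checks the constraint afterwards, with no constraint pushing or data reduction. The table contents, the column layout (transaction id, item, price), the function name `frequent_patterns`, and the constraint `SUM(price) >= 6` are all assumptions introduced for the example.

```python
from itertools import combinations

# Hypothetical relational table: one (transaction_id, item, price) row per purchase.
ROWS = [
    (1, "beer", 4), (1, "chips", 2), (1, "wine", 9),
    (2, "beer", 4), (2, "chips", 2),
    (3, "beer", 4), (3, "wine", 9),
    (4, "chips", 2), (4, "wine", 9),
]


def frequent_patterns(rows, min_support, constraint):
    """Naively return every itemset that is frequent and satisfies `constraint`.

    `constraint` receives the list of prices of the items in a candidate
    itemset, so it can express conditions over aggregates such as SUM or MIN.
    """
    transactions = {}  # transaction_id -> set of items
    price = {}         # item -> price
    for tid, item, item_price in rows:
        transactions.setdefault(tid, set()).add(item)
        price[item] = item_price

    items = sorted(price)
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support = number of transactions containing every item of the candidate.
            support = sum(1 for t in transactions.values() if set(candidate) <= t)
            if support >= min_support and constraint([price[i] for i in candidate]):
                result[candidate] = support
    return result


# Frequent itemsets (support >= 2) whose total price is at least 6,
# i.e. a constraint in the spirit of SUM(price) >= 6.
patterns = frequent_patterns(ROWS, min_support=2,
                             constraint=lambda prices: sum(prices) >= 6)
```

On this toy table the result keeps `("wine",)` and the three two-item sets, and discards the single items `beer` and `chips` because their total price is below 6. A realistic implementation would exploit the constraint during the search rather than filtering at the end.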

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bonchi, F., Giannotti, F., Pedreschi, D. (2006). A Relational Query Primitive for Constraint-Based Pattern Mining. In: Boulicaut, JF., De Raedt, L., Mannila, H. (eds) Constraint-Based Mining and Inductive Databases. Lecture Notes in Computer Science, vol 3848. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11615576_2

  • DOI: https://doi.org/10.1007/11615576_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31331-1

  • Online ISBN: 978-3-540-31351-9

  • eBook Packages: Computer Science, Computer Science (R0)
