Abstract
As a step towards the design of an Inductive Database System, this paper presents a primitive for constraint-based frequent pattern mining that represents a careful trade-off between expressiveness and efficiency. The primitive is a simple mechanism that takes a relational table as input and extracts from it all frequent patterns that satisfy a given set of user-defined constraints. Despite its simplicity, the primitive is expressive enough to handle a broad range of interesting constraint-based frequent pattern queries, using a comprehensive repertoire of constraints defined over SQL aggregates. Thanks to the same simplicity, it can be smoothly embedded in a variety of data mining query languages and executed efficiently by state-of-the-art optimization techniques, which push the various forms of constraints into the computation by means of data reduction.
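To make the kind of query concrete, the following is a minimal sketch of what such a primitive computes. It is not the authors' implementation and uses none of the data-reduction optimizations mentioned above: it simply enumerates every itemset by brute force. The table layout `(transaction_id, item, price)`, the function name `mine`, the sample rows, and the assumption that each item has a single price are all hypothetical and chosen for illustration only.

```python
from itertools import combinations

# Hypothetical relational table: one row per (transaction_id, item, price).
# Assumption: the price depends only on the item.
ROWS = [
    (1, "beer", 10), (1, "chips", 3), (1, "wine", 20),
    (2, "beer", 10), (2, "chips", 3),
    (3, "beer", 10), (3, "wine", 20),
    (4, "chips", 3), (4, "wine", 20),
]


def mine(rows, min_support, constraint):
    """Return {itemset: support} for every itemset that is frequent
    (support >= min_support) and whose item prices satisfy `constraint`.

    Brute-force enumeration, exponential in the number of items; shown
    only to illustrate the semantics of the query, not how to run it fast.
    """
    transactions = {}
    price = {}
    for tid, item, item_price in rows:
        transactions.setdefault(tid, set()).add(item)
        price[item] = item_price

    items = sorted(price)
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support: number of transactions containing every item of the candidate.
            support = sum(
                1 for basket in transactions.values() if basket.issuperset(candidate)
            )
            if support >= min_support and constraint([price[i] for i in candidate]):
                result[candidate] = support
    return result


# Frequent itemsets (support >= 2) whose total price is at least 25,
# i.e. a constraint over the SQL-style aggregate SUM(price).
patterns = mine(ROWS, min_support=2, constraint=lambda prices: sum(prices) >= 25)
print(patterns)
```

On this sample table only the itemset `("beer", "wine")` is both frequent and expensive enough. The constraint is passed as a function over the list of prices, so other aggregates (`min`, `max`, `len`, an average) can be swapped in without changing the enumeration.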
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bonchi, F., Giannotti, F., Pedreschi, D. (2006). A Relational Query Primitive for Constraint-Based Pattern Mining. In: Boulicaut, JF., De Raedt, L., Mannila, H. (eds) Constraint-Based Mining and Inductive Databases. Lecture Notes in Computer Science, vol 3848. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11615576_2
DOI: https://doi.org/10.1007/11615576_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31331-1
Online ISBN: 978-3-540-31351-9
eBook Packages: Computer Science, Computer Science (R0)