Abstract
We present a framework for mining sequential temporal patterns from multi-relational databases. To exploit logic-relational information without resorting to aggregation, we convert the multi-relational dataset into what we call a multi-sequence database. Each example in the multi-relational target table is encoded as a sequence that combines intra-table and inter-table relational temporal information, which allows standard sequence miners to find heterogeneous temporal patterns. The framework is motivated by the strong results achieved by previous propositionalization strategies. We follow a pipelined approach. First, a sequence miner finds the frequent sequences in the multi-sequence database. Next, we select the most interesting patterns, namely those that are discriminative and class correlated, and use them to augment the representational space of the examples. In the final step, the enlarged target table is given as input to a classification algorithm, which builds the classifier model. We evaluate the approach on a motivating application, the hepatitis multi-relational dataset, and demonstrate the effectiveness of the methodology by addressing two problems defined on that dataset.
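The pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy tables, the token format, the function names, the brute-force enumeration that stands in for a real sequence miner, and the class-rate gap used as the interestingness filter. It is not the authors' implementation, and the final classification step is left out.

```python
from itertools import combinations

# Hypothetical toy data: two relational tables keyed by patient id.
# Each row is (patient_id, time, value).
exams = [
    (1, 1, "GOT=high"), (1, 3, "GPT=high"),
    (2, 1, "GOT=high"), (2, 4, "GPT=high"),
    (3, 1, "GOT=normal"), (3, 2, "GPT=normal"),
    (4, 2, "GOT=normal"),
]
treatments = [(1, 2, "interferon"), (2, 3, "interferon"), (4, 1, "interferon")]
labels = {1: "B", 2: "B", 3: "C", 4: "C"}  # target table: patient -> class


def build_multi_sequences(tables):
    """Merge the rows of every table into one time-ordered sequence per example."""
    events = {}
    for name, rows in tables.items():
        for pid, time, value in rows:
            events.setdefault(pid, []).append((time, f"{name}:{value}"))
    return {pid: [tok for _, tok in sorted(evs)] for pid, evs in events.items()}


def is_subsequence(pattern, sequence):
    """True if the pattern tokens occur in the sequence in order, gaps allowed."""
    it = iter(sequence)
    return all(tok in it for tok in pattern)


def frequent_patterns(sequences, min_support, max_len=2):
    """Brute-force stand-in for a sequence miner; returns pattern -> support."""
    candidates = set()
    for seq in sequences.values():
        for n in range(1, max_len + 1):
            candidates.update(combinations(seq, n))
    supports = {}
    for pat in candidates:
        support = sum(is_subsequence(pat, s) for s in sequences.values())
        if support >= min_support:
            supports[pat] = support
    return supports


def discriminative(patterns, sequences, labels, positive, min_gap):
    """Keep patterns whose occurrence rate differs between classes by min_gap."""
    pos = [p for p, c in labels.items() if c == positive]
    neg = [p for p, c in labels.items() if c != positive]
    kept = []
    for pat in patterns:
        rate_pos = sum(is_subsequence(pat, sequences.get(p, [])) for p in pos) / len(pos)
        rate_neg = sum(is_subsequence(pat, sequences.get(p, [])) for p in neg) / len(neg)
        if abs(rate_pos - rate_neg) >= min_gap:
            kept.append(pat)
    return sorted(kept)


def augment(sequences, labels, patterns):
    """Enlarged target table: one 0/1 feature per selected pattern, then the class."""
    return {
        pid: [int(is_subsequence(pat, sequences.get(pid, []))) for pat in patterns]
        + [labels[pid]]
        for pid in labels
    }


sequences = build_multi_sequences({"exam": exams, "treat": treatments})
patterns = frequent_patterns(sequences, min_support=2)
selected = discriminative(patterns, sequences, labels, positive="B", min_gap=1.0)
table = augment(sequences, labels, selected)
```

The rows of `table` could then be passed to any propositional classifier. A pattern such as `("exam:GOT=high", "treat:interferon")` is heterogeneous in the sense that it mixes events drawn from two different tables.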
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ferreira, C.A., Gama, J., Costa, V.S. (2010). Sequential Pattern Mining in Multi-relational Datasets. In: Meseguer, P., Mandow, L., Gasca, R.M. (eds) Current Topics in Artificial Intelligence. CAEPIA 2009. Lecture Notes in Computer Science, vol 5988. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14264-2_13
DOI: https://doi.org/10.1007/978-3-642-14264-2_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14263-5
Online ISBN: 978-3-642-14264-2
eBook Packages: Computer Science, Computer Science (R0)