Itemset-Based Variable Construction in Multi-relational Supervised Learning

  • Conference paper
Inductive Logic Programming (ILP 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7842)

Abstract

In multi-relational data mining, data are represented in relational form: individuals in the target table may be related to several records in secondary tables through one-to-many relationships. In this paper, we introduce an itemset-based framework for constructing variables from secondary tables and evaluating their conditional information for supervised classification. We define a space of itemset-based models in the secondary table, together with conditional density estimation of the related constructed variables. A prior distribution is defined on this model space, resulting in a parameter-free criterion for assessing the relevance of the constructed variables. A greedy algorithm is then proposed to explore the space of considered itemsets. Experiments on multi-relational datasets confirm the advantage of the approach.
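To make the idea of a constructed variable concrete, here is a minimal sketch. It assumes, for illustration only, that an itemset is a conjunction of conditions on secondary-table variables and that the constructed variable is the number of secondary records per individual that satisfy it. The tables, column names, and the helpers `covers` and `construct_variable` are hypothetical. The paper's Bayesian evaluation criterion and greedy search are not reproduced here.

```python
# Hypothetical toy data: a target table of customers with class labels,
# and a secondary table of orders linked one-to-many by "customer".
target = {"c1": "good", "c2": "bad", "c3": "good"}
orders = [
    {"customer": "c1", "category": "book", "amount": 12.0},
    {"customer": "c1", "category": "book", "amount": 55.0},
    {"customer": "c1", "category": "toy", "amount": 80.0},
    {"customer": "c2", "category": "toy", "amount": 5.0},
    {"customer": "c3", "category": "book", "amount": 70.0},
]


def covers(itemset, record):
    """Return True if the record satisfies every condition in the itemset."""
    return all(cond(record[var]) for var, cond in itemset.items())


def construct_variable(itemset, target, secondary, key):
    """Count, for each individual, the secondary records covered by the itemset.

    Assumption: the count is used as the constructed variable; individuals
    with no matching records get 0.
    """
    counts = {individual: 0 for individual in target}
    for record in secondary:
        if record[key] in counts and covers(itemset, record):
            counts[record[key]] += 1
    return counts


# An itemset as a conjunction of one categorical and one numerical condition.
itemset = {
    "category": lambda v: v == "book",
    "amount": lambda v: v >= 50.0,
}
feature = construct_variable(itemset, target, orders, key="customer")
print(feature)
```

The resulting per-individual counts form a single column in the target table, which is the kind of variable whose relevance to the class label a criterion such as the one described in the abstract would then assess.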

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lahbib, D., Boullé, M., Laurent, D. (2013). Itemset-Based Variable Construction in Multi-relational Supervised Learning. In: Riguzzi, F., Železný, F. (eds) Inductive Logic Programming. ILP 2012. Lecture Notes in Computer Science, vol 7842. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38812-5_10

  • DOI: https://doi.org/10.1007/978-3-642-38812-5_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38811-8

  • Online ISBN: 978-3-642-38812-5

  • eBook Packages: Computer Science, Computer Science (R0)
