Skip to main content
Log in

Logical scalability and efficiency of relational learning algorithms

  • Regular Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

Relational learning algorithms learn the definition of a new relation in terms of existing relations in the database. The same database may be represented under different schemas for various reasons, such as efficiency, data quality, and usability. Unfortunately, the output of current relational learning algorithms tends to vary quite substantially over the choice of schema, both in terms of learning accuracy and efficiency. We introduce the property of schema independence of relational learning algorithms, and study both the theoretical and empirical dependence of existing algorithms on the common class of (de) composition schema transformations. We show theoretically and empirically that current relational learning algorithms are generally not schema independent. We propose Castor, a relational learning algorithm that achieves schema independence.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2

Similar content being viewed by others

References

  1. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases: The Logical Level. Addison-Wesley, Boston (1994)

    Google Scholar 

  2. Abouzied, A., Angluin, D., Papadimitriou, C., Hellerstein, J., Silberschatz, A.: Learning and verifying quantified boolean queries by example. In: PODS, pp. 49–60 (2013)

  3. Angluin, D., Frazier, M., Pitt, L.: Learning conjunctions of Horn clauses. Mach. Learn. 9(2–3), 147–164 (1992)

    MATH  Google Scholar 

  4. Arias, M., Khardon, R., Maloberti, J.: Learning Horn expressions with LOGAN-H. J. Mach. Learn. Res. 8, 549–587 (2007)

    MATH  Google Scholar 

  5. Atzeni, P., Ausiello, G., Batini, C., Moscarini, M.: Inclusion and Equivalent Between Relational Database Schemata. TCS, Mumbai (1982)

    MATH  Google Scholar 

  6. Cate, B.T., Dalmau, V., Kolaitis, P.G.: Learning schema mappings. TODS 38(4), 28:1–28:31 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  7. Chen, Y., Goldberg, S., Wang, D.Z., Johri, S.S.: Ontological pathfinding: mining first-order knowledge from large knowledge bases. In: SIGMOD, pp. 835–846 (2016)

  8. Costa, V.S., Srinivasan, A., Camacho, R., Blockeel, H., Demoen, B., Janssens, G., Struyf, J., Vandecasteele, H., Laer, W.V.: Query transformations for improving the efficiency of ILP systems. J. Mach. Learn. Res. 4, 465–491 (2003)

    MathSciNet  MATH  Google Scholar 

  9. Davis, J., Burnside, E.S., de Castro Dutra, I., Page, D., Ramakrishnan, R., Costa, V.S., Shavlik, J.W.: View learning for statistical relational learning: with an application to mammography. In: IJCAI, pp. 677–683 (2005)

  10. De Raedt, L.: Logical and Relational Learning, 1st edn. Springer, Berlin (2010)

    MATH  Google Scholar 

  11. Fagin, R.: Inverting schema mappings. TODS 32(4), 25 (2007)

    Article  MATH  Google Scholar 

  12. Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: semantics and query answering. In: Calvanese, D., Lenzerini, M., Motwani, R. (eds.) ICDT, pp. 207–224. Springer, Berlin, Heidelberg (2003)

    Google Scholar 

  13. Fan, W., Bohannon, P.: Information preserving xml schema embedding. TODS 33(1), 1–44 (2008)

    Article  Google Scholar 

  14. Galárraga, L., Teflioudi, C., Hose, K., Suchanek, F.: Fast rule mining in ontological knowledge bases with AMIE+. VLDB J. 24(6), 707–730 (2015)

    Article  Google Scholar 

  15. Getoor, L., Machanavajjhala, A.: Entity resolution in big data. In: KDD, p. 1527 (2013)

  16. Getoor, L., Taskar, B.: Introduction to Statistical Relational Learning. MIT Press, Cambridge (2007)

    Book  MATH  Google Scholar 

  17. Hull, R.: Relative information capacity of simple relational database schemata. In: PODS, pp. 97–109 (1984)

  18. Khardon, R.: Learning function-free Horn expressions. Mach. Learn. 37(3), 241–275 (1999)

    Article  MATH  Google Scholar 

  19. Kuželka, O., Železný, F.: A restarted strategy for efficient subsumption testing. Fundam. Inf. 89(1), 95–109 (2009)

    MATH  Google Scholar 

  20. Li, H., Chan, C.-Y., Maier, D.: Query from examples: an iterative, data-driven approach to query construction. PVLDB 8(13), 2158–2163 (2015)

    Google Scholar 

  21. Muggleton, S.: Inverse Entailment and Progol. New Gener. Comput. Special Issue Inductive Log. Program. 13, 245–286 (1995)

    Article  Google Scholar 

  22. Muggleton, S., Feng, C.: Efficient induction of logic programs. In: ALT. Springer/Ohmsha, pp. 368–381 (1990)

  23. Muggleton, S., Santos, J.C.A., Tamaddoni-Nezhad, A.: Progolem: a system based on relative minimal generalisation. In: De Raedt, L. (ed.) ILP, vol. 5989. Springer, Berlin, Heidelberg (2009)

    Google Scholar 

  24. Picado, J., Termehchy, A., Fern, A., Ataei, P.: Schema independent relational learning. arXiv:1508.03846 (2015)

  25. Picado, J., Termehchy, A., Fern, A., Ataei, P.: Schema independent relational learning. In: SIGMOD, pp. 929–944 (2017)

  26. Quinlan, J.R.: Learning logical definitions from relations. Mach. Learn. 5, 239–266 (1990)

    Google Scholar 

  27. Reddy, C., Tadepalli, P.: Learning Horn definitions: theory and an application to planning. New Gener. Comput. 17, 77–98 (1998)

    Article  Google Scholar 

  28. Richardson, M., Domingos, P.: Markov logic networks. Mach. Learn. 62(1–2), 107–136 (2006)

    Article  Google Scholar 

  29. Srinivasan, A.: The Aleph manual (2007). http://www.cs.ox.ac.uk/activities/machinelearning/Aleph/aleph. Accessed 15 Oct 2018

  30. Termehchy, A., Winslett, M., Chodpathumwan, Y.: How schema independent are schema free query interfaces? In: ICDE, pp. 645–660 (2011)

  31. Zeng, Q., Patel, J.M., Page, D.: QuickFOIL: scalable inductive logic programming. PVLDB 8(3), 197–208 (2014)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Jose Picado.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Picado, J., Termehchy, A., Fern, A. et al. Logical scalability and efficiency of relational learning algorithms. The VLDB Journal 28, 147–171 (2019). https://doi.org/10.1007/s00778-018-0523-8

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-018-0523-8

Keywords

Navigation