Efficient Sampling in Relational Feature Spaces

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3625)

Abstract

State-of-the-art algorithms implementing the ‘extended transformation approach’ to propositionalization use backtracking depth-first search to construct relational features (conjunctions of first-order atoms) that comply with the user’s mode/type declarations and a few basic syntactic conditions. As such, they incur a complexity factor exponential in the maximum allowed feature size. Here I present an alternative based on an efficient reduction of the feature construction problem to the propositional satisfiability (SAT) problem, such that the latter involves only Horn clauses and is therefore tractable: a model of a propositional Horn theory can be found without backtracking in time linear in the number of literals it contains. This reduction makes it possible either to efficiently enumerate the complete set of correct features (if their total number is polynomial in the maximum feature size), or otherwise to efficiently obtain a random sample from the uniform distribution on the feature space. The proposed sampling method can also efficiently provide an unbiased estimate of the total number of correct features entailed by the user’s language declaration.
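The tractability claim for Horn theories can be illustrated with a minimal sketch of backtrack-free forward chaining. This is not the reduction proposed in the paper, which the abstract does not spell out; it shows only the generic counter-based technique for finding the minimal model of a propositional Horn theory. The function name `horn_minimal_model` and the encoding of a clause as a `(body, head)` pair, with `head` set to `None` for a clause that has no positive literal, are assumptions made for this illustration.

```python
from collections import deque


def horn_minimal_model(clauses):
    """Return the minimal model of a Horn theory, or None if unsatisfiable.

    Hypothetical encoding: each clause is a pair (body, head) read as
    "body_1 and ... and body_k implies head"; head is None for a clause
    with no positive literal (the body must not be entirely true).
    """
    remaining = []   # per clause: body atoms not yet known to be true
    heads = []       # per clause: its head atom, or None
    watchers = {}    # atom -> indices of clauses whose body contains it
    queue = deque()  # atoms derived but not yet propagated

    for index, (body, head) in enumerate(clauses):
        body = set(body)
        remaining.append(len(body))
        heads.append(head)
        for atom in body:
            watchers.setdefault(atom, []).append(index)
        if not body:
            if head is None:
                return None      # the empty clause is unsatisfiable
            queue.append(head)   # a fact: the head must be true

    true_atoms = set()
    while queue:
        atom = queue.popleft()
        if atom in true_atoms:
            continue
        true_atoms.add(atom)
        # Each body occurrence is visited once, so the total work is
        # linear in the number of literals in the theory.
        for index in watchers.get(atom, ()):
            remaining[index] -= 1
            if remaining[index] == 0:
                if heads[index] is None:
                    return None  # a constraint clause is violated
                queue.append(heads[index])
    return true_atoms


theory = [((), "a"), (("a",), "b"), (("a", "b"), "c"), (("c", "d"), None)]
print(horn_minimal_model(theory))
```

Every atom outside the returned set is false in the minimal model. Because atoms are only ever switched from false to true and never retracted, no backtracking is needed.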

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Železný, F. (2005). Efficient Sampling in Relational Feature Spaces. In: Kramer, S., Pfahringer, B. (eds) Inductive Logic Programming. ILP 2005. Lecture Notes in Computer Science, vol 3625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11536314_24

  • DOI: https://doi.org/10.1007/11536314_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28177-1

  • Online ISBN: 978-3-540-31851-4

  • eBook Packages: Computer Science, Computer Science (R0)
