Efficient Sampling in Relational Feature Spaces

Železný, Filip

doi:10.1007/11536314_24

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3625)

Abstract

State-of-the-art algorithms implementing the ‘extended transformation approach’ to propositionalization use backtracking depth-first search to construct relational features (conjunctions of first-order atoms) that comply with the user’s mode/type declarations and a few basic syntactic conditions. As such, they incur a complexity factor exponential in the maximum allowed feature size. Here I present an alternative based on an efficient reduction of the feature construction problem to the propositional satisfiability (SAT) problem, such that the latter involves only Horn clauses and is therefore tractable: a model of a propositional Horn theory can be found without backtracking in time linear in the number of literals it contains. This reduction makes it possible either to efficiently enumerate the complete set of correct features (if their total number is polynomial in the maximum feature size), or otherwise to efficiently obtain a random sample from the uniform distribution on the feature space. The proposed sampling method can also efficiently provide an unbiased estimate of the total number of correct features entailed by the user language declaration.
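The tractability claim rests on a standard property of Horn theories: satisfiability can be decided, and a model produced, by forward chaining with no backtracking. The following is a minimal sketch of that textbook procedure only; it is not the paper's reduction from feature construction, which this page does not show. The function name `horn_sat` and the `(body, head)` clause encoding are assumptions made for illustration.

```python
from collections import deque


def horn_sat(clauses):
    """Decide satisfiability of a propositional Horn theory by forward chaining.

    Hypothetical encoding (an assumption of this sketch): each clause is a
    pair (body, head), read as "body_1 and ... and body_n implies head".
    head is None for a clause with no positive literal (a pure constraint).
    Returns the minimal model as a set of true atoms, or None if unsatisfiable.
    """
    remaining = []   # per clause: number of body atoms not yet derived
    watchers = {}    # atom -> indices of clauses whose body contains it
    queue = deque()  # atoms that must be true but are not yet propagated

    for i, (body, head) in enumerate(clauses):
        body = set(body)
        remaining.append(len(body))
        for atom in body:
            watchers.setdefault(atom, []).append(i)
        if not body:
            if head is None:
                return None  # the empty clause is unsatisfiable
            queue.append(head)

    true_atoms = set()
    while queue:
        atom = queue.popleft()
        if atom in true_atoms:
            continue
        true_atoms.add(atom)
        # Each (clause, body atom) pair is visited at most once, so the total
        # work is linear in the number of literals in the theory.
        for i in watchers.get(atom, ()):
            remaining[i] -= 1
            if remaining[i] == 0:
                head = clauses[i][1]
                if head is None:
                    return None  # a constraint's body became fully true
                queue.append(head)
    return true_atoms


# Usage: a small theory with one fact and three rules.
theory = [((), "a"), (("a",), "b"), (("a", "b"), "c"), (("d",), "e")]
model = horn_sat(theory)
```

Every atom is set to true at most once and never retracted, which is why no backtracking is needed. Atoms that are never derived (here `d` and `e`) stay false in the returned minimal model.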

Author information

Authors and Affiliations

Czech Technical University in Prague, Technická 6, 166 27, Prague 6, Czech Republic
Filip Železný

Editor information

Editors and Affiliations

Institut für Informatik I12, Technische Universität München, Boltzmannstr. 3, D-85748, Garching b. München, Germany
Stefan Kramer
Department of Computer Science, University of Waikato, Hamilton, New Zealand
Bernhard Pfahringer

About this paper

Cite this paper

Železný, F. (2005). Efficient Sampling in Relational Feature Spaces. In: Kramer, S., Pfahringer, B. (eds) Inductive Logic Programming. ILP 2005. Lecture Notes in Computer Science, vol 3625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11536314_24

DOI: https://doi.org/10.1007/11536314_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28177-1
Online ISBN: 978-3-540-31851-4
eBook Packages: Computer Science, Computer Science (R0)
