Demand-Driven Construction of Structural Features in ILP

Kramer, Stefan

doi:10.1007/3-540-44797-0_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2157)

Included in the following conference series:

International Conference on Inductive Logic Programming

Abstract

This paper tackles the problem that methods for propositionalization and feature construction in first-order logic to date construct features in a rather unspecific way. That is, they do not construct features “on demand”, but rather in advance and without detecting the need for a representation change. Even if structural features are required, current methods do not construct these features in a goal-directed fashion.

In previous work, we presented a method that creates structural features in a class-sensitive manner: we queried the molecular feature miner (MolFea) for features (linear molecular fragments) with a minimum frequency in the positive examples and a maximum frequency in the negative examples, such that they are statistically significantly over-represented in the positives and under-represented in the negatives. In the present paper, we go one step further: we construct structural features in order to discriminate between those examples from different classes that are particularly problematic to classify. To avoid overfitting, this is done in a boosting framework. We alternate AdaBoost re-weighting episodes and feature construction episodes in order to construct structural features “on demand”. In a feature construction episode, we query for features with a minimum cumulative weight in the positives and a maximum cumulative weight in the negatives, where the weights stem from the previous AdaBoost iteration. In summary, we propose to construct structural features “on demand” by combining AdaBoost with an extension of MolFea that handles weighted learning instances.
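
The alternation described above can be illustrated with a minimal sketch. It is not MolFea and not the paper's implementation: molecules are plain strings, "linear fragments" are approximated by substrings, the weak hypothesis is a single-fragment test, and all function names and thresholds (`query_features`, `min_pos`, `max_neg`, and so on) are hypothetical choices made only for illustration.

```python
import math

def fragments(s, max_len=3):
    """All substrings up to max_len: a toy stand-in for linear molecular fragments."""
    return {s[i:i + k] for k in range(1, max_len + 1) for i in range(len(s) - k + 1)}

def query_features(examples, labels, weights, min_pos, max_neg, max_len=3):
    """Fragments with cumulative weight >= min_pos in the positives
    and cumulative weight <= max_neg in the negatives."""
    pos_w, neg_w = {}, {}
    for x, y, w in zip(examples, labels, weights):
        target = pos_w if y == 1 else neg_w
        for f in fragments(x, max_len):
            target[f] = target.get(f, 0.0) + w
    return sorted(f for f, w in pos_w.items()
                  if w >= min_pos and neg_w.get(f, 0.0) <= max_neg)

def vote(fragment, x):
    """Assumed weak hypothesis: +1 if the fragment occurs in x, else -1."""
    return 1 if fragment in x else -1

def boost_with_feature_construction(examples, labels, rounds=3,
                                    min_pos=0.2, max_neg=0.1):
    """Alternate feature construction episodes and AdaBoost re-weighting episodes."""
    n = len(examples)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, fragment) pairs
    for _ in range(rounds):
        # Feature construction episode: query with the current example weights.
        candidates = query_features(examples, labels, weights, min_pos, max_neg)
        if not candidates:
            break

        def weighted_error(f):
            return sum(w for x, y, w in zip(examples, labels, weights)
                       if vote(f, x) != y)

        best = min(candidates, key=weighted_error)
        err = weighted_error(best)
        if err >= 0.5:
            break  # no candidate is better than chance
        err = max(err, 1e-10)  # avoid division by zero for a perfect fragment
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best))
        # AdaBoost re-weighting episode: misclassified examples gain weight.
        weights = [w * math.exp(-alpha * y * vote(best, x))
                   for x, y, w in zip(examples, labels, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all selected fragments."""
    score = sum(alpha * vote(f, x) for alpha, f in ensemble)
    return 1 if score >= 0 else -1

# Toy data (invented for illustration): positives contain an "O", negatives do not.
molecules = ["CCO", "OCC", "CCN", "NCC"]
classes = [1, 1, -1, -1]
model = boost_with_feature_construction(molecules, classes)
```

Because the query thresholds are applied to cumulative weights rather than raw counts, examples that earlier rounds misclassified contribute more to a fragment's score, which is what steers feature construction toward the hard cases.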

Author information

Authors and Affiliations

Institute for Computer Science Machine Learning Lab, Albert-Ludwigs-University, Georges-Köhler-Allee, Gebäude 079, D-79110, Freiburg i. Brg, Germany
Stefan Kramer

Editor information

Editors and Affiliations

Université Paris Sud, LRI, bât. 490, 91405, Orsay, France
Céline Rouveirol
Ecole Polytechnique, LMS, 91128, Palaiseau, France
Michèle Sebag

About this paper

Cite this paper

Kramer, S. (2001). Demand-Driven Construction of Structural Features in ILP. In: Rouveirol, C., Sebag, M. (eds) Inductive Logic Programming. ILP 2001. Lecture Notes in Computer Science, vol 2157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44797-0_11

DOI: https://doi.org/10.1007/3-540-44797-0_11
Published: 30 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42538-0
Online ISBN: 978-3-540-44797-9
eBook Packages: Springer Book Archive
