Abstract
This paper tackles the problem that current methods for propositionalization and feature construction in first-order logic construct features in a rather unspecific way. That is, they construct features in advance rather than “on demand”, without detecting the need for a representation change. Even when structural features are required, current methods do not construct them in a goal-directed fashion.
In previous work, we presented a method that creates structural features in a class-sensitive manner: we queried the molecular feature miner (MolFea) for features (linear molecular fragments) with a minimum frequency in the positive examples and a maximum frequency in the negative examples, such that they are statistically significantly over-represented in the positives and under-represented in the negatives. In the present paper, we go one step further: we construct structural features in order to discriminate between those examples from different classes that are particularly problematic to classify. To avoid overfitting, this is done in a boosting framework. We alternate AdaBoost re-weighting episodes and feature construction episodes in order to construct structural features “on demand”. In a feature construction episode, we query for features with a minimum cumulative weight in the positives and a maximum cumulative weight in the negatives, where the weights stem from the previous AdaBoost iteration. In summary, we propose to construct structural features “on demand” by combining AdaBoost with an extension of MolFea that handles weighted learning instances.
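The alternation of re-weighting and feature construction episodes can be illustrated with a minimal sketch. It is not the MolFea implementation: brute-force substring enumeration over toy strings stands in for mining linear molecular fragments, the weak learner is a simple presence test, and all names (`fragments`, `query_features`, `best_stump`, `boost_on_demand`), the toy data, and the thresholds `min_pos` and `max_neg` are assumptions made for illustration.

```python
import math

# Toy data (hypothetical): strings stand in for molecules, +1 / -1 are class labels.
EXAMPLES = ["CCO", "CCN", "OCO", "NNO", "NCN", "ONN"]
LABELS = [1, 1, 1, -1, -1, -1]


def fragments(molecule, max_len=3):
    """Contiguous substrings up to max_len; a toy stand-in for linear fragments."""
    return {molecule[i:i + k]
            for k in range(1, max_len + 1)
            for i in range(len(molecule) - k + 1)}


def query_features(examples, labels, weights, min_pos, max_neg, max_len=3):
    """Fragments with cumulative weight >= min_pos in the positives
    and <= max_neg in the negatives (brute force, not a levelwise search)."""
    candidates = set()
    for x in examples:
        candidates |= fragments(x, max_len)
    selected = []
    for f in sorted(candidates):
        w_pos = sum(w for x, y, w in zip(examples, labels, weights)
                    if y == 1 and f in x)
        w_neg = sum(w for x, y, w in zip(examples, labels, weights)
                    if y == -1 and f in x)
        if w_pos >= min_pos and w_neg <= max_neg:
            selected.append(f)
    return selected


def best_stump(examples, labels, weights, features):
    """Weak learner: predict +1 if the fragment occurs, else -1.
    Returns (weighted error, fragment) with the lowest weighted error."""
    best = None
    for f in features:
        err = sum(w for x, y, w in zip(examples, labels, weights)
                  if (1 if f in x else -1) != y)
        if best is None or err < best[0]:
            best = (err, f)
    return best


def boost_on_demand(examples, labels, rounds=2, min_pos=0.25, max_neg=0.05):
    """Alternate feature construction episodes and AdaBoost re-weighting."""
    n = len(examples)
    weights = [1.0 / n] * n
    ensemble, history = [], []
    for _ in range(rounds):
        # Feature construction episode: query with the current weights.
        feats = query_features(examples, labels, weights, min_pos, max_neg)
        if not feats:
            break
        err, f = best_stump(examples, labels, weights, feats)
        if err >= 0.5:
            break
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, f))
        history.append({"queried": feats, "chosen": f, "error": err})
        # AdaBoost re-weighting episode: misclassified examples gain weight.
        weights = [w * math.exp(-alpha * y * (1 if f in x else -1))
                   for x, y, w in zip(examples, labels, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble, history


def predict(ensemble, molecule):
    """Sign of the weighted vote of all presence-test stumps."""
    score = sum(a * (1 if f in molecule else -1) for a, f in ensemble)
    return 1 if score >= 0 else -1
```

Calling `boost_on_demand(EXAMPLES, LABELS)` returns the weighted stumps together with a per-round history of the queried fragments. On this toy data, the first stump misses one positive example, and the re-weighting shifts enough weight onto it that the second query returns fragments covering that example, which is the “on demand” effect the abstract describes.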
© 2001 Springer-Verlag Berlin Heidelberg
Kramer, S. (2001). Demand-Driven Construction of Structural Features in ILP. In: Rouveirol, C., Sebag, M. (eds) Inductive Logic Programming. ILP 2001. Lecture Notes in Computer Science, vol 2157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44797-0_11
DOI: https://doi.org/10.1007/3-540-44797-0_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42538-0
Online ISBN: 978-3-540-44797-9