Inductive Queries on Polynomial Equations

Džeroski, Sašo; Todorovski, Ljupčo; Ljubič, Peter

doi:10.1007/11615576_7

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3848)

Abstract

Inductive databases (IDBs) contain both data and patterns. Inductive queries (IQs) are used to access, generate and manipulate the patterns in the IDB. IQs are conjunctions of primitive constraints that have to be satisfied by target patterns; the available constraints can differ between types of patterns. Constraint-based data mining algorithms are used to answer IQs.
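The idea of an inductive query as a conjunction of primitive constraints can be illustrated with a minimal Python sketch. It is not taken from the chapter: the `Polynomial` representation, the constraint builders `max_degree`, `max_terms` and `contains_term`, and the function `answer_query` are hypothetical names introduced only for illustration, and the query is answered by simple filtering of a given set of patterns rather than by a mining algorithm.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# A term is a tuple of (variable, exponent) pairs, e.g. (("x", 2), ("z", 1)) is x^2 * z.
Term = Tuple[Tuple[str, int], ...]


@dataclass(frozen=True)
class Polynomial:
    """Structure of a polynomial equation: target = sum of terms (coefficients omitted)."""
    target: str
    terms: Tuple[Term, ...]

    def degree(self) -> int:
        # Degree of the polynomial is the largest total exponent of any term.
        return max((sum(e for _, e in t) for t in self.terms), default=0)

    def size(self) -> int:
        return len(self.terms)


# A primitive constraint is a predicate over patterns (an assumption of this sketch).
Constraint = Callable[[Polynomial], bool]


def max_degree(d: int) -> Constraint:
    return lambda p: p.degree() <= d


def max_terms(r: int) -> Constraint:
    return lambda p: p.size() <= r


def contains_term(term: Term) -> Constraint:
    return lambda p: term in p.terms


def answer_query(patterns: Iterable[Polynomial],
                 constraints: List[Constraint]) -> List[Polynomial]:
    """Return the patterns that satisfy every constraint (the conjunction)."""
    return [p for p in patterns if all(c(p) for c in constraints)]


X = (("x", 1),)
p1 = Polynomial("y", (X,))                          # y = c1*x
p2 = Polynomial("y", ((("x", 1), ("z", 1)), X))     # y = c1*x*z + c2*x
p3 = Polynomial("y", ((("x", 3),),))                # y = c1*x^3

query = [max_degree(2), contains_term(X)]
result = answer_query([p1, p2, p3], query)
```

Here `result` holds `p1` and `p2`: both have degree at most 2 and contain the term `x`, while `p3` violates both constraints. Adding `max_terms(1)` to the conjunction would leave only `p1`.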

So far, work in the framework of IDBs has mostly considered the problem of mining frequent patterns: the types of patterns considered include frequent itemsets, episodes, Datalog queries, sequences, and molecular fragments. Here we consider the problem of constraint-based mining for predictive models, where the data mining task is regression and the models are polynomial equations.

More specifically, we first define the pattern domain of polynomial equations. We then present a complete and a heuristic solver for this domain. We evaluate the use of the heuristic solver on standard regression problems and illustrate its use on a toy problem of reconstructing a biochemical reaction network. Finally, we consider the use of a combination of different pattern domains (molecular fragments and polynomial equations) for practical applications in modeling quantitative structure-activity relationships (QSARs).
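What a complete solver for polynomial equations could look like can be sketched as follows. This is an illustrative assumption, not the solver presented in the chapter: it exhaustively enumerates polynomial structures up to a maximum degree and number of terms, fits coefficients by ordinary least squares, and returns every structure whose mean squared error is below a threshold. All function names and the choice of error measure are hypothetical.

```python
from itertools import combinations, combinations_with_replacement


def enumerate_terms(variables, max_degree):
    """All monomials of degree 1..max_degree, each a sorted tuple of variable names."""
    return [t for d in range(1, max_degree + 1)
            for t in combinations_with_replacement(sorted(variables), d)]


def term_value(term, row):
    value = 1.0
    for name in term:
        value *= row[name]
    return value


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting; returns None if the system is singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit(terms, rows, target):
    """Least-squares coefficients (constant first) via normal equations, plus the MSE."""
    design = [[1.0] + [term_value(t, r) for t in terms] for r in rows]
    k = len(design[0])
    ata = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    aty = [sum(d[i] * r[target] for d, r in zip(design, rows)) for i in range(k)]
    coeffs = solve_linear(ata, aty)
    if coeffs is None:
        return None, float("inf")
    mse = sum((sum(c * v for c, v in zip(coeffs, d)) - r[target]) ** 2
              for d, r in zip(design, rows)) / len(rows)
    return coeffs, mse


def complete_solver(rows, target, variables, max_degree, max_terms, max_mse):
    """Try every structure within the bounds and keep those meeting the error constraint."""
    candidates = enumerate_terms(variables, max_degree)
    answers = []
    for n in range(1, max_terms + 1):
        for terms in combinations(candidates, n):
            coeffs, mse = fit(terms, rows, target)
            if mse <= max_mse:
                answers.append((terms, coeffs, mse))
    return answers


# Synthetic data generated from y = 2 + 3*x*z on a small grid.
rows = [{"x": float(x), "z": float(z), "y": 2.0 + 3.0 * x * z}
        for x in range(4) for z in range(4)]
answers = complete_solver(rows, "y", ["x", "z"],
                          max_degree=2, max_terms=1, max_mse=1e-9)
```

Exhaustive enumeration grows combinatorially with the number of variables, the degree, and the number of terms, which is the usual motivation for a heuristic solver that explores only part of this space.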

Author information

Authors and Affiliations

Jožef Stefan Institute, Jamova 39, 1000, Ljubljana, Slovenia
Sašo Džeroski, Ljupčo Todorovski & Peter Ljubič

Editor information

Editors and Affiliations

INSA-Lyon, LIRIS CNRS UMR5205, F-69621, Villeurbanne, France
Jean-François Boulicaut
Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001, Heverlee, Belgium
Luc De Raedt
HIIT, Helsinki University of Technology and University of Helsinki, Finland
Heikki Mannila

About this paper

Cite this paper

Džeroski, S., Todorovski, L., Ljubič, P. (2006). Inductive Queries on Polynomial Equations. In: Boulicaut, J.-F., De Raedt, L., Mannila, H. (eds) Constraint-Based Mining and Inductive Databases. Lecture Notes in Computer Science, vol 3848. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11615576_7

DOI: https://doi.org/10.1007/11615576_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31331-1
Online ISBN: 978-3-540-31351-9
eBook Packages: Computer Science, Computer Science (R0)