Learning Ensembles of Process-Based Models by Bagging of Random Library Samples

Simidjievski, Nikola; Todorovski, Ljupčo; Džeroski, Sašo

doi:10.1007/978-3-319-46307-0_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9956)

Included in the following conference series:

International Conference on Discovery Science

Abstract

We propose a new method for learning ensembles of process-based models for predictive modeling of dynamic systems from data and knowledge. Previous work has shown that ensembles based on sampling data (i.e., bagging) significantly improve the predictive performance of process-based models. However, this improvement comes at the cost of substantial computational overhead during learning. On the other hand, methods for constructing ensembles based on sampling knowledge (i.e., random library samples, RLS) allow for efficient learning of ensembles of process-based models while maintaining their long-term predictive performance. This paper tests the conjecture that combining these two methods can yield further performance improvements. The proposed method, bagging of random library samples for learning ensembles of process-based models, combines the aforementioned approaches by sampling both data and knowledge. We apply the method to, and evaluate its performance on, a set of automated predictive modeling tasks in two lake ecosystems, using data and a library of knowledge for modeling population dynamics. The experimental results serve both to identify the optimal design decisions for the proposed method and to assess its predictive ability. The results show that such ensembles outperform not only a single process-based model, but also each of the two methods for learning ensembles from data samples (bagging) and knowledge samples (RLS).
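
The idea of sampling both data and knowledge can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `fraction` parameter, the `learn` callback (standing in for a process-based model learner), and the choice of averaging member predictions are all assumptions made for illustration.

```python
import random
from statistics import mean


def bootstrap_indices(n, rng):
    """Draw n indices with replacement (a bootstrap sample of the data)."""
    return [rng.randrange(n) for _ in range(n)]


def random_library_sample(library, fraction, rng):
    """Draw a random subset of library components without replacement.

    `fraction` is a hypothetical parameter controlling the subset size.
    """
    k = max(1, round(fraction * len(library)))
    return rng.sample(library, k)


def build_ensemble(data, library, learn, n_members=10, fraction=0.5, seed=0):
    """Learn one model per ensemble member from a data sample and a library sample.

    `learn(data_sample, library_sample)` is a placeholder for a process-based
    model learner; it must return a callable model.
    """
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        data_sample = [data[i] for i in bootstrap_indices(len(data), rng)]
        library_sample = random_library_sample(library, fraction, rng)
        members.append(learn(data_sample, library_sample))
    return members


def predict(members, x):
    """Combine member predictions by averaging (an assumed combining scheme)."""
    return mean(m(x) for m in members)
```

Each ensemble member sees a different bootstrap sample of the data and a different subset of the library, which is the sense in which the two sampling schemes are combined here.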

Notes

1.
Note that the sampling procedure does not require generating the power set $\mathbb {P}(L)$.
2.
Predictions are obtained by simulating the model m on the test (and not the validation) set.
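
To see why a subset of a library L can be sampled without generating the power set, consider the following Python sketch. It shows one generic way to do this and is not necessarily the procedure used in the paper; the function name `sample_subset`, the parameter `p`, and the component names in the usage note are hypothetical.

```python
import random


def sample_subset(library, rng, p=0.5):
    """Return a random subset of `library` by deciding on each component independently.

    Only one pass over the library is needed, so the 2^|L| subsets in the
    power set are never enumerated.
    """
    return [component for component in library if rng.random() < p]
```

For example, `sample_subset(["growth", "grazing", "respiration"], random.Random(0))` returns some subset of the three components. With `p=0.5`, every subset of L is equally likely, because each of the 2^|L| subsets is drawn with probability (1/2)^|L|.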

Author information

Authors and Affiliations

Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia
Nikola Simidjievski & Sašo Džeroski
Jožef Stefan International Postgraduate School, Ljubljana, Slovenia
Nikola Simidjievski & Sašo Džeroski
Faculty of Administration, University of Ljubljana, Ljubljana, Slovenia
Ljupčo Todorovski

Corresponding author

Correspondence to Nikola Simidjievski.

Editor information

Editors and Affiliations

Universiteit Antwerpen, Campus Middelhe, M.G.103a, Antwerp, Belgium
Toon Calders
Università degli Studi di Bari Aldo Moro, Bari, Italy
Michelangelo Ceci
Bari, Italy
Donato Malerba

Cite this paper

Simidjievski, N., Todorovski, L., Džeroski, S. (2016). Learning Ensembles of Process-Based Models by Bagging of Random Library Samples. In: Calders, T., Ceci, M., Malerba, D. (eds) Discovery Science. DS 2016. Lecture Notes in Computer Science, vol 9956. Springer, Cham. https://doi.org/10.1007/978-3-319-46307-0_16

DOI: https://doi.org/10.1007/978-3-319-46307-0_16
Published: 21 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46306-3
Online ISBN: 978-3-319-46307-0
eBook Packages: Computer Science, Computer Science (R0)
