Abstract
We propose a new method for learning ensembles of process-based models for predictive modeling of dynamic systems from data and knowledge. Previous work has shown that ensembles based on sampling data (i.e., bagging), significantly improve predictive performance of process-based models. However, this improvement comes at the cost of a substantial computational overhead needed for learning. On the other hand, methods for constructing ensembles based on sampling knowledge (i.e., random library samples, RLS) allow for efficient learning ensembles of process-based models, while maintaining their long-term predictive performance. This paper aims at checking the conjecture whether the combination of these methods has a potential for further performance improvements. The proposed method, bagging of random library samples for learning ensembles of process-based models combines the afore-mentioned approaches in terms of sampling both data and knowledge. We apply the method to and evaluate its performance on a set of automated predictive modeling tasks in two lake ecosystems from data and library of knowledge for modeling population dynamics. The experimental results serve both to identify the optimal design decisions regarding the proposed method as well as to asses its predictive ability. The results show that such ensembles outperform single process-based model, but also outperform each of the two methods for learning ensembles from data samples (bagging) and knowledge samples (RLS).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Note that the sampling procedure does not require for generation of the power set \(\mathbb {P}(L)\).
- 2.
Predictions are obtained by simulating the model m on the test (and not the validation) set.
References
Aleksovski, D., Kocijan, J., Džeroski, S.: Ensembles of fuzzy linear model trees for the identification of multi-output systems. IEEE Trans. Fuzzy Syst. 24(4), 916–929 (2015)
Atanasova, N., Todorovski, L., Džeroski, S., Kompare, B.: Constructing a library of domain knowledge for automated modelling of aquatic ecosystems. Ecol. Model. 194(1–3), 14–36 (2006)
Atanasova, N., Todorovski, L., Džeroski, S., Remec, R., Recknagel, F., Kompare, B.: Automated modelling of a food web in Lake Bled using measured data and a library of domain knowledge. Ecol. Model. 194(1–3), 37–48 (2006)
Breiman, L., Friedman, J.H., Stone, C.J., Olshen, R.A.: Classification and Regression Trees. Chapman & Hall, London (1984)
Bridewell, W., Asadi, N.B., Langley, P.W., Todorovski, L.: Reducing overfitting in process model induction. In: Proceedings of the 22nd International Conference on Machine Learning, ICML 2005, pp. 81–88. ACM, New York (2005)
Bridewell, W., Langley, P.W., Todorovski, L., Džeroski, S.: Inductive process modeling. Mach. Learn. 71, 1–32 (2008)
Cohen, S.D., Hindmarsh, A.C.: CVODE, a stiff/nonstiff ODE solver in C. J. Comput. Phys. 10(2), 138–143 (1996)
Dietzel, A., Mieleitner, J., Kardaetz, S., Reichert, P.: Effects of changes in the driving forces on water quality and plankton dynamics in three swiss lakes – long-term simulations with BELAMO. Freshw. Biol. 58(1), 10–35 (2013)
Džeroski, S., Todorovski, L.: Modeling the dynamics of biological networks from time course data. In: Choi, S. (ed.) Systems Biology of Signaling Networks. Systems Biology, pp. 275–295. Springer, New York (2010)
Ho, T.K.: The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 20(8), 832–844 (1998)
Langley, P.W., Simon, H.A., Bradshaw, G., Zytkow, J.M.: Scientific Discovery: Computational Explorations of the Creative Processes. The MIT Press, MA (1987)
Ljung, L.: System identification - Theory for the User. Prentice-Hall, Upper Saddle River (1999)
Simidjievski, N., Todorovski, L., Džeroski, S.: Predicting long-term population dynamics with bagging and boosting of process-based models. Expert Syst. Appl. 42(22), 8484–8496 (2015)
Simidjievski, N., Todorovski, L., Džeroski, S.: Modeling dynamic systems with efficient ensembles of process-based models. PLoS ONE 11(4), 1–27 (2016)
Storn, R., Price, K.: Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 11(4), 341–359 (1997)
Tanevski, J., Todorovski, L., Džeroski, S.: Learning stochastic process-based models of dynamical systems from knowledge and data. BMC Syst. Biol. 10(1), 30 (2016)
Taškova, K., Šilc, J., Atanasova, N., Džeroski, S.: Parameter estimation in a nonlinear dynamic model of an aquatic ecosystem with meta-heuristic optimization. Ecol. Model. 226, 36–61 (2012)
Todorovski, L., Bridewell, W., Shiran, O., Langley, P.W.: Inducing hierarchical process models in dynamic domains. In: Veloso, M.M., Kambhampati, S. (eds.) Proceedings of the 20th National Conference on Artificial Intelligence, NCAI 2005, pp. 892–897. AAAI Press, Pittsburgh (2005)
Todorovski, L., Džeroski, S.: Integrating domain knowledge in equation discovery. In: Džeroski, S., Todorovski, L. (eds.) Computational Discovery of Scientific Knowledge. LNCS (LNAI), vol. 4660, pp. 69–97. Springer, Heidelberg (2007). doi:10.1007/978-3-540-73920-3_4
Čerepnalkoski, D., Taškova, K., Todorovski, L., Atanasova, N., Džeroski, S.: The influence of parameter fitting methods on model structure selection in automated modeling of aquatic ecosystems. Ecol. Model. 245(0), 136–165 (2012)
Štrumbelj, E., Kononenko, I.: An efficient explanation of individual classifications using game theory. J. Mach. Learn. Res. 11, 1–18 (2010)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Simidjievski, N., Todorovski, L., Džeroski, S. (2016). Learning Ensembles of Process-Based Models by Bagging of Random Library Samples. In: Calders, T., Ceci, M., Malerba, D. (eds) Discovery Science. DS 2016. Lecture Notes in Computer Science(), vol 9956. Springer, Cham. https://doi.org/10.1007/978-3-319-46307-0_16
Download citation
DOI: https://doi.org/10.1007/978-3-319-46307-0_16
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46306-3
Online ISBN: 978-3-319-46307-0
eBook Packages: Computer ScienceComputer Science (R0)