Abstract
We propose a general framework for regression models with functional response containing a potentially large number of flexible effects of functional and scalar covariates. Special emphasis is put on historical functional effects, where functional response and functional covariate are observed over the same interval and the response is only influenced by covariate values up to the current grid point. Because they account for chronology, historical functional effects are mostly used when functional response and covariate are observed on a common time interval. Our formulation allows for flexible integration limits including, e.g., lead or lag times. The functional responses can be observed on irregular curve-specific grids. Additionally, we introduce different parameterizations for historical effects and discuss identifiability issues. The models are estimated by a component-wise gradient boosting algorithm, which is suitable for models with a potentially high number of covariate effects, even more than observations, and inherently performs model selection. By minimizing corresponding loss functions, different features of the conditional response distribution can be modeled, including generalized and quantile regression models as special cases. The methods are implemented in the open-source R package FDboost. The methodological developments are motivated by biotechnological data on Escherichia coli fermentations, but cover a much broader model class.
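To make the notion of a historical effect concrete, the following is a minimal Python sketch, not the FDboost implementation. It approximates the integral of x(s) * beta(s, t) over s from 0 to t - lag by a simple Riemann sum on a grid. The function name `historical_effect`, its arguments, and the choice of Riemann weights are assumptions made purely for illustration.

```python
import numpy as np


def historical_effect(x, beta, s_grid, t_grid, lag=0.0):
    """Approximate h(t) = integral of x(s) * beta(s, t) ds over s <= t - lag.

    x      : covariate values on s_grid, shape (S,)
    beta   : coefficient surface beta(s, t), shape (S, T)
    s_grid : grid of the functional covariate, shape (S,), S >= 2
    t_grid : grid of the functional response, shape (T,)
    lag    : hypothetical lag time; 0 gives the plain historical effect
    """
    x = np.asarray(x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    s_grid = np.asarray(s_grid, dtype=float)
    t_grid = np.asarray(t_grid, dtype=float)

    # Riemann-sum integration weights (equal to the grid spacing on a uniform grid).
    weights = np.gradient(s_grid)

    # mask[i, j] is True when covariate time s_i may influence response time t_j.
    mask = s_grid[:, None] <= (t_grid[None, :] - lag)

    # Weighted sum over s for every t; entries outside the limits contribute zero.
    return np.sum(weights[:, None] * x[:, None] * beta * mask, axis=0)


if __name__ == "__main__":
    grid = np.array([0.0, 1.0, 2.0, 3.0])
    x = np.ones(4)
    beta = np.ones((4, 4))
    print(historical_effect(x, beta, grid, grid))
    print(historical_effect(x, beta, grid, grid, lag=1.0))
```

The triangular mask is what encodes chronology in this sketch: changing the covariate at a later time point leaves the effect at all earlier response times unchanged, and a positive `lag` simply shifts the upper integration limit.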




Acknowledgments
Special thanks to Markus Luchner and Gerald Striedner for providing us with the fermentation data. We thank Fabian Scheipl for useful discussions. The work of Sarah Brockhaus and Sonja Greven was supported by the German Research Foundation (DFG) through Emmy Noether grant GR 3793/1-1. The work of Michael Melcher and Friedrich Leisch was supported by the Federal Ministry of Traffic, Innovation and Technology (bmvit), the Federal Ministry of Economy, Family and Youth (BMWFJ), the Styrian Business Promotion Agency (SFG), the Standortagentur Tirol and the ZIT Technology Agency of the City of Vienna through the COMET-Funding Program managed by the Austrian Research Promotion Agency (FFG). The computational results presented have been in part achieved using the Vienna Scientific Cluster (VSC). We thank the reviewers and the associate editor for their useful comments.
Electronic supplementary material
Below is the link to the electronic supplementary material.
11222_2016_9662_MOESM1_ESM.zip
Web Appendices, Tables, and Figures referenced in Sects. 2, 3, 5, and 6 as well as reproducible R code for the simulation study are available online with this paper. (ZIP 1533 KB)
Cite this article
Brockhaus, S., Melcher, M., Leisch, F. et al. Boosting flexible functional regression models with a high number of functional historical effects. Stat Comput 27, 913–926 (2017). https://doi.org/10.1007/s11222-016-9662-1
DOI: https://doi.org/10.1007/s11222-016-9662-1