Boosting flexible functional regression models with a high number of functional historical effects

Statistics and Computing

Abstract

We propose a general framework for regression models with functional response containing a potentially large number of flexible effects of functional and scalar covariates. Special emphasis is put on historical functional effects, where functional response and functional covariate are observed over the same interval and the response is only influenced by covariate values up to the current grid point. Historical functional effects are mostly used when functional response and covariate are observed on a common time interval, as they account for chronology. Our formulation allows for flexible integration limits including, e.g., lead or lag times. The functional responses can be observed on irregular curve-specific grids. Additionally, we introduce different parameterizations for historical effects and discuss identifiability issues. The models are estimated by a component-wise gradient boosting algorithm which is suitable for models with a potentially high number of covariate effects, even more than observations, and inherently does model selection. By minimizing corresponding loss functions, different features of the conditional response distribution can be modeled, including generalized and quantile regression models as special cases. The methods are implemented in the open-source R package FDboost. The methodological developments are motivated by biotechnological data on Escherichia coli fermentations, but cover a much broader model class.
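
As an illustration of the component-wise gradient boosting idea described in the abstract, the following is a minimal sketch in Python. It is not the FDboost implementation. It assumes squared-error loss, scalar responses and covariates, and one simple linear base learner per centered covariate. The function name `componentwise_l2_boost` and the parameters `n_iter` (number of boosting iterations) and `nu` (step length) are hypothetical names chosen for this example. In each iteration every base learner is fitted to the current residuals, which are the negative gradient of the squared-error loss, and only the best-fitting one is updated. This is what allows the procedure to handle more covariates than observations and to select covariates along the way.

```python
import random


def componentwise_l2_boost(X, y, n_iter=100, nu=0.1):
    """Component-wise L2 boosting with one linear base learner per covariate.

    X is a list of rows (lists of floats), y a list of floats.
    Returns (offset, coef); coef[j] applies to the centered covariate j.
    Illustrative sketch only, not the FDboost algorithm.
    """
    n, p = len(X), len(X[0])
    # Center each covariate so the offset can simply be the response mean.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    ss = [sum(v * v for v in col) for col in cols]  # sums of squares

    offset = sum(y) / n
    coef = [0.0] * p
    fitted = [offset] * n

    for _ in range(n_iter):
        # Negative gradient of the squared-error loss: the residuals.
        resid = [yi - fi for yi, fi in zip(y, fitted)]
        best_j, best_beta, best_gain = None, 0.0, 0.0
        for j in range(p):
            if ss[j] == 0.0:
                continue  # constant covariate, nothing to fit
            # Least-squares fit of base learner j to the residuals.
            beta = sum(c * r for c, r in zip(cols[j], resid)) / ss[j]
            gain = beta * beta * ss[j]  # reduction in residual sum of squares
            if gain > best_gain:
                best_j, best_beta, best_gain = j, beta, gain
        if best_j is None:
            break  # residuals are orthogonal to every covariate
        # Update only the selected component, shrunken by the step length.
        coef[best_j] += nu * best_beta
        fitted = [f + nu * best_beta * c for f, c in zip(fitted, cols[best_j])]
    return offset, coef


# Toy data with more covariates (30) than observations (10);
# the response depends on covariate 3 only.
rng = random.Random(0)
n_obs, n_cov = 10, 30
X = [[rng.gauss(0.0, 1.0) for _ in range(n_cov)] for _ in range(n_obs)]
y = [2.0 * row[3] for row in X]

offset, coef = componentwise_l2_boost(X, y)
selected = [j for j, b in enumerate(coef) if b != 0.0]
print("selected covariates:", selected)
print("coefficient of covariate 3:", round(coef[3], 3))
```

Covariates that are never selected keep a coefficient of exactly zero, which is the sense in which the algorithm performs model selection. In practice the number of iterations would be chosen by resampling rather than fixed in advance, and the base learners for functional effects are considerably richer than the simple linear ones used here.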

Acknowledgments

Special thanks to Markus Luchner and Gerald Striedner for providing us with the fermentation data. We thank Fabian Scheipl for useful discussions. The work of Sarah Brockhaus and Sonja Greven was supported by the German Research Foundation (DFG) through Emmy Noether grant GR 3793/1-1. The work of Michael Melcher and Friedrich Leisch was supported by the Federal Ministry of Traffic, Innovation and Technology (bmvit), the Federal Ministry of Economy, Family and Youth (BMWFJ), the Styrian Business Promotion Agency (SFG), the Standortagentur Tirol and the ZIT Technology Agency of the City of Vienna through the COMET-Funding Program managed by the Austrian Research Promotion Agency (FFG). The computational results presented have been in part achieved using the Vienna Scientific Cluster (VSC). We thank the reviewers and the associate editor for their useful comments.

Author information

Corresponding author

Correspondence to Sarah Brockhaus.

Electronic supplementary material

Below is the link to the electronic supplementary material.

11222_2016_9662_MOESM1_ESM.zip

Web Appendices, Tables, and Figures referenced in Sects. 2, 3, 5, and 6 as well as reproducible R code for the simulation study are available online with this paper. (ZIP 1533 KB)

Cite this article

Brockhaus, S., Melcher, M., Leisch, F. et al. Boosting flexible functional regression models with a high number of functional historical effects. Stat Comput 27, 913–926 (2017). https://doi.org/10.1007/s11222-016-9662-1

  • DOI: https://doi.org/10.1007/s11222-016-9662-1
