
Incremental Natural Gradient Boosting for Probabilistic Regression

  • Conference paper
Advanced Data Mining and Applications (ADMA 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14176)


Abstract

The natural gradient boosting method for probabilistic regression (NGBoost) predicts not only point estimates but also the conditional distribution of the target given each input sample, thereby quantifying prediction uncertainty. However, NGBoost is designed only for the batch setting, which is not well suited to data stream learning. In this paper, we present an incremental natural gradient boosting method for probabilistic regression (INGBoost). The proposed method uses the reduction in a scoring rule as its split metric and applies the Hoeffding inequality to incrementally construct decision trees that fit the natural gradient, thereby achieving incremental natural gradient boosting. Experimental results demonstrate that INGBoost performs well on both point regression and probabilistic regression tasks while retaining the interpretability of tree models, and that its model size is significantly smaller than that of NGBoost.
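The abstract names two ingredients: natural gradients computed under a proper scoring rule, and Hoeffding-inequality split tests for incremental tree construction. The full paper is behind the paywall here, so the following is only a minimal illustrative sketch of those two ingredients, assuming a Normal output distribution scored by the negative log likelihood (the setting used in the original NGBoost paper). All function names are hypothetical and this is not the authors' INGBoost code.

```python
import math

import numpy as np


def natural_gradient_normal(y, mu, log_sigma):
    """Natural gradient of the log score (NLL) for Normal(mu, sigma) with
    parameters (mu, log sigma): the ordinary gradient preconditioned by the
    inverse Fisher information, diag(sigma^2, 1/2) in this parameterization."""
    sigma2 = np.exp(2.0 * log_sigma)
    grad_mu = -(y - mu) / sigma2                    # d NLL / d mu
    grad_log_sigma = 1.0 - (y - mu) ** 2 / sigma2   # d NLL / d log(sigma)
    return np.stack([sigma2 * grad_mu, 0.5 * grad_log_sigma], axis=-1)


def hoeffding_bound(value_range, delta, n):
    """With probability at least 1 - delta, the mean of n observations of a
    variable with range `value_range` lies within this epsilon of the true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Hoeffding-tree style test: commit to the best split candidate once its
    advantage over the runner-up exceeds the Hoeffding bound."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)
```

In a Hoeffding-tree regressor, `best_gain` and `second_gain` would be the observed scoring-rule reductions of the two top split candidates at a leaf; the trees fit the natural gradients produced above. INGBoost's exact criterion and tree-update rules are given in the full paper.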



Acknowledgements

This work was supported by the provincial scientific research institutes' achievement transformation project of the Science and Technology Department of Sichuan Province, China (2023JDZH0011).

Author information

Corresponding author

Correspondence to Hui Zhang.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wu, W., Zhang, H., Yang, C., Li, B., Zhao, X. (2023). Incremental Natural Gradient Boosting for Probabilistic Regression. In: Yang, X., et al. (eds.) Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science (LNAI), vol 14176. Springer, Cham. https://doi.org/10.1007/978-3-031-46661-8_32


  • DOI: https://doi.org/10.1007/978-3-031-46661-8_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46660-1

  • Online ISBN: 978-3-031-46661-8

  • eBook Packages: Computer Science (R0)
