
Least Squares Approach for Multivariate Split Selection in Regression Trees

  • Conference paper
  • In: Intelligent Data Engineering and Automated Learning – IDEAL 2020

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12489)

Abstract

In the context of Industry 4.0, a growing number of data-driven models are used to improve industrial processes. These models need to be both accurate and interpretable. Regression Trees can fulfill these requirements, especially if their model flexibility is increased by multivariate splits that adapt to the process function. In this paper, a novel approach for multivariate split selection is presented. The direction of each split is determined by a first-order least squares model that adapts to the gradient of the process function in a local area. A forward selection method mitigates the curse of dimensionality, maintains interpretability, and produces a generalized split. The approach is implemented in CART as an extension of the existing algorithm, yielding the Least Squares Regression Tree (LSRT). In an extensive experimental evaluation, LSRT produces much smaller trees and higher prediction accuracy than univariate CART, and additionally shows low sensitivity to noise and performance improvements for high-dimensional input spaces and small data sets.
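To make the core idea concrete, the following is a minimal sketch, not the authors' implementation, of how a least-squares-based multivariate split could look. It assumes a simple setting: a first-order model is fitted to the data reaching a node, its coefficient vector is taken as the split direction (an estimate of the local gradient), and a threshold along that direction is chosen by minimizing the sum of squared errors. The paper's forward selection over input variables is omitted for brevity, and the function name least_squares_split is hypothetical.

```python
import numpy as np

def least_squares_split(X, y):
    """Hypothetical sketch: derive a multivariate split direction from a
    first-order least squares fit, then pick the best threshold along it."""
    n, d = X.shape
    # First-order model y ~ X @ w + b; the coefficient vector w
    # approximates the local gradient of the process function.
    A = np.hstack([X, np.ones((n, 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    w = coef[:d]
    # Project the samples onto the gradient direction.
    z = X @ w
    order = np.argsort(z)
    z_sorted, y_sorted = z[order], y[order]
    # Scan candidate thresholds: the split "w . x <= t" sends samples
    # left/right, each side predicts its mean, and we minimize total SSE.
    best_t, best_sse = None, np.inf
    for i in range(1, n):
        left, right = y_sorted[:i], y_sorted[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse = sse
            best_t = 0.5 * (z_sorted[i - 1] + z_sorted[i])
    return w, best_t, best_sse
```

A node would then route a sample x to the left child if w . x <= t. Because each split is a single linear combination (restricted further by forward selection in the paper), it stays far more interpretable than an arbitrary oblique split.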



Acknowledgements

This work was supported by the EFRE-NRW funding programme "Forschungsinfrastrukturen" (grant no. 34.EFRE–0300180).

Author information


Corresponding author

Correspondence to Marvin Schöne.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Schöne, M., Kohlhase, M. (2020). Least Squares Approach for Multivariate Split Selection in Regression Trees. In: Analide, C., Novais, P., Camacho, D., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2020. IDEAL 2020. Lecture Notes in Computer Science, vol 12489. Springer, Cham. https://doi.org/10.1007/978-3-030-62362-3_5


  • DOI: https://doi.org/10.1007/978-3-030-62362-3_5


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62361-6

  • Online ISBN: 978-3-030-62362-3

  • eBook Packages: Computer Science, Computer Science (R0)
