Abstract
In the context of Industry 4.0, an increasing number of data-driven models are used to improve industrial processes. These models need to be both accurate and interpretable. Regression trees can fulfill these requirements, especially if their model flexibility is increased by multivariate splits that adapt to the process function. In this paper, a novel approach for multivariate split selection is presented. The direction of the split is determined by a first-order least-squares model that adapts to the gradient of the process function in a local area. A forward selection method weakens the curse of dimensionality, maintains interpretability, and yields a generalized split. The approach is implemented in CART as an extension of the existing algorithm, producing the Least Squares Regression Tree (LSRT). In an extensive experimental analysis, LSRT leads to much smaller trees and higher prediction accuracy than univariate CART; furthermore, it shows low sensitivity to noise and performance improvements for high-dimensional input spaces and small data sets.
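The core idea described in the abstract can be sketched in a few lines: at a tree node, fit a first-order least-squares model to the local data, take its coefficient vector as the multivariate split direction (an estimate of the local gradient), and split samples by thresholding their projection onto that direction. This is a minimal illustration of the principle, not the authors' implementation; the function name `ls_split` and the median-based threshold are assumptions made here for brevity (the paper selects the split point and features via forward selection).

```python
import numpy as np

def ls_split(X, y):
    """Return (w, t): a split direction from an OLS fit and a threshold.

    Samples with X @ w <= t go to the left child, the rest go right.
    Hypothetical helper illustrating the abstract's idea, not the paper's code.
    """
    # First-order least-squares fit y ~ X @ w + b (intercept via a column of ones)
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    w = coef[:-1]            # estimated local gradient = split direction
    t = np.median(X @ w)     # simple threshold choice for this sketch
    return w, t

# Demo: a noisy linear trend in 3 inputs; the fitted direction should
# roughly recover the true gradient (2, -1, 0).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.standard_normal(200)
w, t = ls_split(X, y)
left = X @ w <= t
print(w.round(2), int(left.sum()))
```

Because the direction comes from a single linear model, the resulting split remains interpretable: each input's coefficient states how strongly it contributes to the split boundary.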
Acknowledgements
This work was supported by the EFRE-NRW funding programme "Forschungsinfrastrukturen" (grant no. 34.EFRE-0300180).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Schöne, M., Kohlhase, M. (2020). Least Squares Approach for Multivariate Split Selection in Regression Trees. In: Analide, C., Novais, P., Camacho, D., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2020. Lecture Notes in Computer Science, vol 12489. Springer, Cham. https://doi.org/10.1007/978-3-030-62362-3_5
DOI: https://doi.org/10.1007/978-3-030-62362-3_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62361-6
Online ISBN: 978-3-030-62362-3
eBook Packages: Computer Science (R0)