Abstract
A model tree is a decision tree in which a specified model, such as a linear regression or a naive Bayes classifier, is built at some of the leaf nodes. Compared with a typical decision tree, in which every leaf node is assigned a class label, a model tree offers several advantages: the flexibility to handle mixed attribute types, a simpler tree structure, and good potential for processing big data. This paper investigates a model tree in which the extreme learning machine (ELM) model is applied at some leaf nodes of the tree, and compares the two fundamental strategies for generating model trees, prepruning and postpruning, in terms of training complexity and generalization ability. The experimental results and algorithmic analysis show that, for the ELM model tree, postpruning achieves better performance than prepruning, even though prepruning has long been regarded as one of the most popular strategies for generating decision trees.
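For readers unfamiliar with the leaf models discussed above, the following minimal sketch illustrates the core idea of an extreme learning machine as it would be fit at a model-tree leaf: hidden-layer weights are drawn at random and never trained, and only the output weights are solved in closed form by least squares. This is our illustrative Python sketch of the standard ELM formulation, not the authors' implementation; the function names (fit_elm, predict_elm), the single hidden layer, and the tanh activation are assumptions for exposition.

```python
import numpy as np

def fit_elm(X, y, n_hidden=50, rng=None):
    """Minimal ELM sketch: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    # Hidden-layer weights and biases are random and left untrained (the ELM idea).
    W = rng.standard_normal((d, n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)            # hidden-layer output matrix, shape (n, n_hidden)
    # Output weights via the Moore-Penrose pseudo-inverse: beta = H^+ y.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy usage on data that would, hypothetically, be routed to one leaf.
X = np.random.default_rng(0).standard_normal((200, 3))
y = X.sum(axis=1)
W, b, beta = fit_elm(X, y, n_hidden=40, rng=0)
print(np.abs(predict_elm(X, W, b, beta) - y).mean())  # small training error
```

In a model tree, such a model is fit only on the samples that reach a given leaf. The two strategies compared in the paper differ in when this happens: prepruning decides during tree growth whether a leaf model should stop further splitting, whereas postpruning first grows the full tree and then replaces subtrees with leaf models where doing so improves estimated performance.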




Acknowledgements
We would like to express our gratitude to all those who helped us during the writing of this paper. We gratefully acknowledge the help of our supervisor, Prof. XiZhao Wang, who offered valuable suggestions for revising and improving this paper. This work was supported in part by the National Natural Science Foundation of China (Grants 61772344 and 61732011) and in part by the Natural Science Foundation of SZU (Grants 827-000140, 827-000230, and 2017060).
Cite this article
Zhou, X., Yan, D. Model tree pruning. Int. J. Mach. Learn. & Cyber. 10, 3431–3444 (2019). https://doi.org/10.1007/s13042-019-00930-9