A Competition Strategy to Cost-Sensitive Decision Trees

  • Conference paper
Rough Sets and Knowledge Technology (RSKT 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7414)

Abstract

Learning from data with both test costs and misclassification costs has been a hot topic in data mining, and many algorithms have been proposed to induce decision trees for this purpose. This paper studies a number of such algorithms and presents a competition strategy to obtain trees with lower cost. First, we generate a population of decision trees using the λ-ID3 and EG2 algorithms, which consider both information gain and test cost; λ-ID3 generalizes three existing algorithms, namely ID3, IDX, and CS-ID3, while EG2 is another parameterized algorithm whose parameter range is extended in this work. Second, we post-prune these trees by considering the tradeoff between test cost and misclassification cost. Finally, we select the best decision tree for classification. Experimental results on the mushroom dataset under various cost settings indicate that: 1) no single parameter setting is optimal for λ-ID3 or EG2; 2) the competition strategy is effective in selecting an appropriate decision tree; and 3) post-pruning effectively decreases the average cost.
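The strategy can be pictured with a small sketch. The snippet below is a minimal, illustrative Python example under stated assumptions, not the authors' implementation: it evaluates EG2's attribute-selection heuristic (Núñez's Information Cost Function) over a grid of ω values, and then runs a competition step that picks the candidate tree with the lowest average cost. All attribute statistics, candidate-tree summaries, and cost figures are made-up toy values, and λ-ID3's exact selection formula is defined in the paper and not reproduced here.

```python
# Illustrative sketch of the ideas in the abstract; toy numbers throughout.

def eg2_icf(gain, test_cost, omega):
    """EG2's Information Cost Function: (2^gain - 1) / (test_cost + 1)^omega."""
    return (2.0 ** gain - 1.0) / ((test_cost + 1.0) ** omega)

# Toy per-attribute statistics: (information gain, test cost).
attributes = {"odor": (0.90, 20.0), "cap-color": (0.25, 1.0), "habitat": (0.40, 2.0)}

# With these toy numbers, small omega prefers the informative but expensive
# "odor", while omega = 1.0 switches to the cheaper "habitat" -- different
# parameter values can select different split attributes.
for omega in (0.0, 0.5, 1.0):
    best = max(attributes, key=lambda a: eg2_icf(attributes[a][0],
                                                 attributes[a][1], omega))
    print(f"omega={omega:.1f} -> split on {best}")

# Competition step: each post-pruned candidate tree is summarized here by the
# average test cost it incurs per object and its misclassification rate
# (a scalar misclassification cost is a simplification of a cost matrix).
def average_cost(tree_summary, misclassification_cost):
    test_cost_per_object, error_rate = tree_summary
    return test_cost_per_object + error_rate * misclassification_cost

candidates = {                      # hypothetical candidate-tree summaries
    "lambda-ID3, lambda=1": (6.0, 0.02),
    "EG2, omega=0.5":       (4.0, 0.05),
    "EG2, omega=2.0":       (2.5, 0.12),
}
mc = 50.0                           # misclassification cost for this setting
winner = min(candidates, key=lambda k: average_cost(candidates[k], mc))
print(f"selected tree: {winner}, "
      f"average cost = {average_cost(candidates[winner], mc):.2f}")
```

Because the winner depends on the cost setting (raising `mc` favors the more accurate but more expensive trees), the selection is rerun per cost setting rather than fixing one parameter in advance, which is the point of the competition strategy.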

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Min, F., Zhu, W. (2012). A Competition Strategy to Cost-Sensitive Decision Trees. In: Li, T., et al. (eds.) Rough Sets and Knowledge Technology. RSKT 2012. Lecture Notes in Computer Science, vol. 7414. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31900-6_45

  • DOI: https://doi.org/10.1007/978-3-642-31900-6_45

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31899-3

  • Online ISBN: 978-3-642-31900-6

  • eBook Packages: Computer Science, Computer Science (R0)
