Abstract
The past decade has seen significant interest in the problem of inducing decision trees that take account of both the costs of misclassification and the costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including direct adaptations of accuracy-based methods as well as approaches based on genetic algorithms, anytime methods, and boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a useful taxonomy and a historical timeline of how the field has developed, and should serve as a reference point for future research in this field.
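To make the core idea concrete, here is a minimal illustrative sketch (not taken from the survey; the function name and cost values are invented) of cost-sensitive prediction as formalised in Elkan's "The foundations of cost-sensitive learning" (cited below): instead of predicting the most probable class, predict the class with minimum expected misclassification cost.

```python
def min_expected_cost_class(probs, cost_matrix):
    """probs[j]: estimated probability (e.g. from a decision tree leaf)
    that the true class is j.
    cost_matrix[i][j]: cost of predicting class i when the truth is j.
    Returns the class index with minimum expected cost."""
    n = len(cost_matrix)
    expected = [sum(cost_matrix[i][j] * probs[j] for j in range(n))
                for i in range(n)]
    return min(range(n), key=expected.__getitem__)

# Hypothetical two-class example: class 1 ("sick") is less probable
# but missing it is ten times as costly as a false alarm.
probs = [0.8, 0.2]        # P(healthy), P(sick)
costs = [[0, 50],         # predict healthy: costs 50 if truly sick
         [5, 0]]          # predict sick: costs 5 if truly healthy
print(min_expected_cost_class(probs, costs))  # prints 1, not the modal class 0
```

Expected costs are 0.2 * 50 = 10 for predicting class 0 and 0.8 * 5 = 4 for predicting class 1, so the cost-sensitive decision flips away from the most probable class; much of the surveyed literature is about building the tree itself (splits, pruning, test ordering) with such costs in mind rather than only adjusting the final decision.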
- Abe, N., Zadrozny, B., and Langford, J. 2004. An iterative method for multi-class cost-sensitive learning. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '04). W. Kim, R. Kohavi, J. Gehrke, and W. DuMouchel, Eds., 3.
- Afifi, A. A. and Clark, V. 1996. Computer-Aided Multivariate Analysis, 3rd Ed. Chapman & Hall, London.
- Bauer, E. and Kohavi, R. 1999. An empirical comparison of voting classification algorithms: Bagging, boosting and variants. Mach. Learn. 36, 1-2, 105--139.
- Bradford, J. P., Kunz, C., Kohavi, R., Brunk, C., and Brodley, C. E. 1998a. Pruning decision trees with misclassification costs. In Proceedings of the 10th European Conference on Machine Learning (ECML '98). 131--136.
- Bradford, J. P., Kunz, C., Kohavi, R., Brunk, C., and Brodley, C. E. 1998b. Pruning decision trees with misclassification costs. http://robotics.stanford.edu/~ronnyk/prune-long.ps.gz
- Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J. 1984. Classification and Regression Trees. Chapman and Hall/CRC, London.
- Breiman, L. 1996. Bagging predictors. Mach. Learn. 24, 2, 123--140.
- Davis, J. V., Ha, J., and Rossbach, C. J. 2006. Cost-sensitive decision tree learning for forensic classification. In Proceedings of the 17th European Conference on Machine Learning (ECML). Lecture Notes in Computer Science, vol. 4212, Springer, 622--629.
- Domingos, P. 1999. MetaCost: A general method for making classifiers cost-sensitive. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 155--164.
- Dong, M. and Kothari, R. 2001. Look-ahead based fuzzy decision tree induction. IEEE Trans. Fuzzy Syst. 9, 3, 461--468.
- Draper, B. A., Brodley, C. E., and Utgoff, P. E. 1994. Goal-directed classification using linear machine decision trees. IEEE Trans. Pattern Anal. Mach. Intell. 16, 9, 888--893.
- Elkan, C. 2001. The foundations of cost-sensitive learning. In Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI '01). Vol. 2, Morgan Kaufmann, 973--978.
- Esmeir, S. and Markovitch, S. 2004. Lookahead-based algorithms for anytime induction of decision trees. In Proceedings of the 21st International Conference on Machine Learning (ICML '04). C. E. Brodley, Ed., 257--264.
- Esmeir, S. and Markovitch, S. 2007. Anytime induction of cost-sensitive trees. In Proceedings of the 21st Annual Conference on Neural Information Processing Systems (NIPS '07). 1--8.
- Esmeir, S. and Markovitch, S. 2008. Anytime induction of low-cost, low-error classifiers: A sampling-based approach. J. Artif. Intell. Res. 33, 1--31.
- Esmeir, S. and Markovitch, S. 2010. Anytime algorithms for learning resource-bounded classifiers. In Proceedings of the Budgeted Learning Workshop (ICML '10).
- Esmeir, S. and Markovitch, S. 2011. Anytime learning of anycost classifiers. Mach. Learn. 82, 3, 445--473.
- Estruch, V., Ferri, C., Hernández-Orallo, J., and Ramírez-Quintana, M. J. 2002. Re-designing cost-sensitive decision tree learning. In Workshop de Minería de Datos y Aprendizaje. 33--42.
- Fan, W., Stolfo, S. J., Zhang, J., and Chan, P. K. 1999. AdaCost: Misclassification cost-sensitive boosting. In Proceedings of the 16th International Conference on Machine Learning. 97--105.
- Ferri, C., Flach, P., and Hernández-Orallo, J. 2002. Learning decision trees using the area under the ROC curve. In Proceedings of the 19th International Conference on Machine Learning (ICML '02). 139--146.
- Ferri-Ramírez, C., Hernández, J., and Ramírez, M. J. 2002. Induction of decision multi-trees using Levin search. In Proceedings of the International Conference on Computational Science (ICCS '02). Lecture Notes in Computer Science, vol. 2329, Springer, 166--175.
- Frank, E. and Witten, I. 1998. Reduced-error pruning with significance tests. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.2272
- Frean, M. 1990. Small nets and short paths: Optimizing neural computation. Doctoral thesis, Centre for Cognitive Science, University of Edinburgh.
- Freitas, A., Costa-Pereira, A., and Brazdil, P. 2007. Cost-sensitive decision trees applied to medical data. In Proceedings of the 9th International Conference on Data Warehousing and Knowledge Discovery. Lecture Notes in Computer Science, vol. 4654, Springer, 303--312.
- Freund, Y. and Schapire, R. E. 1996. Experiments with a new boosting algorithm. In Proceedings of the 13th International Conference on Machine Learning (ICML '96). 148--156.
- Freund, Y. and Schapire, R. E. 1997. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55, 1, 119--139.
- Grefenstette, J. J. 1990. A User's Guide to GENESIS v5.0. Naval Research Laboratory, Washington, DC.
- Greiner, R., Grove, A. J., and Roth, D. 2002. Learning cost-sensitive active classifiers. Artif. Intell. 139, 2, 137--174.
- Hart, A. E. 1985. Experience in the use of an inductive system in knowledge engineering. In Research and Development in Expert Systems, M. A. Bramer, Ed., Cambridge University Press.
- Hunt, E. B., Marin, J., and Stone, P. J. 1966. Experiments in Induction. Academic Press, New York.
- Knoll, U., Nakhaeizadeh, G., and Tausend, B. 1994. Cost-sensitive pruning of decision trees. In Proceedings of the 8th European Conference on Machine Learning (ECML '94). 383--386.
- Kretowski, M. and Grzes, M. 2007. Evolutionary induction of decision trees for misclassification cost minimization. In Proceedings of the 8th International Conference on Adaptive and Natural Computing Algorithms (ICANNGA). Lecture Notes in Computer Science, vol. 4431, Springer, 1--10.
- Li, J., Li, X., and Yao, X. 2005. Cost-sensitive classification with genetic programming. In Proceedings of the IEEE Congress on Evolutionary Computation. 2114--2121.
- Lin, F. Y. and McClean, S. 2000. The prediction of financial distress using a cost-sensitive approach and prior probabilities. In Proceedings of the 17th International Conference on Machine Learning (ICML '00).
- Ling, C. X., Yang, Q., Wang, J., and Zhang, S. 2004. Decision trees with minimal costs. In Proceedings of the International Conference on Machine Learning (ICML '04). ACM Press, New York.
- Ling, C. X., Sheng, V. S., Bruckhaus, T., and Madhavji, N. H. 2006a. Maximum profit mining and its application in software development. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '06). 929.
- Ling, C., Sheng, V., and Yang, Q. 2006b. Test strategies for cost-sensitive decision trees. IEEE Trans. Knowl. Data Engin. 18, 8, 1055--1067.
- Liu, X. 2007. A new cost-sensitive decision tree with missing values. Asian J. Inf. Technol. 6, 11, 1083--1090.
- Lomax, S. and Vadera, S. 2011. An empirical comparison of cost-sensitive decision tree induction algorithms. Expert Syst. J. Knowl. Engin. 28, 3, 227--268.
- Lozano, A. C. and Abe, N. 2008. Multi-class cost-sensitive boosting with p-norm loss functions. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '08). 506.
- Margineantu, D. and Dietterich, T. 2003. A wrapper method for cost-sensitive learning via stratification. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.27.1102
- Margineantu, D. 2001. Methods for cost-sensitive learning. Doctoral thesis, Oregon State University.
- Merler, S., Furlanello, C., Larcher, B., and Sboner, A. 2003. Automatic model selection in cost-sensitive boosting. Inf. Fusion 4, 1, 3--10.
- Mease, D., Wyner, A. J., and Buja, A. 2007. Boosted classification trees and class probability/quantile estimation. J. Mach. Learn. Res. 8, 409--439.
- Meir, R. and Rätsch, G. 2003. An introduction to boosting and leveraging. In Advanced Lectures on Machine Learning, S. Mendelson and A. Smola, Eds., Springer, 119--184.
- Michalewicz, Z. 1996. Genetic Algorithms + Data Structures = Evolution Programs, 3rd Ed. Springer.
- Mingers, J. 1989. An empirical comparison of pruning methods for decision tree induction. Mach. Learn. 4, 227--243.
- Moret, S., Langford, W., and Margineantu, D. 2006. Learning to predict channel stability using biogeomorphic features. Ecol. Model. 191, 1, 47--57.
- Morrison, D. 1976. Multivariate Statistical Methods, 2nd Ed. McGraw-Hill, New York.
- Murthy, S., Kasif, S., and Salzberg, S. 1994. A system for induction of oblique decision trees. J. Artif. Intell. Res. 2, 1--32.
- Murthy, S. and Salzberg, S. 1995. Lookahead and pathology in decision tree induction. In Proceedings of the 14th International Joint Conference on Artificial Intelligence. 1025--1033.
- Ni, A., Zhang, S., Yang, S., and Zhu, X. 2005. Learning classification rules under multiple costs. Asian J. Inf. Technol. 4, 1080--1085.
- Nilsson, N. J. 1965. Learning Machines. McGraw-Hill, New York.
- Norton, S. W. 1989. Generating better decision trees. In Proceedings of the 11th International Joint Conference on Artificial Intelligence (IJCAI '89). 800--805.
- Núñez, M. 1991. The use of background knowledge in decision tree induction. Mach. Learn. 6, 231--250.
- Omielan, A. 2005. Evaluation of a cost-sensitive genetic classifier. MPhil thesis, University of Salford.
- Pazzani, M., Merz, C., Murphy, P., Ali, K., Hume, T., and Brunk, C. 1994. Reducing misclassification costs. In Proceedings of the 11th International Conference on Machine Learning. 217--225.
- Qin, Z., Zhang, S., and Zhang, C. 2004. Cost-sensitive decision trees with multiple cost scales. In Proceedings of the 17th Australian Joint Conference on Artificial Intelligence. G. I. Webb and X. Yu, Eds., Lecture Notes in Artificial Intelligence, vol. 3339, Springer, 380--390.
- Quinlan, J. R. 1979. Discovering rules by induction from large collections of examples. In Expert Systems in the Micro Electronic Age, D. Michie, Ed., Edinburgh University Press, 168--201.
- Quinlan, J. R. 1983. Learning efficient classification procedures and their application to chess end games. In Machine Learning: An Artificial Intelligence Approach, R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, Eds., Tioga Publishing Company, Palo Alto, CA.
- Quinlan, J. R. 1986. Induction of decision trees. Mach. Learn. 1, 81--106.
- Quinlan, J. R. 1987. Simplifying decision trees. Int. J. Man-Mach. Studies 27, 221--234.
- Quinlan, J. R. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA.
- Quinlan, J. R., Compton, P. J., Horn, K. A., and Lazarus, L. 1987. Inductive knowledge acquisition: A case study. In Applications of Expert Systems, J. R. Quinlan, Ed., Turing Institute Press/Addison-Wesley, 137--156.
- Rissanen, J. 1978. Modelling by shortest data description. Automatica 14, 465--471.
- Schapire, R. E. 1999. A brief introduction to boosting. In Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI '99). Vol. 2, 1401--1406.
- Schapire, R. E. and Singer, Y. 1999. Improved boosting algorithms using confidence-rated predictions. Mach. Learn. 37, 3, 297--336.
- Shannon, C. E. 1948. The mathematical theory of communication. Bell Syst. Tech. J. 27, 379--423.
- Sheng, S. and Ling, C. 2005. Hybrid cost-sensitive decision tree. In Proceedings of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases. Lecture Notes in Computer Science, vol. 3721, Springer, 274--284.
- Sheng, S., Ling, C., and Yang, Q. 2005. Simple test strategies for cost-sensitive decision trees. In Proceedings of the 16th European Conference on Machine Learning (ECML '05). Lecture Notes in Computer Science, vol. 3720, Springer, 365--376.
- Swets, J., Dawes, R., and Monahan, J. 2000. Better decisions through science. Sci. Amer. 283, 4, 82--87.
- Tan, M. 1993. Cost-sensitive learning of classification knowledge and its applications in robotics. Mach. Learn. 13, 7--33.
- Tan, M. and Schlimmer, J. 1989. Cost-sensitive concept learning of sensor use in approach and recognition. In Proceedings of the 6th International Workshop on Machine Learning (ML '89). 392--395.
- Ting, K. and Zheng, Z. 1998a. Boosting cost-sensitive trees. In Proceedings of the 1st International Conference on Discovery Science. Lecture Notes in Computer Science, vol. 1532, Springer, 244--255.
- Ting, K. M. and Zheng, Z. 1998b. Boosting trees for cost-sensitive classifications. In Proceedings of the 10th European Conference on Machine Learning. Springer, 190--195.
- Ting, K. M. 1998. Inducing cost-sensitive decision trees via instance weighting. In Proceedings of the 2nd European Symposium on Principles of Data Mining and Knowledge Discovery. Springer, 139--147.
- Ting, K. 2000a. An empirical study of MetaCost using boosting algorithms. In Proceedings of the 11th European Conference on Machine Learning. Lecture Notes in Computer Science, vol. 1810, Springer, 413--425.
- Ting, K. 2000b. A comparative study of cost-sensitive boosting algorithms. In Proceedings of the 17th International Conference on Machine Learning. 983--990.
- Ting, K. M. 2002. An instance-weighting method to induce cost-sensitive decision trees. IEEE Trans. Knowl. Data Engin. 14, 3, 659--665.
- Turney, P. D. 1995. Cost-sensitive classification: Empirical evaluation of a hybrid genetic decision tree induction algorithm. J. Artif. Intell. Res. 2, 369--409.
- Vadera, S. 2005a. Inducing cost-sensitive non-linear decision trees. Tech. rep.
- Vadera, S. 2005b. Inducing safer oblique trees without costs. Expert Syst.: Int. J. Knowl. Engin. Neural Netw. 22, 4, 206--221.
- Vadera, S. 2010. CSNL: A cost-sensitive non-linear decision tree algorithm. ACM Trans. Knowl. Discov. Data 4, 2, 1--25.
- von Neumann, J. 1951. Various techniques used in connection with random digits. In Monte Carlo Methods. Nat. Bureau Standards 12, 36--38.
- Winston, P. H. 1993. Artificial Intelligence, 3rd Ed. Addison-Wesley.
- Zadrozny, B., Langford, J., and Abe, N. 2003a. A simple method for cost-sensitive learning. Tech. rep. RC22666. http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.7.7947
- Zadrozny, B., Langford, J., and Abe, N. 2003b. Cost-sensitive learning by cost-proportionate example weighting. In Proceedings of the 3rd IEEE International Conference on Data Mining. 435.
- Zhang, S., Qin, Z., Ling, C., and Sheng, S. 2005. Missing is useful: Missing values in cost-sensitive decision trees. IEEE Trans. Knowl. Data Engin. 17, 12, 1689--1693.
- Zhang, S., Zhu, X., Zhang, J., and Zhang, C. 2007. Cost-time sensitive decision tree with missing values. In Knowledge Science, Engineering and Management. Lecture Notes in Computer Science, vol. 4798, Springer, 447--459.
- Zhang, S. 2010. Cost-sensitive classification with respect to waiting cost. Knowl. Based Syst. 23, 5, 369--378.