Abstract
Two important factors that affect a classification model’s performance are imbalanced data and unequal misclassification costs. These considerations are especially important for neural network models developed to estimate the posterior probabilities of group membership used in classification decisions. This paper explores the effects of asymmetric misclassification costs and unbalanced group sizes on neural network classification performance using an artificial data approach that can generate more complex datasets than those used in prior studies and that adds new insights into the problem and the results. A different performance measure, one that directly measures the consistency of classification performance with the Bayes decision rule, is used. The results show that asymmetric misclassification costs and imbalanced group sizes both have significant effects on neural network classification performance, independently and through interaction effects. These effects are not always intuitive; they supplement prior findings and raise issues for future research.
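To make the role of misclassification costs in the Bayes decision rule concrete, the following is a minimal sketch of the standard two-group minimum-expected-cost rule applied to an estimated posterior probability. It is not taken from the paper: the function names `bayes_decision` and `decision_threshold`, the parameter names, and the cost values are all illustrative assumptions, and the performance measure used in the paper may be defined differently.

```python
def bayes_decision(posterior_1, cost_12=1.0, cost_21=1.0):
    """Pick the group (1 or 2) with the lower expected misclassification cost.

    posterior_1: estimated P(group 1 | x), e.g. a neural network output.
    cost_12: cost of assigning a true group-1 case to group 2 (assumed value).
    cost_21: cost of assigning a true group-2 case to group 1 (assumed value).
    """
    posterior_2 = 1.0 - posterior_1
    # Assigning to group 1 is wrong only when the case belongs to group 2.
    expected_cost_assign_1 = cost_21 * posterior_2
    # Assigning to group 2 is wrong only when the case belongs to group 1.
    expected_cost_assign_2 = cost_12 * posterior_1
    return 1 if expected_cost_assign_1 <= expected_cost_assign_2 else 2


def decision_threshold(cost_12, cost_21):
    """Posterior P(group 1 | x) at or above which group 1 is chosen."""
    return cost_21 / (cost_12 + cost_21)


if __name__ == "__main__":
    p = 0.2  # hypothetical posterior estimate for a rare group 1
    print(bayes_decision(p))                             # equal costs
    print(bayes_decision(p, cost_12=9.0, cost_21=1.0))   # missing group 1 is costly
    print(decision_threshold(9.0, 1.0))
```

With equal costs the rule reduces to choosing the group with the larger posterior (a 0.5 threshold). Raising the cost of missing a group-1 case lowers the threshold, so a case with a small group-1 posterior can still be assigned to group 1.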
Cite this article
Lan, Js., Berardi, V.L., Patuwo, B.E. et al. A joint investigation of misclassification treatments and imbalanced datasets on neural network performance. Neural Comput & Applic 18, 689–706 (2009). https://doi.org/10.1007/s00521-009-0239-1
DOI: https://doi.org/10.1007/s00521-009-0239-1