Evolving Neural Networks with Maximum AUC for Imbalanced Data Classification

  • Conference paper
Hybrid Artificial Intelligence Systems (HAIS 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6076)

Abstract

Real-world classification problems usually involve imbalanced data sets. In such cases, high overall classification accuracy does not necessarily imply good classification performance on every class. The Area Under the ROC Curve (AUC) has been recognized as a more appropriate performance indicator in such cases, and a number of methods have been developed to design classifiers with maximum AUC. In the context of Neural Networks (NNs), however, it is usually an approximation of AUC rather than the exact AUC that is maximized, because AUC is non-differentiable and cannot be directly maximized by gradient-based methods. In this paper, we propose to use evolutionary algorithms to train NNs with maximum AUC. The proposed method employs the exact AUC as the objective function. An evolutionary algorithm, namely the Self-adaptive Differential Evolution with Neighborhood Search (SaNSDE) algorithm, is used to optimize the weights of NNs with respect to AUC. Empirical studies on 19 binary and multi-class imbalanced data sets show that the proposed evolutionary AUC maximization (EAM) method can train NNs with larger AUC than existing methods.
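The idea in the abstract can be illustrated with a minimal Python sketch for the binary case. It is not the authors' implementation: a plain differential evolution scheme (DE/rand/1/bin) stands in for SaNSDE, and the network size, the toy data set, and all names and parameter values (`nn_score`, `evolve`, `F`, `CR`, `make_data`) are assumptions chosen for illustration. The exact AUC is computed as the Wilcoxon-Mann-Whitney statistic, which is a step function of the weights and therefore suits a gradient-free search.

```python
import math
import random


def auc(scores, labels):
    """Exact AUC as the Wilcoxon-Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def nn_score(w, x, hidden=3):
    """One-hidden-layer tanh network; w is a flat weight vector (assumed layout)."""
    d = len(x)
    out = w[-1]  # output bias is the last entry
    for h in range(hidden):
        base = h * (d + 1)
        z = w[base + d] + sum(w[base + i] * x[i] for i in range(d))
        out += w[hidden * (d + 1) + h] * math.tanh(z)
    return out


def fitness(w, X, y):
    """Objective: exact AUC of the network scores on the data."""
    return auc([nn_score(w, x) for x in X], y)


def evolve(X, y, n_weights, pop_size=20, gens=40, F=0.5, CR=0.9, seed=0):
    """Plain DE/rand/1/bin (a stand-in for SaNSDE) maximizing AUC."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_weights)]
           for _ in range(pop_size)]
    fit = [fitness(w, X, y) for w in pop]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_weights)  # force at least one mutated gene
            trial = [pop[a][k] + F * (pop[b][k] - pop[c][k])
                     if (rng.random() < CR or k == j_rand) else pop[i][k]
                     for k in range(n_weights)]
            f = fitness(trial, X, y)
            if f >= fit[i]:  # greedy one-to-one replacement
                pop[i], fit[i] = trial, f
    best = max(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


def make_data(seed=1):
    """Toy imbalanced set (assumption): 90 negatives, 10 positives in 2-D."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(90)]
    X += [[rng.gauss(1.5, 1), rng.gauss(1.5, 1)] for _ in range(10)]
    return X, [0] * 90 + [1] * 10


if __name__ == "__main__":
    X, y = make_data()
    # 3 hidden units * (2 inputs + 1 bias) + 3 output weights + 1 output bias
    weights, best_auc = evolve(X, y, n_weights=13)
    print("training AUC of best network:", best_auc)
```

Because selection only compares fitness values, the search never needs a gradient of the AUC, which is why no smooth surrogate appears in this sketch. The reported value is a training AUC on the toy data and says nothing about the results in the paper.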

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lu, X., Tang, K., Yao, X. (2010). Evolving Neural Networks with Maximum AUC for Imbalanced Data Classification. In: Graña Romay, M., Corchado, E., Garcia Sebastian, M.T. (eds) Hybrid Artificial Intelligence Systems. HAIS 2010. Lecture Notes in Computer Science, vol 6076. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13769-3_41

  • DOI: https://doi.org/10.1007/978-3-642-13769-3_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13768-6

  • Online ISBN: 978-3-642-13769-3

  • eBook Packages: Computer Science, Computer Science (R0)
