On the Impact of Imbalanced Data in Convolutional Neural Networks Performance

Pulgar, Francisco J.; Rivera, Antonio J.; Charte, Francisco; del Jesus, María J.

doi:10.1007/978-3-319-59650-1_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10334)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

In recent years, new proposals based on Deep Learning (DL) techniques have emerged for tackling the classification problem. These proposals have shown good results in certain fields, such as image recognition. However, the factors that influence the results obtained by these new algorithms still need to be analyzed. This paper analyzes the classification of imbalanced data with convolutional neural networks (CNNs). To do this, a series of tests is performed in which CNNs classify real images of traffic signs, using data with different imbalance levels.
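As a rough illustration of what "different imbalance levels" can mean in practice, the sketch below measures the imbalance ratio of a labelled set (size of the largest class divided by the size of the smallest, a common convention) and subsamples one class to reach a chosen ratio. It is only a sketch under stated assumptions: the function names `imbalance_ratio` and `subsample_to_ratio`, the class labels, and the subsampling strategy are hypothetical and are not taken from the paper, whose actual experimental procedure is not shown in this preview.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Return (size of largest class) / (size of smallest class)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def subsample_to_ratio(samples, labels, minority_label, target_ratio, seed=0):
    """Randomly drop examples of `minority_label` until the largest other
    class is roughly `target_ratio` times bigger than it.

    Hypothetical helper for illustration; not the authors' method.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    # Size of the largest class other than the one being reduced.
    majority_size = max(n for lab, n in counts.items() if lab != minority_label)
    # Number of minority examples to keep (at least 1, at most all of them).
    keep = max(1, min(counts[minority_label], round(majority_size / target_ratio)))
    minority_idx = [i for i, lab in enumerate(labels) if lab == minority_label]
    kept_idx = set(rng.sample(minority_idx, keep))
    new_samples, new_labels = [], []
    for i, (sample, lab) in enumerate(zip(samples, labels)):
        if lab != minority_label or i in kept_idx:
            new_samples.append(sample)
            new_labels.append(lab)
    return new_samples, new_labels
```

For example, starting from two classes of 100 images each and calling `subsample_to_ratio(samples, labels, "yield", 10)` would leave 10 images of the hypothetical `"yield"` class, giving an imbalance ratio of 10. Repeating this for several target ratios gives a family of training sets on which the same CNN can be trained and compared.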

Notes

1. https://www.tensorflow.org/

The work of F. Pulgar was supported by the University of Jaén under the Action 15: Predoctoral aids for the encouragement of the doctorate. This work was partially supported by the Spanish Ministry of Science and Technology under project TIN2015-68454-R.

Author information

Authors and Affiliations

Department of Computer Science, University of Jaén, Jaén, Spain
Francisco J. Pulgar, Antonio J. Rivera, Francisco Charte & María J. del Jesus

Corresponding author

Correspondence to Francisco J. Pulgar.

Editor information

Editors and Affiliations

University of La Rioja, Logroño, La Rioja, Spain
Francisco Javier Martínez de Pisón
University of La Rioja, Logroño, La Rioja, Spain
Rubén Urraca
University of A Coruña, Ferrol, La Coruña, Spain
Héctor Quintián
University of Salamanca, Salamanca, Spain
Emilio Corchado

About this paper

Cite this paper

Pulgar, F.J., Rivera, A.J., Charte, F., del Jesus, M.J. (2017). On the Impact of Imbalanced Data in Convolutional Neural Networks Performance. In: Martínez de Pisón, F., Urraca, R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2017. Lecture Notes in Computer Science, vol 10334. Springer, Cham. https://doi.org/10.1007/978-3-319-59650-1_19

DOI: https://doi.org/10.1007/978-3-319-59650-1_19
Published: 02 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59649-5
Online ISBN: 978-3-319-59650-1
eBook Packages: Computer Science, Computer Science (R0)
