Fuzzy Data Augmentation for Handling Overlapped and Imbalanced Data

  • Conference paper
Neural Information Processing (ICONIP 2021)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1516))

Abstract

Class imbalance is a serious issue in classification because a traditional classifier is generally biased towards the majority class. Classifier accuracy can be degraded further when, in addition to the class imbalance, data instances from different classes overlap. Data sparsity has also been shown to be a possible issue that may lead to non-invariance and poor generalisation. Data augmentation is a technique that can address the generalisation issue and improve the regularisation of a Deep Neural Network (DNN). This paper proposes a method that handles both class overlap and class imbalance while also incorporating regularisation. The imbalanced dataset is first balanced using SMOTETomek, and the non-categorical attributes are then fuzzified. Fuzzifying the attributes serves to handle the overlap in the data and provides a form of data augmentation that can be used as a regularisation technique; invariance is thus achieved because the augmented data are generated based on the fuzzy concept. A DNN classifier is then trained on the balanced, augmented dataset. The datasets used in the experiments were selected from the UCI and KEEL data repositories. The experiments show that the proposed fuzzy data augmentation can address the overlapped and imbalanced data issues and provide regularisation through data augmentation for numerical data, improving the performance of a DNN classifier.
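
The fuzzification step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which membership functions or how many fuzzy sets are used, so everything below is an assumption made for illustration: three triangular fuzzy sets ("low", "medium", "high") spread evenly over the observed range of a numeric attribute, and the hypothetical helper names `triangular` and `fuzzify_column`. The balancing step (SMOTETomek) and the DNN are not shown.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a and c and peak b."""
    if x == b:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0


def fuzzify_column(values):
    """Replace each numeric value with (low, medium, high) membership degrees.

    Assumption: three evenly spaced triangular sets over [min, max];
    this is an illustrative choice, not taken from the paper.
    """
    lo, hi = min(values), max(values)
    mid = (lo + hi) / 2.0
    span = hi - lo
    rows = []
    for v in values:
        rows.append((
            triangular(v, lo - span / 2, lo, mid),   # "low" peaks at the minimum
            triangular(v, lo, mid, hi),              # "medium" peaks at the midpoint
            triangular(v, mid, hi, hi + span / 2),   # "high" peaks at the maximum
        ))
    return rows


if __name__ == "__main__":
    for value, degrees in zip([0.0, 2.5, 5.0, 7.5, 10.0],
                              fuzzify_column([0.0, 2.5, 5.0, 7.5, 10.0])):
        print(value, degrees)
```

Under these assumptions, a value lying between two peaks receives partial membership in both neighbouring sets, which is one way a crisp attribute can be expanded into softer, overlapping features before training.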

Author information

Corresponding author

Correspondence to Kok Wai Wong.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Dabare, R., Wong, K.W., Shiratuddin, M.F., Koutsakis, P. (2021). Fuzzy Data Augmentation for Handling Overlapped and Imbalanced Data. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1516. Springer, Cham. https://doi.org/10.1007/978-3-030-92307-5_73

  • DOI: https://doi.org/10.1007/978-3-030-92307-5_73

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92306-8

  • Online ISBN: 978-3-030-92307-5

  • eBook Packages: Computer Science, Computer Science (R0)
