Multi-class Imbalanced Data-Sets with Linguistic Fuzzy Rule Based Classification Systems Based on Pairwise Learning

  • Conference paper
Computational Intelligence for Knowledge-Based Systems Design (IPMU 2010)

Abstract

In a classification task, the class imbalance problem arises when the examples of a data-set are distributed very unevenly among its classes. The main handicap of this type of problem is that standard learning algorithms assume a balanced training set, which biases them towards the majority classes.

In order to identify the different classes of the problem correctly, we propose a methodology based on two steps. First, we use the one-vs-one binarization technique to decompose the original data-set into binary classification problems. Then, whenever one of these binary subproblems is imbalanced, we apply an oversampling step, using the SMOTE algorithm, to rebalance the data before the pairwise learning process.
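The two steps above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `smote` and `ovo_with_smote`, the imbalance-ratio threshold `ir_threshold`, the neighbour count `k`, and the toy data are all assumptions introduced here, and the oversampler is a simplified SMOTE-style interpolation written with the standard library only.

```python
import itertools
import random
from collections import Counter


def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (simplified SMOTE)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        if not others:  # a single sample has no neighbour: duplicate it
            synthetic.append(base)
            continue
        # Sort the remaining minority samples by squared distance to `base`.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def ovo_with_smote(X, y, ir_threshold=1.5, k=5, seed=0):
    """Split (X, y) into one binary problem per pair of classes and
    oversample the smaller class of every pair judged imbalanced.
    ir_threshold is an assumed cut-off, not a value from the paper."""
    rng = random.Random(seed)
    problems = {}
    for a, b in itertools.combinations(sorted(set(y)), 2):
        Xa = [x for x, c in zip(X, y) if c == a]
        Xb = [x for x, c in zip(X, y) if c == b]
        # Order the two classes so that the larger one comes first.
        (maj_c, maj), (min_c, mino) = sorted(
            [(a, Xa), (b, Xb)], key=lambda t: len(t[1]), reverse=True
        )
        if len(maj) / len(mino) > ir_threshold:
            mino = mino + smote(mino, len(maj) - len(mino), k, rng)
        labels = [maj_c] * len(maj) + [min_c] * len(mino)
        problems[(a, b)] = (maj + mino, labels)
    return problems


# Toy three-class data-set: 6, 2 and 3 examples of classes 0, 1 and 2.
X = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.1), (0.2, 0.9), (0.8, 0.7), (0.4, 0.4),
     (5.0, 5.0), (6.0, 5.5),
     (9.0, 1.0), (9.5, 1.5), (10.0, 0.5)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 2]

problems = ovo_with_smote(X, y)
for pair, (Xp, yp) in problems.items():
    print(pair, dict(Counter(yp)))
```

Each entry of `problems` would then be passed to a separate binary classifier, and the pairwise outputs combined into a final multi-class decision. Rebalancing inside each pair, rather than once over the whole data-set, means the amount of oversampling depends only on the two classes being compared.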

For our experimental study we take a linguistic Fuzzy Rule Based Classification System as the base algorithm. We aim to show not only the improvement in performance achieved with our methodology over the basic approach, but also the good synergy between the pairwise learning proposal and the selected oversampling technique.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fernández, A., del Jesus, M.J., Herrera, F. (2010). Multi-class Imbalanced Data-Sets with Linguistic Fuzzy Rule Based Classification Systems Based on Pairwise Learning. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds) Computational Intelligence for Knowledge-Based Systems Design. IPMU 2010. Lecture Notes in Computer Science, vol 6178. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14049-5_10

  • DOI: https://doi.org/10.1007/978-3-642-14049-5_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14048-8

  • Online ISBN: 978-3-642-14049-5

  • eBook Packages: Computer Science (R0)
