FLSOM with Different Rates for Classification in Imbalanced Datasets

  • Conference paper
Artificial Neural Networks - ICANN 2008 (ICANN 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5163)

Abstract

Several successful approaches exist for dealing with imbalanced datasets. In this paper, the Fuzzy Labeled Self-Organizing Map (FLSOM) is extended to work with this type of data. The proposed approach assigns one of two learning-rate values to each data vector according to its class membership. The technique is tested on several datasets and compared with other approaches. The results suggest that FLSOM with different rates is a suitable tool for such data and that it helps in understanding and visualizing data characteristics such as overlapping clusters.
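The idea of a class-dependent learning rate can be illustrated with a minimal sketch. This is not the authors' FLSOM algorithm: it is a plain SOM-style prototype update on a one-dimensional map, and it omits the fuzzy-label term that FLSOM uses. The function names, the rate values (`rate_minority`, `rate_majority`), the Gaussian neighborhood, and the choice to give the minority class the larger rate are all assumptions made for illustration only.

```python
import math


def bmu_index(prototypes, x):
    """Return the index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(prototypes[i], x)),
    )


def update(prototypes, x, is_minority,
           rate_minority=0.5, rate_majority=0.1, sigma=1.0):
    """One SOM-style update on a 1-D map with a class-dependent learning rate.

    Assumption: the learning rate is picked from two fixed values according
    to the class of x. The specific values here are hypothetical.
    """
    rate = rate_minority if is_minority else rate_majority
    winner = bmu_index(prototypes, x)
    updated = []
    for i, w in enumerate(prototypes):
        # Gaussian neighborhood over the map index (an illustrative choice).
        h = math.exp(-((i - winner) ** 2) / (2.0 * sigma ** 2))
        updated.append([wi + rate * h * (xi - wi) for wi, xi in zip(w, x)])
    return updated


prototypes = [[0.0, 0.0], [1.0, 1.0]]
sample = [0.2, 0.0]
moved_minority = update(prototypes, sample, is_minority=True)
moved_majority = update(prototypes, sample, is_minority=False)
```

Under these assumptions, the same sample pulls the winning prototype further when it is treated as a minority-class vector than when it is treated as a majority-class vector, which is the kind of asymmetry a two-rate scheme introduces.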

Editor information

Véra Kůrková, Roman Neruda, Jan Koutník

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Machón-González, I., López-García, H. (2008). FLSOM with Different Rates for Classification in Imbalanced Datasets. In: Kůrková, V., Neruda, R., Koutník, J. (eds) Artificial Neural Networks - ICANN 2008. ICANN 2008. Lecture Notes in Computer Science, vol 5163. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87536-9_66

  • DOI: https://doi.org/10.1007/978-3-540-87536-9_66

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87535-2

  • Online ISBN: 978-3-540-87536-9

  • eBook Packages: Computer Science, Computer Science (R0)
