FLSOM with Different Rates for Classification in Imbalanced Datasets

Machón-González, Iván; López-García, Hilario

doi:10.1007/978-3-540-87536-9_66

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5163)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Several successful approaches exist for dealing with imbalanced datasets. In this paper, the Fuzzy Labeled Self-Organizing Map (FLSOM) is extended to work with this type of data. The proposed approach assigns one of two learning-rate values to each data vector, depending on its class membership. The technique is tested on several datasets and compared with other approaches. The results suggest that FLSOM with different rates is a suitable tool for such data and that it helps in understanding and visualizing the data, including overlapping clusters.
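
To make the idea of a class-dependent learning rate concrete, the following is a minimal sketch and not the authors' FLSOM formulation. It uses a plain online SOM update on a 1-D grid, with one fuzzy label per unit, and picks the learning rate from the class of each sample. The function name `train_two_rate_som`, its parameters, the rate values, and the choice to give the minority class (label 1) the larger rate are all assumptions made for illustration; the abstract does not state them.

```python
import math
import random


def train_two_rate_som(data, labels, n_units=5, rate_minority=0.5,
                       rate_majority=0.05, sigma=1.0, epochs=20, seed=0):
    """Online 1-D SOM sketch with a per-class learning rate (hypothetical API).

    data   : list of feature vectors (lists of floats)
    labels : list of 0/1 class labels; 1 is assumed to be the minority class
    Returns the unit prototypes and one fuzzy label in [0, 1] per unit.
    """
    rng = random.Random(seed)
    # Prototypes start at randomly chosen samples; labels start undecided.
    protos = [list(rng.choice(data)) for _ in range(n_units)]
    fuzzy = [0.5] * n_units
    order = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = data[i], labels[i]
            # Assumption: the rate depends only on the class of the sample.
            rate = rate_minority if y == 1 else rate_majority
            # Best-matching unit: the prototype closest to x.
            bmu = min(
                range(n_units),
                key=lambda r: sum((a - w) ** 2 for a, w in zip(x, protos[r])),
            )
            for r in range(n_units):
                # Gaussian neighborhood over unit indices on the 1-D grid.
                h = math.exp(-((r - bmu) ** 2) / (2.0 * sigma ** 2))
                # Move the prototype and its fuzzy label toward the sample.
                protos[r] = [w + rate * h * (a - w) for a, w in zip(x, protos[r])]
                fuzzy[r] += rate * h * (y - fuzzy[r])
    return protos, fuzzy


# Toy imbalanced data: eight majority points (class 0), two minority points (class 1).
data = [[0.0], [0.05], [0.1], [0.15], [0.2], [0.25], [0.3], [0.35], [0.9], [1.0]]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
protos, fuzzy = train_two_rate_som(data, labels)
```

Because every update is a convex combination of the old value and the sample, the prototypes stay within the range of the data and the fuzzy labels stay in [0, 1]. A unit with a fuzzy label near 0.5 sits between the classes, which is one simple way such a map could indicate overlapping clusters.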

Author information

Authors and Affiliations

Universidad de Oviedo, Escuela Politécnica Superior de Ingeniería, Departamento de Ingeniería Eléctrica, Electrónica de Computadores y Sistemas, Edificio Departamental 2, Zona Oeste, Campus de Viesques s/n, 33204 Gijón/Xixón (Asturies), Spain
Iván Machón-González & Hilario López-García

Editor information

Véra Kůrková, Roman Neruda, Jan Koutník

Cite this paper

Machón-González, I., López-García, H. (2008). FLSOM with Different Rates for Classification in Imbalanced Datasets. In: Kůrková, V., Neruda, R., Koutník, J. (eds) Artificial Neural Networks - ICANN 2008. ICANN 2008. Lecture Notes in Computer Science, vol 5163. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87536-9_66

DOI: https://doi.org/10.1007/978-3-540-87536-9_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87535-2
Online ISBN: 978-3-540-87536-9
eBook Packages: Computer Science, Computer Science (R0)
