Abstract
Several successful approaches exist for dealing with imbalanced datasets. In this paper, the Fuzzy Labeled Self-Organizing Map (FLSOM) is extended to work with this type of data. The proposed approach assigns one of two different values to the learning rate, depending on the class membership of the data vector. The technique is tested on several datasets and compared with other approaches. The results suggest that FLSOM with different rates is a suitable tool for this task and that it helps in understanding and visualizing the data, for example overlapped clusters.
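The idea of a class-dependent learning rate can be illustrated with a minimal sketch. It assumes a plain online SOM prototype update, not the FLSOM formulation of the paper (which also adapts fuzzy labels). The function name `som_step`, its parameters, and the rate values are hypothetical and chosen only for illustration.

```python
import math


def som_step(prototypes, grid, x, is_minority,
             rate_minority=0.5, rate_majority=0.05, sigma=1.0):
    """One online SOM update with a class-dependent learning rate.

    Assumption: a standard SOM update rule; the two rate values are
    illustrative and are not taken from the paper.
    """
    # Best-matching unit: the prototype closest to x (squared Euclidean distance).
    dists = [sum((w - v) ** 2 for w, v in zip(p, x)) for p in prototypes]
    bmu = dists.index(min(dists))

    # Pick one of two learning rates according to the class of the sample.
    rate = rate_minority if is_minority else rate_majority

    updated = []
    for p, pos in zip(prototypes, grid):
        # Gaussian neighborhood measured on the map grid, centered on the BMU.
        grid_d2 = sum((a - b) ** 2 for a, b in zip(pos, grid[bmu]))
        h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
        # Move the prototype toward x, scaled by the rate and the neighborhood.
        updated.append([w + rate * h * (v - w) for w, v in zip(p, x)])
    return updated, bmu


# Two prototypes on a one-dimensional map grid.
protos = [[0.0, 0.0], [1.0, 1.0]]
grid = [(0,), (1,)]
sample = [0.2, 0.0]

after_minority, bmu_min = som_step(protos, grid, sample, is_minority=True)
after_majority, bmu_maj = som_step(protos, grid, sample, is_minority=False)
```

In this sketch a sample from the minority class pulls the prototypes more strongly than a sample from the majority class, which is one way to compensate for the minority class being presented less often during training.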
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Machón-González, I., López-García, H. (2008). FLSOM with Different Rates for Classification in Imbalanced Datasets. In: Kůrková, V., Neruda, R., Koutník, J. (eds) Artificial Neural Networks - ICANN 2008. ICANN 2008. Lecture Notes in Computer Science, vol 5163. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87536-9_66
DOI: https://doi.org/10.1007/978-3-540-87536-9_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87535-2
Online ISBN: 978-3-540-87536-9
eBook Packages: Computer Science, Computer Science (R0)