Abstract
This paper evaluates the performance of the IRBASIR-Imb algorithm (Induction of Rules Based on Similarity Relations for Imbalance datasets) on a classical civil engineering task: predicting whether structural failure depends on the connector (canals) or on the concrete capacity of the connectors. Because the method relies on similarity relations, it can be applied to mixed data (features with discrete or real-valued domains). The experimental results show that IRBASIR-Imb performs satisfactorily in comparison with other algorithms such as C4.5.
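To illustrate how a similarity relation can cover mixed data, here is a minimal sketch. It is not taken from the paper: the per-feature comparison rules (exact match for discrete values, range-normalised distance for real values), the weights, the 0.85 threshold, and all function names are assumptions chosen only for illustration.

```python
from typing import Optional, Sequence, Union

Value = Union[float, str]


def feature_similarity(a: Value, b: Value, value_range: Optional[float]) -> float:
    """Similarity of two values of one feature, in [0, 1].

    value_range is None for a discrete feature (assumed rule: exact match),
    otherwise it is the max-min spread of a real-valued feature.
    """
    if value_range is None:
        return 1.0 if a == b else 0.0
    if value_range == 0:
        return 1.0  # a constant feature cannot distinguish objects
    return 1.0 - abs(float(a) - float(b)) / value_range


def similarity(x: Sequence[Value], y: Sequence[Value],
               ranges: Sequence[Optional[float]],
               weights: Sequence[float]) -> float:
    """Weighted average of the per-feature similarities (hypothetical weights)."""
    total = sum(weights)
    return sum(w * feature_similarity(a, b, r)
               for a, b, r, w in zip(x, y, ranges, weights)) / total


def are_similar(x, y, ranges, weights, threshold: float = 0.85) -> bool:
    """Two objects are related when their similarity reaches the threshold."""
    return similarity(x, y, ranges, weights) >= threshold


# Hypothetical objects: one real-valued feature and one discrete feature.
ranges = (50.0, None)
weights = (0.5, 0.5)
x = (20.0, "A")
y = (25.0, "A")
z = (25.0, "B")
print(are_similar(x, y, ranges, weights))  # same category, close values
print(are_similar(x, z, ranges, weights))  # different category
```

Under these assumptions a single relation compares objects with both kinds of features, so real-valued features do not have to be discretised beforehand.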
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Filiberto, Y., Frias, M., Larrua, R., Bello, R. (2016). Induction of Rules Based on Similarity Relations for Imbalance Datasets. A Case of Study. In: Figueroa-García, J., López-Santana, E., Ferro-Escobar, R. (eds) Applied Computer Sciences in Engineering. WEA 2016. Communications in Computer and Information Science, vol 657. Springer, Cham. https://doi.org/10.1007/978-3-319-50880-1_6
DOI: https://doi.org/10.1007/978-3-319-50880-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50879-5
Online ISBN: 978-3-319-50880-1
eBook Packages: Computer Science (R0)