Abstract
Most supervised classification algorithms produce a soft score (a probability, a fuzzy degree, a possibility, a cost, etc.) assessing the strength of the association between items and classes. Each item is then assigned to the class with the highest soft score. In this paper, we show that this last step can be improved through alternative procedures that are more sensitive to the available soft information. To this end, we propose a general fuzzy bipolar approach that learns how to exploit the soft information provided by many classification algorithms in order to enhance the generalization power and accuracy of the classifiers. To show the suitability of the proposed approach, we also present computational experiments on binary classification problems, in which its application to well-known classifiers such as random forests, classification trees and neural networks produces a statistically significant improvement in classifier performance.
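To make the idea of replacing the final "highest score wins" step concrete, the following minimal sketch contrasts that baseline rule with a decision rule learned from held-out soft scores in a binary problem. It is an illustration only and not the fuzzy bipolar model proposed in the paper: the learned rule here is a plain grid-searched cut-off, and the function names (`argmax_decision`, `learn_threshold`) and the toy scores and labels are hypothetical.

```python
from typing import List, Sequence, Tuple


def argmax_decision(p_pos: Sequence[float]) -> List[int]:
    """Baseline: assign the class with the highest soft score.

    With two classes scored (1 - p, p), this is a fixed 0.5 cut-off on p.
    """
    return [1 if p > 0.5 else 0 for p in p_pos]


def accuracy(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of items whose predicted class matches the true class."""
    return sum(int(a == b) for a, b in zip(pred, truth)) / len(truth)


def learn_threshold(p_pos: Sequence[float],
                    truth: Sequence[int]) -> Tuple[float, float]:
    """Learn a cut-off from held-out soft scores (assumed stand-in rule).

    Scans cut-offs 0.01, 0.02, ..., 0.99 and keeps the first one that
    strictly improves on the accuracy of the argmax baseline.
    """
    best_t = 0.5
    best_acc = accuracy(argmax_decision(p_pos), truth)
    for step in range(1, 100):
        t = step / 100
        acc = accuracy([1 if p > t else 0 for p in p_pos], truth)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Hypothetical validation data: soft scores for the positive class from a
# classifier that systematically under-scores positives, plus true labels.
scores = [0.10, 0.20, 0.35, 0.40, 0.45, 0.60, 0.70, 0.90]
labels = [0, 0, 1, 1, 1, 1, 1, 1]

baseline_acc = accuracy(argmax_decision(scores), labels)
learned_t, learned_acc = learn_threshold(scores, labels)
```

On this toy data the fixed 0.5 cut-off misclassifies the three positives scored below 0.5, whereas the learned cut-off separates the classes. In practice the cut-off would be learned on data kept apart from the data used to report accuracy, so that the comparison reflects generalization rather than fitting.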
Acknowledgements
This research has been partially supported by the Government of Spain, Grant TIN2015-66471-P and the FPU fellowship Grant 2015/06202 from the Ministry of Education of Spain.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by C. Kahraman.
Cite this article
Villarino, G., Gómez, D., Rodríguez, J.T. et al. A bipolar knowledge representation model to improve supervised fuzzy classification algorithms. Soft Comput 22, 5121–5146 (2018). https://doi.org/10.1007/s00500-018-3320-9
DOI: https://doi.org/10.1007/s00500-018-3320-9