Authors:
André Rodrigo da Silva; Leonardo M. Rodrigues and Luciana de Oliveira Rech
Affiliation:
Department of Informatics and Statistics (INE), Federal University of Santa Catarina (UFSC), Rua Delfino Conti, s/n, Trindade, Cx.P. 476, CEP: 88040-900, Florianópolis, Brazil
Keyword(s):
Machine Learning, Skewed Classes, Imbalanced Datasets, Binary Classification.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Computational Intelligence; Data Mining; Databases and Information Systems Integration; Enterprise Information Systems; Evolutionary Computing; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Sensor Networks; Signal Processing; Soft Computing; Symbolic Systems; Uncertainty in AI
Abstract:
Imbalanced classes make for very difficult machine learning classification problems, particularly when few training examples are available. In that case, most algorithms fail to learn discriminant characteristics and tend to ignore the minority class entirely in favour of the model's overall accuracy. Datasets with imbalanced classes are common in several machine learning applications, such as sales forecasting and fraud detection. Current strategies for dealing with imbalanced classes rely on manipulating the datasets to improve classification performance. Instead of optimizing classification boundaries based on some measure of distance to points, this work directly optimizes the decision surface, essentially turning a classification problem into a regression problem. We demonstrate that our approach is competitive with other classification algorithms for imbalanced classes, while also exhibiting different properties.
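The tendency to ignore the minority class in favour of overall accuracy can be illustrated with a minimal sketch. This is not the method proposed in the paper. The dataset (990 negatives, 10 positives) and the helper functions `accuracy` and `recall` are hypothetical, chosen only to show how a classifier that always predicts the majority class can still score a high accuracy.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positive-class examples that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical imbalanced dataset: 990 majority (0) and 10 minority (1) examples.
y_true = [0] * 990 + [1] * 10

# A degenerate "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)

print(f"accuracy = {acc:.2f}, minority recall = {rec:.2f}")
```

Under these assumptions the accuracy is high even though no minority example is detected, which is why accuracy alone is a poor guide on imbalanced data.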