Authors:
Terence Fusco; Yaxin Bi; Haiying Wang and Fiona Browne
Affiliation:
Computer Science Research Institute, Ulster University, Shore Road, Newtownabbey, Antrim, Northern Ireland
Keyword(s):
Optimisation, Over-sampling, Schistosomiasis, Synthetic Instance Generation, SMOTE, SMAC, SIMO.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Biomedical Engineering; Business Analytics; Data Engineering; Data Mining; Databases and Information Systems Integration; Datamining; Enterprise Information Systems; Health Information Systems; Predictive Modeling; Sensor Networks; Signal Processing; Soft Computing
Abstract:
In this paper, research is presented on improving optimisation performance when using sparse training data for disease vector classification. Currently available optimisation techniques, such as Bayesian, evolutionary and global optimisation, are capable of providing highly efficient and accurate results; however, their performance potential can often be restricted when dealing with limited training resources. In this study, a novel approach is proposed to address this issue by combining Sequential Model-based Algorithm Configuration (SMAC) optimisation with the Synthetic Minority Over-sampling Technique (SMOTE) for optimised synthetic prediction modelling. This approach generates additional synthetic instances from a limited training sample while concurrently seeking to improve best algorithm performance. As the results show, the proposed Synthetic Instance Model Optimisation (SIMO) technique presents a viable, unified solution for finding optimum classifier performance when faced with sparse training resources. Using the SIMO approach, noticeable improvements in accuracy and F-measure were achieved over standalone SMAC optimisation. Many results showed significant improvement when comparing collective training data with SIMO instance optimisation, including individual accuracy increases of up to 46% and a mean overall increase of 13.96% across all 240 configurations over standard SMAC optimisation.
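The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: generate synthetic minority instances by SMOTE-style interpolation, then choose the over-sampling settings that give the best classifier score. Everything here is an assumption made for illustration: the function names (`smote`, `nearest_label`, `accuracy`, `search`), the 1-nearest-neighbour classifier, and the candidate values are hypothetical, and the plain exhaustive search is a simple stand-in for SMAC, not SMAC itself.

```python
import math
import random


def smote(minority, n_new, k=3, rng=None):
    """Generate n_new synthetic points by SMOTE-style interpolation.

    Each synthetic point lies on the line segment between a randomly chosen
    minority instance and one of its k nearest minority neighbours.
    """
    rng = rng or random.Random(0)
    if len(minority) < 2:
        raise ValueError("need at least two minority instances")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority instances by Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def nearest_label(train, x):
    """1-nearest-neighbour prediction; train is a list of (point, label)."""
    return min(train, key=lambda pl: math.dist(pl[0], x))[1]


def accuracy(train, held_out):
    """Fraction of held-out (point, label) pairs predicted correctly."""
    hits = sum(nearest_label(train, x) == y for x, y in held_out)
    return hits / len(held_out)


def search(majority, minority, held_out, ks=(1, 2, 3), amounts=(0, 5, 10)):
    """Try every (k, n_new) pair; return (accuracy, k, n_new) of the best.

    Assumption: exhaustive search stands in for SMAC purely for illustration.
    """
    best = None
    for k in ks:
        for n_new in amounts:
            extra = smote(minority, n_new, k=k, rng=random.Random(0))
            train = ([(p, 0) for p in majority]
                     + [(p, 1) for p in list(minority) + extra])
            score = accuracy(train, held_out)
            if best is None or score > best[0]:
                best = (score, k, n_new)
    return best
```

In this sketch the synthetic points are convex combinations of existing minority instances, so they never fall outside the region the minority class already spans. A model-based configurator such as SMAC would replace the nested loops in `search` with a guided choice of which configuration to evaluate next.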