Paper

Authors: Yuxuan Yang; Hadi Khorshidi and Uwe Aickelin

Affiliation: School of Computing and Information Systems, The University of Melbourne, Grattan Street, Parkville, Victoria, Australia

Keyword(s): Over-sampling, Diversity Optimisation, Genetic Algorithm, Imbalanced Data, Clustering.

Abstract: In many real-life classification tasks, the issue of imbalanced data is commonly observed. The workings of mainstream machine learning algorithms typically assume the classes amongst underlying datasets are relatively well-balanced. The failure of this assumption can lead to a biased representation of the models’ performance. This has encouraged the incorporation of re-sampling techniques to generate more balanced datasets. However, mainstream re-sampling methods fail to account for the distribution of minority data and the diversity within generated instances. Therefore, in this paper, we propose a data-generation algorithm, Cluster-based Diversity Over-sampling (CDO), to consider minority instance distribution during the process of data generation. Diversity optimisation is utilised to promote diversity within the generated data. We have conducted extensive experiments on synthetic and real-world datasets to evaluate the performance of CDO in comparison with SMOTE-based and diversity-based methods (DADO, DIWO, BL-SMOTE, DB-SMOTE, and MAHAKIL). The experiments show the superiority of CDO.
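
For readers unfamiliar with synthetic over-sampling, the following is a minimal sketch of the generic SMOTE-style interpolation idea that the abstract names as a comparison baseline. It is not the CDO algorithm: the abstract gives no details of CDO's clustering or diversity optimisation steps, so none are reproduced here. The function name `interpolate_oversample`, its parameters, and the toy data are assumptions made purely for illustration.

```python
import math
import random


def interpolate_oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by SMOTE-style interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    Illustrative only; this is not the CDO method from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority points")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base.
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + t * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy imbalanced data (hypothetical): 10 majority points, 3 minority points.
majority = [(float(i), float(i % 3)) for i in range(10)]
minority = [(0.0, 5.0), (1.0, 6.0), (2.0, 5.5)]

# Add enough synthetic minority points to match the majority class size.
new_points = interpolate_oversample(minority, len(majority) - len(minority))
balanced_minority = minority + new_points
```

Because every synthetic point is a convex combination of two existing minority points, plain interpolation can only fill in the segments between neighbours. This is one way to read the abstract's remark that mainstream re-sampling methods do not account for the diversity of the generated instances.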

CC BY-NC-ND 4.0

Paper citation in several formats:
Yang, Y.; Khorshidi, H. and Aickelin, U. (2022). Cluster-based Diversity Over-sampling: A Density and Diversity Oriented Synthetic Over-sampling for Imbalanced Data. In Proceedings of the 14th International Joint Conference on Computational Intelligence (IJCCI 2022) - ECTA; ISBN 978-989-758-611-8; ISSN 2184-3236, SciTePress, pages 17-28. DOI: 10.5220/0011381000003332

@conference{ecta22,
author={Yuxuan Yang and Hadi Khorshidi and Uwe Aickelin},
title={Cluster-based Diversity Over-sampling: A Density and Diversity Oriented Synthetic Over-sampling for Imbalanced Data},
booktitle={Proceedings of the 14th International Joint Conference on Computational Intelligence (IJCCI 2022) - ECTA},
year={2022},
pages={17-28},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011381000003332},
isbn={978-989-758-611-8},
issn={2184-3236},
}

TY - CONF
JO - Proceedings of the 14th International Joint Conference on Computational Intelligence (IJCCI 2022) - ECTA
TI - Cluster-based Diversity Over-sampling: A Density and Diversity Oriented Synthetic Over-sampling for Imbalanced Data
SN - 978-989-758-611-8
IS - 2184-3236
AU - Yang, Y.
AU - Khorshidi, H.
AU - Aickelin, U.
PY - 2022
SP - 17
EP - 28
DO - 10.5220/0011381000003332
PB - SciTePress