Paper

Authors: Payel Sadhukhan 1; Arjun Pakrashi 2; Sarbani Palit 3 and Brian Mac Namee 2

Affiliations: 1 Institute for Advancing Intelligence, TCG CREST, Kolkata, India; 2 School of Computer Science, University College Dublin, Ireland; 3 Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India

Keyword(s): Multi-Label, Imbalanced Learning, Unsupervised Clustering, Oversampling.

Abstract: Multi-label datasets often contain a mixture of very frequent and very infrequent labels. This variation in label frequency, a type of class imbalance, creates a significant challenge for building efficient multi-label classification algorithms. In this paper, we tackle this problem by proposing a minority class oversampling scheme, UCLSO, which integrates Unsupervised Clustering and Label-Specific data Oversampling. Clustering is performed to find the key distinct and locally connected regions of a multi-label dataset (irrespective of the label information). Next, for each label, we explore the distributions of minority points in the cluster sets. Only the intra-cluster minority points are used to generate the synthetic minority points. Although the same cluster set is shared across all labels, we use the label-specific class information to vary the distributions of the synthetic minority points across the labels (in congruence with the label-specific class memberships within the clusters). The training dataset is augmented with the set of label-specific synthetic minority points, and classifiers are trained to predict the relevance of each label independently. Experiments using 12 multi-label datasets and several multi-label algorithms show the competency of the proposed method over other competing algorithms in the given context.
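The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the clustering algorithm, the interpolation rule, or how many synthetic points each cluster receives. The sketch assumes plain k-means, SMOTE-style linear interpolation between two minority points of the same cluster, and a per-cluster share proportional to that cluster's minority count. The names `kmeans`, `oversample_label`, and `n_new` are hypothetical.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns one cluster index per point.

    Labels are never consulted, matching the 'irrespective of the label
    information' step in the abstract.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def oversample_label(points, label_col, assign, n_new, seed=0):
    """Create n_new synthetic minority points for ONE label.

    Both parents of every synthetic point come from the same cluster, so no
    point is interpolated across cluster boundaries.
    """
    rng = random.Random(seed)
    # The minority class is whichever value (0 or 1) is rarer for this label.
    minority = 1 if 2 * sum(label_col) <= len(label_col) else 0
    by_cluster = {}
    for p, y, c in zip(points, label_col, assign):
        if y == minority:
            by_cluster.setdefault(c, []).append(p)
    # A cluster needs at least two minority points to interpolate between.
    usable = [m for m in by_cluster.values() if len(m) >= 2]
    if not usable:
        return []
    # Assumption: clusters are sampled in proportion to their minority count.
    weights = [len(m) for m in usable]
    synthetic = []
    for _ in range(n_new):
        members = rng.choices(usable, weights=weights)[0]
        a, b = rng.sample(members, 2)
        t = rng.random()
        synthetic.append([u + t * (v - u) for u, v in zip(a, b)])
    return synthetic


# Toy data: six 2-D points and two binary labels per point.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 5.2], [5.2, 4.9]]
Y = [[1, 0], [1, 0], [0, 0], [0, 1], [0, 0], [0, 1]]

assign = kmeans(X, k=2)  # one cluster set shared by all labels
augmented = {}
for j in range(len(Y[0])):
    col = [row[j] for row in Y]
    # Each label gets its own synthetic set; a binary classifier for label j
    # would then be trained on X plus augmented[j].
    augmented[j] = oversample_label(X, col, assign, n_new=3, seed=j)
```

Because the cluster assignment is computed once and reused, the labels differ only in which points count as minority inside each cluster, which is what makes the synthetic sets label-specific.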

CC BY-NC-ND 4.0

Paper citation in several formats:
Sadhukhan, P.; Pakrashi, A.; Palit, S. and Mac Namee, B. (2023). Integrating Unsupervised Clustering and Label-Specific Oversampling to Tackle Imbalanced Multi-Label Data. In Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-623-1; ISSN 2184-433X, SciTePress, pages 489-498. DOI: 10.5220/0011901200003393

@conference{icaart23,
author={Payel Sadhukhan and Arjun Pakrashi and Sarbani Palit and Brian {Mac Namee}},
title={Integrating Unsupervised Clustering and Label-Specific Oversampling to Tackle Imbalanced Multi-Label Data},
booktitle={Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2023},
pages={489-498},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011901200003393},
isbn={978-989-758-623-1},
issn={2184-433X},
}

TY - CONF

JO - Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Integrating Unsupervised Clustering and Label-Specific Oversampling to Tackle Imbalanced Multi-Label Data
SN - 978-989-758-623-1
IS - 2184-433X
AU - Sadhukhan, P.
AU - Pakrashi, A.
AU - Palit, S.
AU - Mac Namee, B.
PY - 2023
SP - 489
EP - 498
DO - 10.5220/0011901200003393
PB - SciTePress