Analysis of SMOTE: Modified for Diverse Imbalanced Datasets Under the IoT Environment

Ankita Bansal, Makul Saini, Rakshit Singh, Jai Kumar Yadav

Source Title: International Journal of Information Retrieval Research (IJIRR) 11(2)

ISSN: 2155-6377 | EISSN: 2155-6385 | EISBN13: 9781799861997 | DOI: 10.4018/IJIRR.2021040102

MLA

Bansal, Ankita, et al. "Analysis of SMOTE: Modified for Diverse Imbalanced Datasets Under the IoT Environment." IJIRR vol.11, no.2 2021: pp.15-37. http://doi.org/10.4018/IJIRR.2021040102

APA

Bansal, A., Saini, M., Singh, R., & Yadav, J. K. (2021). Analysis of SMOTE: Modified for Diverse Imbalanced Datasets Under the IoT Environment. International Journal of Information Retrieval Research (IJIRR), 11(2), 15-37. http://doi.org/10.4018/IJIRR.2021040102

Chicago

Bansal, Ankita, et al. "Analysis of SMOTE: Modified for Diverse Imbalanced Datasets Under the IoT Environment," International Journal of Information Retrieval Research (IJIRR) 11, no.2: 15-37. http://doi.org/10.4018/IJIRR.2021040102


Abstract

The tremendous amount of data generated through IoT can be imbalanced, giving rise to the class imbalance problem (CIP). CIP is a major issue in machine learning in which most of the samples belong to one class, so that classifiers trained on the data are biased toward that majority class. This paper works with four imbalanced datasets from diverse domains, and its objective is to address CIP using oversampling techniques. One commonly used oversampling approach is the synthetic minority oversampling technique (SMOTE). The authors suggest modifications to SMOTE and propose their own algorithm, SMOTE-modified (SMOTE-M). To provide a fair evaluation, SMOTE-M is compared with three oversampling approaches: SMOTE, adaptive synthetic oversampling (ADASYN), and SMOTE-Adaboost. To evaluate the sampling approaches, models are constructed using four classifiers (K-nearest neighbour, decision tree, naive Bayes, and logistic regression) on both the balanced and the imbalanced datasets. The study shows that the results of SMOTE-M are comparable to those of ADASYN and SMOTE-Adaboost.
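For readers unfamiliar with the baseline, the core idea of standard SMOTE can be sketched in a few lines: pick a minority sample, pick one of its nearest minority neighbours, and create a new point somewhere on the line segment between them. The sketch below illustrates only that baseline idea. It is not the SMOTE-M algorithm proposed in the paper, whose modifications are not described in this abstract. The function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Illustrative baseline SMOTE: interpolate between minority neighbours.

    minority    -- list of feature vectors (lists of floats) from the minority class
    n_synthetic -- number of synthetic samples to generate
    k           -- number of nearest minority neighbours to choose from
    seed        -- seed for reproducibility (hypothetical parameter)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Place the new sample at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (made up): 4 minority samples against 10 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
n_majority = 10
new_points = smote_sketch(minority, n_synthetic=n_majority - len(minority), k=3)
balanced_minority = minority + new_points  # now the same size as the majority class
```

Because every synthetic point is a convex combination of two real minority samples, it always lies inside the region spanned by the minority class. Variants such as ADASYN change how many synthetic points each base sample receives rather than how the interpolation itself is done.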
