DOI: 10.1145/3293663.3293686
research-article

The Application of SMOTE Algorithm for Unbalanced Data

Published: 23 November 2018

Abstract

Current power-user data are unbalanced when used to analyze the behavior of leakage users: the normal-user data and the leakage-user data differ greatly in scale. As a result, when an automatic identification model for leakage users is built, the behavioral features of leakage users are not captured clearly, which reduces the model's classification efficiency. In this paper, we use Python to process the leakage-user data with the SMOTE algorithm, which increases the basic information available on these users and allows more accurate leakage-user behavior characteristics to be extracted.
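The abstract does not reproduce the authors' code, so the following is only a minimal sketch of the general SMOTE idea (Chawla et al., 2002), under stated assumptions: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours. The function name `smote`, its parameters, the two-feature layout, and all data values are hypothetical and chosen for illustration; picking the base sample at random is a simplification of the original per-sample loop.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (illustrative sketch)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("SMOTE needs at least two minority samples")

    synthetic = []
    for _ in range(n_synthetic):
        # Pick a minority sample at random as the base point.
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by Euclidean distance to x.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(x, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment x -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Made-up two-feature records, not data from the paper.
normal = [[10.0, 0.10], [11.0, 0.20], [9.5, 0.15], [10.5, 0.12],
          [12.0, 0.18], [9.0, 0.11], [10.2, 0.14], [11.5, 0.16]]
leakage = [[2.0, 0.8], [2.5, 0.9], [3.0, 0.7]]

# Oversample the leakage class until both classes have the same size.
new_points = smote(leakage, n_synthetic=len(normal) - len(leakage), k=2)
balanced_leakage = leakage + new_points
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the leakage class rather than duplicating existing records. In practice a maintained library implementation would usually be preferable to a hand-written one.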



    Published In

    AIVR 2018: Proceedings of the 2018 International Conference on Artificial Intelligence and Virtual Reality
    November 2018
    144 pages
    ISBN:9781450366410
    DOI:10.1145/3293663

    In-Cooperation

    • University of Tsukuba

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Leakage users
    2. Oversampling strategy
    3. SMOTE algorithm
    4. Unbalanced data
    5. Under-sampling strategy

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    AIVR 2018
