
Local resampling for locally weighted Naïve Bayes in imbalanced data

  • Regular Paper
Computing

Abstract

The Locally Weighted Naïve Bayes (LWNB) method fits a separate weighted Naïve Bayes model in the neighborhood of each query point. Like other classification methods, LWNB is affected by class imbalance, in which the class variable has a skewed distribution and classification algorithms become biased towards the majority class. This problem can be mitigated with resampling approaches such as undersampling and oversampling. However, resampling the whole data set may not carry over correctly to local regions, since each region is assumed to be independent of the data outside it. Local regions should therefore be treated without outside interference. In this study, we propose a novel resampling approach that is applicable to both undersampling and oversampling. We examine how the imbalance of the data set should be reflected in each local region and aim to prevent the imbalance problem by resampling the data in each local region separately. In this method, the appropriate resampling rate and number of neighbors for each local region are calculated from the data imbalance rate and a resampling rate chosen by the researcher. The proposed approach was compared with classical resampling approaches on 25 datasets that are frequently used in the literature and achieved promising results.
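
To make the idea of resampling inside a local region concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes binary classification, Euclidean distance, and plain random oversampling by duplication; the function names and the `target_ratio` parameter are hypothetical, and the abstract does not state the formula the paper uses to derive the local resampling rate or the number of neighbors.

```python
import math
import random
from collections import Counter


def k_nearest(X, query, k):
    """Return indices of the k training points closest to `query` (Euclidean)."""
    order = sorted(range(len(X)), key=lambda i: math.dist(X[i], query))
    return order[:k]


def local_oversample(X, y, query, k, target_ratio=1.0, seed=0):
    """Randomly duplicate minority points inside the neighborhood of `query`.

    `target_ratio` is the desired minority/majority count ratio within the
    neighborhood. It is an illustrative parameter, not the paper's rate.
    Returns a list of training indices (with repeats) for the local model.
    """
    rng = random.Random(seed)
    idx = k_nearest(X, query, k)
    counts = Counter(y[i] for i in idx)
    if len(counts) < 2:
        return idx  # only one class is present locally; nothing to balance
    (_, n_major), (minority, n_minor) = counts.most_common(2)
    wanted = math.ceil(target_ratio * n_major)
    minority_idx = [i for i in idx if y[i] == minority]
    # Draw duplicates only from minority points that lie in the neighborhood.
    extra = [rng.choice(minority_idx) for _ in range(max(0, wanted - n_minor))]
    return idx + extra


# Toy data: eight majority and two minority points near the query, one far away.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.2, 0.1), (0.1, 0.2), (0.3, 0.0),
     (0.0, 0.3), (0.2, 0.2), (0.15, 0.15), (0.05, 0.25), (5.0, 5.0)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
local_idx = local_oversample(X, y, query=(0.1, 0.1), k=10)
print(Counter(y[i] for i in local_idx))
```

The returned indices would then be passed to whatever weighted Naïve Bayes fit is used for that query point. Because the duplicates are drawn only from the neighborhood, points outside the local region never influence the local class balance, which is the property the abstract argues for.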

Availability of data and material

The research only uses openly available datasets.

Code availability

Code is available at https://github.com/fatihsaglam/Locally-Resampling.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Fatih Sağlam.

Ethics declarations

Conflict of interest

We have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

The paper is not currently being considered for publication elsewhere. No humans or animals were involved in this research.

Consent to participate

There are no human or animal participants in the study.

Consent to publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sağlam, F., Cengiz, M.A. Local resampling for locally weighted Naïve Bayes in imbalanced data. Computing 106, 185–200 (2024). https://doi.org/10.1007/s00607-023-01219-0

  • DOI: https://doi.org/10.1007/s00607-023-01219-0
