Skip to main content
Log in

Learning Imbalanced Classifiers Locally and Globally with One-Side Probability Machine

  • Published:
Neural Processing Letters Aims and scope Submit manuscript

Abstract

We consider the imbalanced learning problem, where the data associated with one class are far fewer than those associated with the other class. Current imbalanced learning methods often handle this problem by adapting certain intermediate parameters so as to impose a bias on the minority data. However, most of these methods are in rigorous and need to adapt those factors via the trial-and-error procedure. Recently, a new model called Biased Minimax Probability Machine (BMPM) presents a rigorous and systematic work and has demonstrated very promising performance on imbalance learning. Despite its success, BMPM exclusively relies on global information, namely, the first order and second order data information; such information might be however unreliable, especially for the minority data. In this paper, we propose a new model called One-Side Probability Machine (OSPM). Different from the previous approaches, OSPM can lead to rigorous treatment on biased classification tasks. Importantly, the proposed OSPM exploits the reliable global information from one side only, i.e., the majority class, while engaging the robust local learning from the other side, i.e., the minority class. To our best knowledge, OSPM presents the first model capable of learning data both locally and globally. Our proposed model has also established close connections with various famous models such as BMPM, Support Vector Machine, and Maxi-Min Margin Machine. One appealing feature is that the optimization problem involved in the novel OSPM model can be cast as a convex second order conic programming problem with the global optimum guaranteed. A series of experimental results on three data sets demonstrate the advantages of our proposed methods over four competitive approaches.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2

Similar content being viewed by others

Notes

  1. Readers could refer to [9] on why SVM is regarded as one typical local learning model.

  2. Note that it is straightforward to extend binary classification to multi-class problems via the strategy of one versus one or one versus others.

  3. Despite the linear version discussion in the previous sections, it is straightforward to extend OSPM to the non-linear case via standard kernelization.

References

  1. Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge, MA

    Book  MATH  Google Scholar 

  2. Cardie C, Howe N (1997) Improving minority class prediction using case specific feature weights. In: Proceedings of the fourteen international conference on machine learning (ICML-1997). Morgan Kaufmann, San Francisco, CA, pp 57–65

  3. Chawla N, Bowyer K, Hall L, Kegelmeyer W (2002) Smote: sythetic minority over-sampling technique. J Artif Intell Res 16:321–357

    MATH  Google Scholar 

  4. He H, Garcia E (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284

    Article  Google Scholar 

  5. Huang K, Yang H, King I, Lyu MR (2004) Learning classifiers from imbalanced data based on biased minimax probability machine. In: Proceedings of CVPR, vol 2, pp 558–563

  6. Huang K, Yang H, King I, Lyu MR (eds) (2004) Learning large margin classifiers locally and globally. In: Russ Greiner and Dale Schuurmansitors proceedings of the twenty-first international conference on machine learning (ICML-2004), pp 401–408

  7. Huang K, Yang H, King I, Lyu MR (2006) Maximizing sensitivity in medical diagnosis using biased minimax probability machine. IEEE Trans Biomed Eng 53:821–831

    Article  Google Scholar 

  8. Huang K, Yang H, King I, Lyu MR (2008) Machine learning: modeling data locally and gloablly. Springer, Berlin, ISBN 3-5407-9451-4

  9. Huang K, Yang H, King I, Lyu MR (2008) Maxi-min margin machine: learning large margin classifiers globally and locally. IEEE Trans Neural Netw 19:260–272

    Article  Google Scholar 

  10. Huang K, Yang H, King I, Lyu MR (2006) Imbalanced learning with biased minimax probability machine. IEEE Trans Syst Man Cybern Part B 36(4):913–923

    Article  Google Scholar 

  11. Huang K, Yang H, King I, Lyu MR, Chan L (2004) The minimum error minimax probability machine. J Mach Learn Res 5:1253–1286

    MATH  MathSciNet  Google Scholar 

  12. Huang K, Zheng D, Sun J, Hotta Y, Fujimoto K, Naoi S (2010) Sparse learning for support vector classification. Pattern Recognit Lett 31(13):1944–1951

    Article  Google Scholar 

  13. Krawczyk B, Wozniak M, Schaefer G (2014) Cost-sensitive decision tree ensembles for effective imbalanced classification. Appl Soft Comput 14:554–562

    Article  Google Scholar 

  14. Kubat M, Matwin S (1997) Addressing the curse of imbalanced training sets: one-sided selection. In: Proceedings of the fourteen international conference on machine learning (ICML-1997). Morgan Kaufmann, San Francisco, CA, pp 179–186

  15. Lanckriet GRG, Ghaoui LE, Bhattacharyya C, Jordan MI (2002) A robust minimax approach to classification. J Mach Learn Res 3:555–582

    Google Scholar 

  16. Ling C, Li C (1998) Data mining for direct marketing:problems and solutions. In: Proceedings of the fourth international conference on knowledge discovery and data mining (KDD-1998). AAAI Press, Menlo Park, CA, pp 73–79

  17. Lobo M, Vandenberghe L, Boyd S, Lebret H (1998) Applications of second order cone programming. Linear Algebra Appl 284:193–228

    Article  MATH  MathSciNet  Google Scholar 

  18. Maloof MA, Langley P, Binford TO, Nevatia R, Sage S (2003) Improved rooftop detection in aerial images with machine learning. Mach Learn 53:157–191

    Article  Google Scholar 

  19. Provost F (2000) Learning from imbanlanced data sets. In: Proceedings of the seventeenth national conference on artificial intelligence (AAAI-2000)

  20. Schmidt P, Witte A (1988) Predicting recidivism using survival models. Springer, New York

    Book  Google Scholar 

  21. Vapnik VN (2000) The nature of statistical learning theory, 2nd edn. Springer, New York

    Book  MATH  Google Scholar 

  22. Zhang R, Huang K (2013) One-side probability machine: Learning imbalanced classifiers locally and globally. In: Proceedings of international conference on neural information processing (ICONIP-2013), pp 140–147

Download references

Acknowledgments

The research was partly supported by the National Basic Research Program of China (2012CB316301) and the National Natural Science Foundation of China (61075052 and 61105018).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Rui Zhang.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Huang, K., Zhang, R. & Yin, XC. Learning Imbalanced Classifiers Locally and Globally with One-Side Probability Machine. Neural Process Lett 41, 311–323 (2015). https://doi.org/10.1007/s11063-014-9370-9

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11063-014-9370-9

Keywords

Navigation