z-SVM: An SVM for Improved Classification of Imbalanced Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4304)

Abstract

Recent literature has shown that, for imbalanced data, the decision boundary of a Support Vector Machine (SVM) classifier skews towards the minority class, resulting in a high misclassification rate for minority samples. In this paper, we present a novel strategy for SVM in class-imbalanced scenarios. In particular, we focus on orienting the trained decision boundary of the SVM so that a good margin is maintained between the decision boundary and each of the classes, and so that classification performance on imbalanced data improves. Existing strategies introduce additional parameters whose values are determined through an empirical search involving multiple rounds of SVM training. In contrast, our strategy corrects the skew of the learned SVM model automatically, irrespective of the choice of learning parameters and without multiple rounds of SVM training. We compare our strategy with SVM and with SMOTE, a widely accepted strategy for imbalanced data, applied to SVM, on five well-known imbalanced datasets. Our strategy demonstrates improved classification performance for imbalanced data and is less sensitive to the selection of SVM learning parameters.
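
The abstract does not spell out how the trained boundary is re-oriented, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes one plausible form of post-training correction: the minority-class support vector terms of an already-trained decision function are rescaled by a single factor z, and z is chosen to maximise the geometric mean of the per-class accuracies on the training set, with no retraining. The toy support vectors, multipliers, bias, data points, and candidate grid are all hypothetical.

```python
import math

def linear_kernel(a, b):
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def decision(x, support_vectors, bias, z=1.0):
    """Decision value of a trained SVM whose positive (minority) terms are scaled by z.

    support_vectors is a list of (vector, label, alpha) triples, label in {+1, -1}.
    With z = 1.0 this is the ordinary SVM decision function.
    """
    total = bias
    for vec, label, alpha in support_vectors:
        weight = z if label == 1 else 1.0  # assumption: only minority terms are rescaled
        total += weight * alpha * label * linear_kernel(x, vec)
    return total

def g_mean(X, Y, support_vectors, bias, z):
    """Geometric mean of sensitivity and specificity on (X, Y)."""
    n_pos = sum(1 for y in Y if y == 1)
    n_neg = len(Y) - n_pos
    tp = sum(1 for x, y in zip(X, Y)
             if y == 1 and decision(x, support_vectors, bias, z) > 0)
    tn = sum(1 for x, y in zip(X, Y)
             if y == -1 and decision(x, support_vectors, bias, z) <= 0)
    return math.sqrt((tp / n_pos) * (tn / n_neg))

def pick_z(X, Y, support_vectors, bias, candidates):
    """Return the candidate z with the highest training g-mean (first one on ties)."""
    return max(candidates, key=lambda z: g_mean(X, Y, support_vectors, bias, z))

# Hypothetical 1-D toy problem: six majority (-1) points, three minority (+1) points.
X = [(0.5,), (1.0,), (1.5,), (2.0,), (2.5,), (3.0,), (3.5,), (4.0,), (5.0,)]
Y = [-1, -1, -1, -1, -1, -1, 1, 1, 1]

# Hypothetical "already trained" model whose boundary sits too close to the minority class.
support_vectors = [((4.0,), 1, 0.5), ((3.0,), -1, 0.5)]
bias = -2.0

candidates = [1.0, 1.05, 1.1, 1.2]
best_z = pick_z(X, Y, support_vectors, bias, candidates)

print("g-mean before correction:", g_mean(X, Y, support_vectors, bias, 1.0))
print("chosen z:", best_z)
print("g-mean after correction:", g_mean(X, Y, support_vectors, bias, best_z))
```

The search only re-evaluates the decision function on the training set for each candidate z, which is the sense in which a correction of this kind avoids repeated SVM training. A real application would take the support vectors, multipliers, and bias from a fitted SVM rather than from hand-written values.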

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Imam, T., Ting, K.M., Kamruzzaman, J. (2006). z-SVM: An SVM for Improved Classification of Imbalanced Data. In: Sattar, A., Kang, Bh. (eds) AI 2006: Advances in Artificial Intelligence. AI 2006. Lecture Notes in Computer Science, vol 4304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11941439_30

  • DOI: https://doi.org/10.1007/11941439_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49787-5

  • Online ISBN: 978-3-540-49788-2

  • eBook Packages: Computer Science, Computer Science (R0)
