z-SVM: An SVM for Improved Classification of Imbalanced Data

Imam, Tasadduq; Ting, Kai Ming; Kamruzzaman, Joarder

doi:10.1007/11941439_30

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4304)

Abstract

Recent literature has revealed that the decision boundary of a Support Vector Machine (SVM) classifier skews towards the minority class for imbalanced data, resulting in a high misclassification rate for minority samples. In this paper, we present a novel strategy for SVM in class-imbalanced scenarios. In particular, we focus on orienting the trained decision boundary of the SVM so that a good margin is maintained between the decision boundary and each of the classes, and classification performance on imbalanced data is improved. Existing strategies introduce additional parameters whose values are determined through an empirical search involving multiple SVM training runs. In contrast, our strategy corrects the skew of the learned SVM model automatically, irrespective of the choice of learning parameters and without multiple SVM training runs. We compare our strategy with SVM and with SMOTE (a widely accepted strategy for imbalanced data) applied to SVM, on five well-known imbalanced datasets. Our strategy demonstrates improved classification performance for imbalanced data and is less sensitive to the selection of SVM learning parameters.
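
The abstract does not spell out how the skew is corrected, so the following is only an illustrative sketch of one way a post-training correction could work, not the authors' formulation. It assumes that the contribution of the minority-class support vectors to an already trained kernel decision function is scaled by a single factor z, and that z is picked by maximizing the geometric mean of sensitivity and specificity on the training data. The toy support vectors, the bias, the RBF kernel width, the candidate grid, and all function names are hypothetical.

```python
import math

def rbf(a, b, gamma=1.0):
    """RBF kernel between two scalar inputs (toy 1-D setting)."""
    return math.exp(-gamma * (a - b) ** 2)

def decision(x, support_vectors, bias, z=1.0):
    """Kernel decision value; z scales only the positive (minority) terms."""
    total = bias
    for sv, alpha, label in support_vectors:
        weight = z if label == 1 else 1.0
        total += weight * alpha * label * rbf(x, sv)
    return total

def g_mean(points, labels, support_vectors, bias, z):
    """Geometric mean of sensitivity and specificity for a given z."""
    tp = fn = tn = fp = 0
    for x, y in zip(points, labels):
        pred = 1 if decision(x, support_vectors, bias, z) >= 0 else -1
        if y == 1:
            tp += pred == 1
            fn += pred == -1
        else:
            tn += pred == -1
            fp += pred == 1
    return math.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))

def pick_z(points, labels, support_vectors, bias, candidates):
    """Return the first candidate z with the highest training g-mean."""
    return max(candidates,
               key=lambda z: g_mean(points, labels, support_vectors, bias, z))

# Hypothetical "already trained" model: (support vector, alpha, label).
# The negative bias pushes the boundary towards the minority class.
support_vectors = [(1.0, 1.0, -1), (2.0, 1.0, 1)]
bias = -0.4

# Toy training set: four majority (-1) and three minority (+1) samples.
points = [0.0, 0.5, 1.0, 1.2, 1.7, 2.0, 2.5]
labels = [-1, -1, -1, -1, 1, 1, 1]

baseline = g_mean(points, labels, support_vectors, bias, z=1.0)
best_z = pick_z(points, labels, support_vectors, bias,
                candidates=[1.0, 1.25, 1.5, 2.0, 3.0])
corrected = g_mean(points, labels, support_vectors, bias, best_z)
print(f"z=1.0 g-mean={baseline:.3f}; z={best_z} g-mean={corrected:.3f}")
```

In this toy setting the model is trained once (here, simply written down by hand) and only the scalar z is searched afterwards, which mirrors the abstract's point that no repeated SVM training is needed; the actual selection criterion used in the paper may differ.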

Author information

Authors and Affiliations

Gippsland School of Information Technology, Monash University, Australia
Tasadduq Imam, Kai Ming Ting & Joarder Kamruzzaman

Editor information

Editors and Affiliations

DisPRR, National ICT Australia Ltd, QLD, Australia
Abdul Sattar
School of Computing, University of Tasmania, Sandy Bay, 7005, Tasmania, Australia
Byeong-ho Kang

About this paper

Cite this paper

Imam, T., Ting, K.M., Kamruzzaman, J. (2006). z-SVM: An SVM for Improved Classification of Imbalanced Data. In: Sattar, A., Kang, Bh. (eds) AI 2006: Advances in Artificial Intelligence. AI 2006. Lecture Notes in Computer Science, vol 4304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11941439_30

DOI: https://doi.org/10.1007/11941439_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49787-5
Online ISBN: 978-3-540-49788-2
eBook Packages: Computer Science, Computer Science (R0)
