Abstract
The maximum margin of twin spheres support vector machine (MMTSVM) is an effective method for imbalanced data classification. However, MMTSVM uses the hinge loss, which makes it sensitive to noise and unstable under re-sampling. In contrast, the pinball loss is related to the quantile distance and is less sensitive to noise. To enhance the performance of MMTSVM, we propose a maximum margin of twin spheres machine with pinball loss (Pin-MMTSM) for imbalanced data classification. The Pin-MMTSM finds two homocentric spheres by solving a quadratic programming problem (QPP) and a linear programming problem (LPP). The small sphere captures as many majority samples as possible, and the large sphere pushes out most minority samples by increasing the margin between the two homocentric spheres. Moreover, by employing the pinball loss, the Pin-MMTSM is insensitive to noise. Experimental results on eighteen imbalanced datasets indicate that the proposed Pin-MMTSM yields good generalization performance.
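The contrast between the two losses can be illustrated with a minimal sketch. It assumes the form of the pinball loss commonly used for classification, applied to the margin residual u = 1 - y f(x); the symbols u and tau and the function names are illustrative assumptions and do not come from the abstract, and the exact formulation inside Pin-MMTSM may differ.

```python
def hinge_loss(u):
    """Hinge loss on the residual u = 1 - y * f(x) (assumed notation).

    Only margin violations (u > 0) are penalised; samples beyond the
    margin contribute nothing.
    """
    return max(0.0, u)


def pinball_loss(u, tau=0.5):
    """Pinball loss on the same residual, with tau in [0, 1] (assumed notation).

    Violations (u >= 0) are penalised as in the hinge loss, while samples
    beyond the margin (u < 0) receive a smaller penalty scaled by tau.
    """
    return u if u >= 0 else -tau * u


# Compare both losses on a sample beyond the margin, one on the margin,
# and one that violates the margin.
for u in (-2.0, 0.0, 1.5):
    print(u, hinge_loss(u), pinball_loss(u, tau=0.5))
```

With tau = 0 the pinball loss reduces to the hinge loss. For tau > 0 every sample contributes to the objective, which is one way to see the link to quantile distance and the reduced sensitivity to noise near the decision boundary.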
Acknowledgements
The authors gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation. This work was supported in part by Beijing Natural Science Foundation (No. 4172035) and National Natural Science Foundation of China (No. 11671010).
Cite this article
Xu, Y., Wang, Q., Pang, X. et al. Maximum margin of twin spheres machine with pinball loss for imbalanced data classification. Appl Intell 48, 23–34 (2018). https://doi.org/10.1007/s10489-017-0961-9
DOI: https://doi.org/10.1007/s10489-017-0961-9