Abstract
Weighted twin support vector machine with local information (WLTSVM) is a novel algorithm for binary classification that exploits as much of the underlying correlation information in the data as possible. However, applying WLTSVM directly to large-scale problems remains challenging. Motivated by the sparsity of the WLTSVM solution, this paper proposes a safe screening rule for WLTSVM, termed SSR-WLTSVM. SSR-WLTSVM deletes a majority of the training samples before the optimization problem is actually solved, which reduces the scale of WLTSVM and greatly shortens computation time. More importantly, the screening rule is safe in the sense that the reduced problem yields the same optimal solution as the original one. The effect of the number of neighbors k on the performance of SSR-WLTSVM is also examined: a larger k achieves a greater speedup. Sequential versions of SSR-WLTSVM are introduced to substantially accelerate the parameter tuning process, and a fast algorithm, clipDCD, is adopted to handle large-scale datasets. In addition, the Friedman test and the paired-sample t test are used to verify the effectiveness of SSR-WLTSVM. Experimental results on 30 benchmark datasets confirm the efficiency of the proposed algorithm.
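As a rough illustration of the two statistical checks named in the abstract, the following minimal sketch computes a Friedman chi-square statistic (without tie correction) and a paired-sample t statistic using only the Python standard library. The function names and all numbers are hypothetical placeholders chosen for illustration; they are not results or code from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def average_ranks(scores):
    """Mean rank of each algorithm over all datasets.

    scores[i][j] is the score of algorithm j on dataset i (higher is better).
    Rank 1 is the best; tied algorithms share the mean of their positions.
    """
    n_algos = len(scores[0])
    totals = [0.0] * n_algos
    for row in scores:
        order = sorted(range(n_algos), key=lambda j: -row[j])
        i = 0
        while i < n_algos:
            j = i
            while j + 1 < n_algos and row[order[j + 1]] == row[order[i]]:
                j += 1
            shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
            for idx in order[i:j + 1]:
                totals[idx] += shared
            i = j + 1
    return [t / len(scores) for t in totals]


def friedman_statistic(scores):
    """Friedman chi-square statistic for N datasets and k algorithms."""
    n, k = len(scores), len(scores[0])
    ranks = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)


def paired_t_statistic(a, b):
    """t statistic of the paired differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical accuracies: 4 datasets (rows) x 3 algorithms (columns).
example_scores = [
    [0.90, 0.85, 0.80],
    [0.88, 0.86, 0.79],
    [0.92, 0.90, 0.85],
    [0.75, 0.78, 0.70],
]
# Hypothetical training times in seconds, without and with screening.
time_full = [2.0, 4.0, 6.0, 8.0]
time_screened = [1.0, 2.0, 4.0, 5.0]

print("Friedman statistic:", friedman_statistic(example_scores))
print("Paired t statistic:", paired_t_statistic(time_full, time_screened))
```

In practice the statistics would be compared against the chi-square and Student t distributions to obtain p-values; that step is omitted here to keep the sketch dependency-free.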
Acknowledgements
The authors gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation. This study was funded by the National Natural Science Foundation of China (No. 11671010) and the Beijing National Natural Science Foundation (No. 4172035).
Ethics declarations
Conflict of interest
Author Xinying Pang declares that she has no conflict of interest. Author Yitian Xu declares that he has no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Pang, X., Xu, Y. A safe screening rule for accelerating weighted twin support vector machine. Soft Comput 23, 7725–7739 (2019). https://doi.org/10.1007/s00500-018-3397-1
DOI: https://doi.org/10.1007/s00500-018-3397-1