Abstract
This study presents an efficient approach to large-scale data training. To cope with the rapidly growing training complexity of big-data analysis, a novel mechanism that combines fast kernel ridge regression (Fast KRR) and ridge support vector machines (Ridge SVMs) is proposed. First, Fast KRR based on low-order intrinsic-space computation is developed and used to locate preliminary support vectors. The system then iteratively removes indiscriminant data until a Ridge SVM with a high-order kernel can accommodate the remaining data and generate a hyperplane. To speed up the removal of indiscriminant data, quick intrinsic-matrix rebuilding is devised for the iterations. Experiments on three databases, covering different percentages of data removal, were carried out to evaluate the proposed method. The results show that performance improves by as much as 78- to 152-fold while accuracy is maintained, demonstrating the effectiveness of the proposed idea.
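To make the pipeline above concrete, the following Python sketch illustrates the general screening idea: a cheap low-order kernel ridge regression scores every sample, samples far from the decision boundary are pruned iteratively, and a high-order kernel SVM is trained on the survivors. This is a minimal illustration under stated assumptions, not the authors' implementation: the intrinsic-space computation and quick intrinsic-matrix rebuilding are omitted, scikit-learn's standard SVC stands in for the Ridge SVM, and names such as krr_margin_scores, prune_fraction, and max_size are hypothetical.

# A minimal sketch of the screening idea (assumed details noted above);
# labels y are expected in {-1, +1}.
import numpy as np
from sklearn.svm import SVC

def poly_kernel(A, B, order):
    # Inhomogeneous polynomial kernel (x.z + 1)^order.
    return (A @ B.T + 1.0) ** order

def krr_margin_scores(X, y, order=2, ridge=1.0):
    # Closed-form kernel ridge regression: alpha = (K + ridge*I)^(-1) y,
    # then f(x_i) = (K alpha)_i. Returns y_i * f(x_i); large positive
    # values mean "confidently classified, far from the boundary".
    K = poly_kernel(X, X, order)
    alpha = np.linalg.solve(K + ridge * np.eye(len(X)), y.astype(float))
    return y * (K @ alpha)

def prune_then_train(X, y, max_size=500, prune_fraction=0.2,
                     low_order=2, high_order=4):
    # Iteratively drop the most indiscriminant samples until the set is
    # small enough for a high-order kernel SVM.
    while len(X) > max_size:
        scores = krr_margin_scores(X, y, order=low_order)
        keep = np.argsort(scores)[: int(len(X) * (1.0 - prune_fraction))]
        X, y = X[keep], y[keep]
    svm = SVC(kernel="poly", degree=high_order)  # stand-in for the Ridge SVM
    return svm.fit(X, y)

Sorting by y_i * f(x_i) keeps the samples closest to (or on the wrong side of) the low-order boundary, which is where the high-order SVM's support vectors are most likely to lie.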
Acknowledgments
The authors would like to thank Dr. Pei-Yuan Wu for providing source code. This work was supported in part by the Ministry of Science and Technology, Republic of China, under Grant No. 103-2917-I-564-058. Part of this research was sponsored by the Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, under Open Project No. 2015-4.
Cite this article
Chen, BW., He, X., Ji, W. et al. Support vector analysis of large-scale data based on kernels with iteratively increasing order. J Supercomput 72, 3297–3311 (2016). https://doi.org/10.1007/s11227-015-1404-1