Abstract
Multiple kernel learning (MKL) has recently become a hot topic in kernel methods. However, many MKL algorithms suffer from high computational cost, and standard MKL algorithms are challenged by the rapid development of distributed computing environments such as cloud computing. In this study, a framework for parallel multiple kernel learning (PMKL) using a hybrid alternating direction method of multipliers (H-ADMM) is developed to integrate MKL algorithms with multiprocessor systems. The global problem with multiple kernels is divided into multiple local problems, each of which is optimized on a local processor with a single kernel. The H-ADMM is proposed to coordinate the local processors so that they reach the global optimal solution. Computational experiments show that PMKL achieves high classification accuracy and fast computation.
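The decomposition described above, with one local problem per kernel and a coordination step that ties the local solutions together, can be illustrated with a generic ADMM scheme. The following is a minimal sketch, not the H-ADMM of this paper. It assumes a simplified problem (kernel ridge regression with a fixed, unweighted sum of RBF kernels instead of an SVM with learned kernel weights) and uses the standard "sharing" form of ADMM. All function names and the parameters `lam`, `rho`, and `gamma` are hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gram matrix of the RBF kernel exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def sharing_admm(kernels, y, lam=1.0, rho=1.0, iters=2000):
    """Sharing-form ADMM for the assumed toy problem

        min  0.5 * ||y - sum_m K_m a_m||^2 + 0.5 * lam * sum_m a_m^T K_m a_m

    Each "worker" m only ever touches its own kernel matrix K_m.
    Returns the combined prediction sum_m K_m a_m on the training points.
    """
    M, n = len(kernels), y.shape[0]
    eye = np.eye(n)
    # Worker m minimizes 0.5*lam*a^T K_m a + 0.5*rho*||K_m a - v||^2, which
    # gives the local prediction K_m a = rho * (lam*I + rho*K_m)^{-1} K_m v.
    local_maps = [rho * np.linalg.solve(lam * eye + rho * K, K) for K in kernels]

    preds = np.zeros((M, n))  # local predictions K_m a_m, one row per worker
    zbar = np.zeros(n)        # coordinator's copy of the average prediction
    u = np.zeros(n)           # scaled dual variable

    for _ in range(iters):
        pbar = preds.mean(axis=0)
        # Local step: independent across workers, so it could run in parallel.
        preds = np.stack([
            P @ (preds[m] - pbar + zbar - u) for m, P in enumerate(local_maps)
        ])
        # Coordination step: closed form for the squared loss.
        pbar = preds.mean(axis=0)
        zbar = (y + rho * (u + pbar)) / (M + rho)
        u = u + pbar - zbar
    return preds.sum(axis=0)


def direct_solution(kernels, y, lam=1.0):
    """Centralized reference: ridge regression with the summed kernel."""
    K_sum = sum(kernels)
    return K_sum @ np.linalg.solve(K_sum + lam * np.eye(y.shape[0]), y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 2))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=30)
    kernels = [rbf_kernel(X, g) for g in (0.1, 1.0, 10.0)]
    gap = np.max(np.abs(sharing_admm(kernels, y) - direct_solution(kernels, y)))
    print("max deviation from centralized solution:", gap)
```

In this sketch the only quantities exchanged between workers and the coordinator are length-n prediction vectors, and each kernel matrix stays with its own worker. How the paper's hybrid scheme organizes this communication, and how it handles the SVM loss and the kernel weights, is not specified in the abstract and is not reproduced here.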
Acknowledgments
This work was partially supported by the National Natural Science Foundation of China (Project Nos. 71101023, 71021061, 71271051) and the Fundamental Research Funds for the Central Universities, NEU, China (Project Nos. N120406001, N110706001). The author would like to express his sincere thanks to the referees for their constructive comments and suggestions.
Cite this article
Chen, ZY., Fan, ZP. Parallel multiple kernel learning: a hybrid alternating direction method of multipliers. Knowl Inf Syst 40, 673–696 (2014). https://doi.org/10.1007/s10115-013-0655-5
DOI: https://doi.org/10.1007/s10115-013-0655-5