The aLS-SVM based multi-task learning classifiers

Applied Intelligence 48, 2393–2407 (2018)

Abstract

Multi-task learning support vector machines (SVMs) have recently attracted considerable attention, since conventional single-task learning methods usually ignore the relatedness among multiple related tasks and train them separately. Unlike single-task learning, multi-task learning methods can capture the correlation among tasks and achieve improved performance by training all tasks simultaneously. In this paper, we make two assumptions on the relatedness among tasks: one is that the normal vectors of the related tasks share a certain common parameter value; the other is that the models of the related tasks are close to each other and share a common model. Under these assumptions, we propose two multi-task learning methods for binary classification, named MTL-aLS-SVM I and MTL-aLS-SVM II, which take full advantage of multi-task learning and the asymmetric least squares loss. MTL-aLS-SVM I seeks a trade-off between the maximal expectile distance for each task model and the closeness of each task model to the averaged model. MTL-aLS-SVM II extends MTL-aLS-SVM I and can use different kernel functions for different tasks. Both methods can be easily implemented by solving quadratic programming problems. In addition, we develop their special cases, including the L2-SVM based multi-task learning methods (MTL-L2-SVM I and MTL-L2-SVM II) and the least squares SVM (LS-SVM) based multi-task learning methods (MTL-LS-SVM I and MTL-LS-SVM II). Although MTL-L2-SVM II and MTL-LS-SVM II appear in the form of special cases, they are first proposed in this paper. The experimental results show that the proposed methods are very encouraging.
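To make the loss concrete, the sketch below illustrates the asymmetric least squares (expectile) loss in its standard form, which penalizes positive and negative residuals with different quadratic weights. This is a minimal illustration only: the function name and the asymmetry parameter `p` are our own, and the paper's exact parameterization and scaling may differ.

```python
import numpy as np

def asymmetric_ls_loss(r, p=0.9):
    """Expectile (asymmetric least squares) loss in its common form:
    positive residuals are weighted by p, negative ones by 1 - p.
    Illustrative only; the paper's exact formulation may differ."""
    return np.where(r >= 0, p * r**2, (1 - p) * r**2)

# For a classifier f, the residual is typically r = 1 - y * f(x).
r = np.array([-0.5, 0.0, 0.5, 1.0])
print(asymmetric_ls_loss(r))         # asymmetric penalties: [0.025 0. 0.225 0.9]
print(asymmetric_ls_loss(r, p=0.5))  # p = 0.5 recovers (half) the squared loss
```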

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 11171346) and the Chinese Universities Scientific Fund (No. 2017LX003).

Author information

Corresponding author

Correspondence to Ping Zhong.

Appendix: The proof of (12)

Substituting (11) into the objective function of (4), we have

$$\begin{aligned}
&\frac{1}{2}\|\boldsymbol\omega_{0}\|^{2} + \frac{C_{1}}{2}\sum_{k=1}^{N}\|\boldsymbol\upsilon_{k}\|^{2} + \frac{C_{2}}{2}\sum_{k=1}^{N}\|\boldsymbol\zeta_{k}\|^{2} \\
&= \frac{1}{2}\|\boldsymbol\omega_{0}\|^{2} + \frac{C_{1}}{2}\sum_{k=1}^{N}\|\boldsymbol\omega_{k}-\boldsymbol\omega_{0}\|^{2} + \frac{C_{2}}{2}\sum_{k=1}^{N}\|\boldsymbol\zeta_{k}\|^{2} \\
&= \frac{1}{2}\left\|\frac{C_{1}N}{1+C_{1}N}\cdot\frac{1}{N}\sum_{k=1}^{N}\boldsymbol\omega_{k}\right\|^{2} + \frac{C_{1}}{2}\sum_{k=1}^{N}\left\|\boldsymbol\omega_{k}-\frac{C_{1}N}{1+C_{1}N}\cdot\frac{1}{N}\sum_{t=1}^{N}\boldsymbol\omega_{t}\right\|^{2} + \frac{C_{2}}{2}\sum_{k=1}^{N}\|\boldsymbol\zeta_{k}\|^{2} \\
&= \frac{\tau_{1}^{2}N^{2}}{2}\|\bar{\boldsymbol\omega}\|^{2} + \frac{C_{1}}{2}\sum_{k=1}^{N}\|\boldsymbol\omega_{k}-\tau_{1}N\bar{\boldsymbol\omega}\|^{2} + \frac{C_{2}}{2}\sum_{k=1}^{N}\|\boldsymbol\zeta_{k}\|^{2}
\end{aligned}$$

where \(\bar{\boldsymbol\omega}=\frac{1}{N}\sum_{t=1}^{N}\boldsymbol\omega_{t}\), \(\tau_{1}=\frac{C_{1}}{1+C_{1}N}\), and \(\tau_{2}=\frac{C_{1}^{2}N}{1+C_{1}N}\). Noting that \(\tau_{1}+\tau_{2}=C_{1}\), \(\tau_{2}=\tau_{1}C_{1}N\), and \(\tau_{2}=(1+C_{1}N)\tau_{1}^{2}N\), the expression above can be rewritten as follows.

$$\begin{aligned}
&\frac{\tau_{1}^{2}N^{2}}{2}\|\bar{\boldsymbol\omega}\|^{2} + \frac{C_{1}}{2}\sum_{k=1}^{N}\|\boldsymbol\omega_{k}-\tau_{1}N\bar{\boldsymbol\omega}\|^{2} + \frac{C_{2}}{2}\sum_{k=1}^{N}\|\boldsymbol\zeta_{k}\|^{2} \\
&= \frac{C_{1}}{2}\sum_{k=1}^{N}\|\boldsymbol\omega_{k}\|^{2} - \tau_{1}C_{1}N\sum_{k=1}^{N}\boldsymbol\omega_{k}^{T}\bar{\boldsymbol\omega} + \frac{(1+C_{1}N)\tau_{1}^{2}N^{2}}{2}\|\bar{\boldsymbol\omega}\|^{2} + \frac{C_{2}}{2}\sum_{k=1}^{N}\|\boldsymbol\zeta_{k}\|^{2} \\
&= \frac{1}{2}\left((\tau_{1}+\tau_{2})\sum_{k=1}^{N}\|\boldsymbol\omega_{k}\|^{2} - 2\tau_{2}\sum_{k=1}^{N}\boldsymbol\omega_{k}^{T}\bar{\boldsymbol\omega} + \tau_{2}N\|\bar{\boldsymbol\omega}\|^{2}\right) + \frac{C_{2}}{2}\sum_{k=1}^{N}\|\boldsymbol\zeta_{k}\|^{2} \\
&= \frac{\tau_{1}}{2}\sum_{k=1}^{N}\|\boldsymbol\omega_{k}\|^{2} + \frac{\tau_{2}}{2}\sum_{k=1}^{N}\|\boldsymbol\omega_{k}-\bar{\boldsymbol\omega}\|^{2} + \frac{C_{2}}{2}\sum_{k=1}^{N}\|\boldsymbol\zeta_{k}\|^{2}
\end{aligned}$$

This completes the proof of (12).
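As a sanity check, the short script below numerically verifies the resulting identity for random task weight vectors. It is a sketch under stated assumptions: the array `W`, the constant `C1`, and the substitution `w0 = tau1 * N * w_bar` (our reading of (11), as used in the first step above) are illustrative, and the \(C_{2}\) slack term is omitted since it appears identically on both sides.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, C1 = 5, 3, 0.7                  # arbitrary test values
W = rng.normal(size=(N, d))           # row k is the task weight vector omega_k
w_bar = W.mean(axis=0)                # averaged model
tau1 = C1 / (1 + C1 * N)
tau2 = C1**2 * N / (1 + C1 * N)

w0 = tau1 * N * w_bar                 # omega_0 after substituting (11)
lhs = 0.5 * w0 @ w0 + 0.5 * C1 * ((W - w0) ** 2).sum()
rhs = 0.5 * tau1 * (W ** 2).sum() + 0.5 * tau2 * ((W - w_bar) ** 2).sum()
print(np.isclose(lhs, rhs))           # True: both regularizer forms agree, as in (12)
```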

Cite this article

Lu, L., Lin, Q., Pei, H. et al. The aLS-SVM based multi-task learning classifiers. Appl Intell 48, 2393–2407 (2018). https://doi.org/10.1007/s10489-017-1087-9
