
Multi-task twin spheres support vector machine with maximum margin for imbalanced data classification

Published in Applied Intelligence

Abstract

Multi-task learning (MTL) has recently developed into a highly effective method. Unlike single-task learning (STL), MTL can improve overall classification performance by jointly training multiple related tasks. However, most existing MTL methods do not work well for imbalanced data classification, which is commonly encountered in real-world applications. The maximum margin of twin spheres support vector machine (MMTSVM) has proved to be an effective method for handling imbalanced data classification. Inspired by the above studies, this paper proposes a multi-task twin spheres support vector machine with maximum margin (MTMMTSVM) for imbalanced data classification. MTMMTSVM constructs two homocentric hyper-spheres for each task, while exploring both the commonality shared across tasks and the individuality of each task. Moreover, it introduces the maximum margin principle to separate the majority samples from the minority samples, so that training involves only a linear programming problem (LPP) and a smaller quadratic programming problem (QPP). Compared with the latest multi-task algorithms, MTMMTSVM achieves superior g-mean and comparable accuracy on imbalanced datasets, without requiring much additional training time. Experiments on five benchmark datasets, ten image datasets and one real Chinese wine dataset demonstrate the effectiveness of MTMMTSVM. Finally, we employ a fast decomposition algorithm (DM) to handle large-scale imbalanced problems more efficiently.




Notes

  1. http://archive.ics.uci.edu/ml/datasets.php

  2. http://people.ee.duke.edu/lcarin/LandmineData.zip

  3. http://www.vision.caltech.edu/Image_Datasets/Caltech256/256_ObjectCategories.tar

  4. The code is available at https://github.com/lorenmt/mtan


Acknowledgements

The authors gratefully acknowledge the helpful comments of the reviewers, which have improved the presentation. This work was supported by the National Natural Science Foundation of China (No. 12071475, 11671010) and Beijing Natural Science Foundation, China (No. 4172035).

Author information


Corresponding authors

Correspondence to Yitian Xu or Xuhua Liu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: The proof of (8)

Proof: To solve (6), we introduce the Lagrangian function:

$$ \begin{array}{@{}rcl@{}} L_{1}({R_{t}^{2}},C_{0},f_{t},\xi_{it},\alpha_{it},\beta_{it})&=&{\sum}_{t=1}^{T}{R_{t}^{2}} -{\sum}_{t=1}^{T}{\sum}_{j=1}^{l_{t}^{-}}\|\varphi(x_{jt})-C_{0}\|^{2}\\ &&-{\sum}_{t=1}^{T}\frac{v}{l_{t}^{-}}{\sum}_{j=1}^{l_{t}^{-}}\|\varphi(x_{jt}) -f_{t}\|^{2}\\ &&+{\sum}_{t=1}^{T}\frac{1}{v_{1t}l_{t}^{+}}{\sum}_{i=1}^{l_{t}^{+}}\xi_{it} +{\sum}_{t=1}^{T}{\sum}_{i=1}^{l_{t}^{+}}\alpha_{it}(\|\varphi(x_{it})\\ &&-C_{0}-f_{t}\|^{2}-{R_{t}^{2}}-\xi_{it})\\ &&-{\sum}_{t=1}^{T}{\sum}_{i=1}^{l_{t}^{+}}\beta_{it}\xi_{it}, \end{array} $$
(A.1)

where αit ≥ 0 and βit ≥ 0 are Lagrangian multipliers. Differentiating the Lagrangian function L1 with respect to \({R_{t}^{2}}, C_{0}, f_{t}\) and ξit, we obtain the Karush-Kuhn-Tucker (KKT) conditions,

$$ \begin{array}{@{}rcl@{}} \frac{\partial L_{1}}{\partial {R_{t}^{2}}}=1-{\sum}_{i=1}^{l_{t}^{+}}\alpha_{it}=0, \end{array} $$
(A.2)
$$ \begin{array}{@{}rcl@{}} \frac{\partial L_{1}}{\partial C_{0}}={\sum}_{t=1}^{T}{\sum}_{j=1}^{l_{t}^{-}}(\varphi(x_{jt})-C_{0})-{\sum}_{t=1}^{T}{\sum}_{i=1}^{l_{t}^{+}}\alpha_{it}(\varphi(x_{it}) -C_{0}-f_{t})=0, \end{array} $$
(A.3)
$$ \begin{array}{@{}rcl@{}} \frac{\partial L_{1}}{\partial f_{t}}=\frac{v}{l_{t}^{-}}{\sum}_{j=1}^{l_{t}^{-}}(\varphi(x_{jt})-f_{t})-{\sum}_{i=1}^{l_{t}^{+}}\alpha_{it}(\varphi(x_{it})-C_{0}-f_{t})=0, \end{array} $$
(A.4)
$$ \begin{array}{@{}rcl@{}} \frac{\partial L_{1}}{\partial \xi_{it}}=\frac{1}{v_{1t}l_{t}^{+}}-\alpha_{it}-\beta_{it}=0, \end{array} $$
(A.5)
$$ \begin{array}{@{}rcl@{}} \alpha_{it}(\|\varphi(x_{it})-C_{0}-f_{t}\|^{2}-{R_{t}^{2}}-\xi_{it})=0, \end{array} $$
(A.6)
$$ \begin{array}{@{}rcl@{}} \beta_{it}\xi_{it}=0. \end{array} $$
(A.7)

From (A.2), we can get

$$ \begin{array}{@{}rcl@{}} {\sum}_{i=1}^{l_{t}^{+}}\alpha_{it}=1. \end{array} $$
(A.8)

From (A.3) and (A.4), we can derive the common center C0 and the task-specific bias ft of the t-th pair of homocentric spheres as follows:

$$ \begin{array}{@{}rcl@{}} C_{0}=\frac{1}{m}{\sum}_{t=1}^{T}{\sum}_{j=1}^{l_{t}^{-}}\frac{l_{t}^{-}-l_{t}^{-}v-v}{l_{t}^{-}(1-v)}\varphi(x_{jt}) +\frac{1}{m}{\sum}_{t=1}^{T}{\sum}_{i=1}^{l_{t}^{+}}\frac{v}{1-v}\alpha_{it}\varphi(x_{it}), \end{array} $$
(A.9)
$$ \begin{array}{@{}rcl@{}} f_{t}&=&-\frac{v}{l_{t}^{-}(1-v)}{\sum}_{j=1}^{l_{t}^{-}}\varphi(x_{jt})+\frac{1}{1-v}{\sum}_{i=1}^{l_{t}^{+}} \alpha_{it}\varphi(x_{it})\\ &&-\frac{1}{m(1-v)} {\sum}_{t=1}^{T}{\sum}_{j=1}^{l_{t}^{-}}\frac{l_{t}^{-}-l_{t}^{-}v-v}{l_{t}^{-}(1-v)}\varphi(x_{jt})\\ &&-\frac{v}{m(1-v)^{2}}{\sum}_{t=1}^{T}{\sum}_{i=1}^{l_{t}^{+}}\alpha_{it}\varphi(x_{it}), \end{array} $$
(A.10)

where \(m=\frac {T}{1-v}+l_{1}^{-}+l_{2}^{-}+...+l_{T}^{-}-T\). Substituting (A.5), (A.8), (A.9) and (A.10) into (A.1), we can get the dual formulation of (6).
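The closed forms (A.9) and (A.10) can be sanity-checked numerically. The sketch below assumes a linear kernel (φ(x) = x) and arbitrary multipliers αit satisfying (A.8); the task sizes, data, and the value of v are illustrative only, not taken from the paper's experiments. It verifies that the resulting C0 and ft satisfy the stationarity conditions (A.3) and (A.4).

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, v = 3, 4, 0.3                                   # illustrative values
xj = [rng.normal(size=(rng.integers(5, 9), d)) for _ in range(T)]  # x_{jt}, l_t^- per task
xi = [rng.normal(size=(rng.integers(4, 7), d)) for _ in range(T)]  # x_{it}, l_t^+ per task
alpha = [np.abs(rng.normal(size=len(X))) for X in xi]
alpha = [al / al.sum() for al in alpha]               # enforce (A.8): sum_i alpha_it = 1

n = np.array([len(X) for X in xj])                    # l_t^-
m = T / (1 - v) + n.sum() - T                         # m as defined after (A.10)
S = np.array([X.sum(axis=0) for X in xj])             # sum_j phi(x_jt), phi = identity
a = np.array([al @ X for al, X in zip(alpha, xi)])    # sum_i alpha_it phi(x_it)

# (A.9): common centre C_0 shared by all tasks
coef = (n - n * v - v) / (n * (1 - v))
C0 = ((coef[:, None] * S).sum(axis=0) + (v / (1 - v)) * a.sum(axis=0)) / m

# f_t solved from (A.4) given C_0 (algebraically equivalent to (A.10))
f = (a - C0 - (v / n[:, None]) * S) / (1 - v)

# Residuals of the stationarity conditions (A.3) and (A.4): both should vanish.
r3 = (S - n[:, None] * C0).sum(axis=0) - (a - C0 - f).sum(axis=0)
r4 = (v / n[:, None]) * S - v * f - (a - C0 - f)
```

Since ft is recovered directly from (A.4) given C0, the residual r4 vanishes by construction; r3 vanishing is the actual check of the closed form (A.9).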

Appendix B: The proof of (9)

Proof: To solve (7), the Lagrangian function is similarly introduced as follows:

$$ \begin{array}{@{}rcl@{}} L_{2}({\rho_{t}^{2}},\eta_{jt},\gamma_{jt},\lambda_{jt})&=&{\sum}_{t=1}^{T}({R_{t}^{2}}-{\rho_{t}^{2}})+{\sum}_{t=1}^{T}\frac{1}{v_{2t}l_{t}^{-}}{\sum}_{j=1}^{l_{t}^{-}}\eta_{jt}\\ &&-{\sum}_{t=1}^{T}{\sum}_{j=1}^{l_{t}^{-}}\lambda_{jt}\eta_{jt}-{\sum}_{t=1}^{T}{\sum}_{j=1}^{l_{t}^{-}}\gamma_{jt}(\|\varphi(x_{jt})\\ &&-C_{0}-f_{t}\|^{2}-{R_{t}^{2}}-{\rho_{t}^{2}}+\eta_{jt}), \end{array} $$
(A.11)

where γjt ≥ 0 and λjt ≥ 0 are Lagrangian multipliers. Differentiating the Lagrangian function L2 with respect to \({\rho _{t}^{2}}\) and ηjt yields the KKT conditions,

$$ \begin{array}{@{}rcl@{}} \frac{\partial L_{2}}{\partial {\rho_{t}^{2}}}=-1+{\sum}_{j=1}^{l_{t}^{-}}\gamma_{jt}=0, \end{array} $$
(A.12)
$$ \begin{array}{@{}rcl@{}} \frac{\partial L_{2}}{\partial \eta_{jt}}=\frac{1}{v_{2t}l_{t}^{-}}-\gamma_{jt}-\lambda_{jt}=0, \end{array} $$
(A.13)
$$ \begin{array}{@{}rcl@{}} \gamma_{jt}(\|\varphi(x_{jt})-C_{0}-f_{t}\|^{2}-{R_{t}^{2}}-{\rho_{t}^{2}}+\eta_{jt})=0, \end{array} $$
(A.14)
$$ \begin{array}{@{}rcl@{}} \lambda_{jt}\eta_{jt}=0. \end{array} $$
(A.15)

From (A.12), we can get

$$ \begin{array}{@{}rcl@{}} {\sum}_{j=1}^{l_{t}^{-}}\gamma_{jt}=1. \end{array} $$
(A.16)

Substituting (A.13) and (A.16) into (A.11), we can derive the dual formulation of (7).
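A remark that follows directly from the conditions above: since λjt ≥ 0 and γjt ≥ 0, (A.13) implies the box constraints

$$ \begin{array}{@{}rcl@{}} 0\leq\gamma_{jt}\leq\frac{1}{v_{2t}l_{t}^{-}}, \end{array} $$

which, together with the normalization (A.16), determine the feasible region of the resulting dual problem.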


About this article


Cite this article

Wang, T., Xu, Y. & Liu, X. Multi-task twin spheres support vector machine with maximum margin for imbalanced data classification. Appl Intell 53, 3318–3335 (2023). https://doi.org/10.1007/s10489-022-03707-w

