
v-soft margin multi-task learning logistic regression

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

Coordinate descent (CD) is an effective method for large-scale classification problems, offering simple per-iteration operations and fast convergence. In this paper, inspired by the v-soft margin support vector machine and the multi-task learning support vector machine for classification, a novel v-soft margin multi-task learning logistic regression (v-SMMTL-LR) for pattern classification is proposed to improve the generalization performance of logistic regression (LR). The dual of v-SMMTL-LR can be viewed as a dual coordinate descent (CDdual) problem with an equality constraint, from which a large-scale classification method named v-SMMTL-LR-CDdual is developed. v-SMMTL-LR-CDdual maximizes the between-class margin and effectively improves the generalization performance of LR in large-scale multi-task learning scenarios. Experimental results show that v-SMMTL-LR-CDdual is effective on large-scale or comparatively high-dimensional multi-task datasets and is competitive with related single-task and multi-task learning algorithms.
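To make the coordinate-descent idea concrete, the following is a minimal sketch of cyclic coordinate descent for plain L2-regularized, single-task logistic regression, taking one Newton step per coordinate. It illustrates the "simple operations" the abstract refers to; the function name and parameters are illustrative assumptions, and this is not the paper's v-SMMTL-LR-CDdual algorithm, which operates on the dual problem with an equality constraint in a multi-task setting.

```python
import numpy as np

def cd_logreg(X, y, lam=1.0, n_epochs=50):
    """Cyclic coordinate descent for L2-regularized logistic regression.

    Minimizes  f(w) = sum_i log(1 + exp(-y_i x_i^T w)) + (lam/2) ||w||^2
    by taking a single Newton step per coordinate. Labels y must be in {-1, +1}.
    """
    n, d = X.shape
    w = np.zeros(d)
    z = X @ w  # cached margins x_i^T w, updated incrementally
    for _ in range(n_epochs):
        for j in range(d):
            sig = 1.0 / (1.0 + np.exp(y * z))            # sigma(-y_i z_i)
            g = -np.sum(y * X[:, j] * sig) + lam * w[j]  # df/dw_j
            h = np.sum(X[:, j] ** 2 * sig * (1.0 - sig)) + lam  # d2f/dw_j^2
            step = g / h
            w[j] -= step
            z -= step * X[:, j]  # keep cached margins consistent with w
    return w
```

Caching the margins `z` and updating them in place after each coordinate step is what keeps the per-coordinate cost at O(n) rather than recomputing `X @ w` from scratch, which is the main reason CD scales to large problems.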




Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants (61272210, 61572236), the Fundamental Research Funds for the Central Universities (JUDCF13030, JUSRP51614A), 2013 Postgraduate Student’s Creative Research Fund of Jiangsu Province under Grant CXZZ13_0760, and the Natural Science Foundation of Guizhou Province under Grant [2013]2136.

Author information

Correspondence to Chengquan Huang.

About this article


Cite this article

Huang, C., Wang, S., Pan, X. et al. v-soft margin multi-task learning logistic regression. Int. J. Mach. Learn. & Cyber. 10, 369–383 (2019). https://doi.org/10.1007/s13042-017-0721-5
