Abstract
Multi-task learning is an effective way to learn cross-task knowledge. However, existing methods cannot treat every task fairly: their shared components keep fitting new tasks and thereby degrade the performance of previously learned tasks. In this paper, we propose the Fitting-Sharing Multi-Task Learning method to address this problem. In the fitting step, a group of indicator parameters is trained to extract task-specific features and store them in an in-task template matrix. After all models converge, the indicators and templates are frozen to protect the learned knowledge. In the sharing step, a group of connector parameters is trained to acquire information from the other tasks' templates and to reason over cross-task knowledge. Because learning and sharing are separated, each model can acquire the knowledge learned by other tasks without affecting them, which naturally avoids the imbalanced cross-task knowledge problem. Experimental results on public datasets show that the proposed method consistently outperforms existing methods.
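To make the two-step procedure concrete, below is a minimal PyTorch sketch of one possible realization. The class, the parameter names (`indicator`, `template`, `connector`), the tensor shapes, and the attention-style read over the other tasks' frozen templates are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class FitSharingTask(nn.Module):
    """Hypothetical per-task module for the fitting/sharing scheme.

    Names and wiring follow the abstract; everything else is assumed.
    """

    def __init__(self, feat_dim: int, template_rows: int):
        super().__init__()
        # Fitting step: indicator parameters extract task-specific features.
        self.indicator = nn.Linear(feat_dim, feat_dim)
        # In-task template matrix that stores the extracted features.
        self.template = nn.Parameter(torch.zeros(template_rows, feat_dim))
        # Sharing step: connector parameters query other tasks' templates.
        self.connector = nn.Linear(feat_dim, feat_dim)

    def fit(self, h: torch.Tensor) -> torch.Tensor:
        # Trained first, in isolation; returns task-specific features.
        return self.indicator(h)

    def freeze_fitted(self) -> None:
        # After convergence, freeze the indicator and template so later
        # training cannot overwrite the learned in-task knowledge.
        for p in self.indicator.parameters():
            p.requires_grad_(False)
        self.template.requires_grad_(False)

    def share(self, h: torch.Tensor, other_templates: list) -> torch.Tensor:
        # Attend over the frozen templates of the *other* tasks; only the
        # connector receives gradients, so those tasks are unaffected.
        keys = torch.cat(other_templates, dim=0)            # (rows, d)
        q = self.connector(h)                               # (batch, d)
        attn = torch.softmax(q @ keys.t() / keys.size(1) ** 0.5, dim=-1)
        return h + attn @ keys                              # fuse cross-task knowledge
```

The point mirrored here is the gradient flow: after `freeze_fitted()` only the connector remains trainable, so reading another task's template during the sharing step cannot alter that task's learned parameters.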