Abstract
Multi-task learning is a paradigm that shares features across multiple tasks and has achieved great success in fields such as computer vision and natural language processing. However, tasks often conflict with and even compete against each other, which severely degrades the performance of multi-task models. Most existing optimization methods alleviate gradient conflicts among tasks by adjusting task weights, yet the magnitude and direction of task gradients also matter during training: they reflect the conflict and dominance among tasks and can disrupt training and cause instability. In this paper, we present a rescaling and balancing approach for tackling conflicting and dominating gradients. The approach employs a conflict-projection strategy to mitigate the influence of conflicting gradients from multiple tasks and uses rescaling and balancing techniques to mitigate gradient dominance during training. The proposed method thus jointly accounts for the weighting, magnitude, and direction of task gradients. We conduct a series of ablation and comparative experiments on different multi-task networks to validate the effectiveness of the proposed algorithm.
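To make the general idea concrete, the following is a minimal sketch (not the authors' RI-PCGrad implementation) of how a PCGrad-style projection of conflicting gradients can be combined with magnitude rescaling to curb dominating tasks. It assumes each task gradient has already been flattened into a 1-D PyTorch tensor over the shared parameters; the function names and the choice of rescaling every task gradient to the mean norm are illustrative assumptions.

```python
import torch

def project_conflicting(grads):
    """PCGrad-style projection: if task gradient g_i conflicts with g_j
    (negative dot product), remove the component of g_i along g_j."""
    projected = [g.clone() for g in grads]
    for i, g_i in enumerate(projected):
        for j, g_j in enumerate(grads):
            if i == j:
                continue
            dot = torch.dot(g_i, g_j)
            if dot < 0:  # conflicting directions
                g_i -= dot / (g_j.norm() ** 2 + 1e-12) * g_j
    return projected

def rescale_and_combine(grads):
    """Rescale each projected task gradient to the mean norm so that no
    single task dominates, then average them into one shared update."""
    norms = torch.stack([g.norm() for g in grads])
    target = norms.mean()
    rescaled = [g * (target / (n + 1e-12)) for g, n in zip(grads, norms)]
    return torch.stack(rescaled).mean(dim=0)

# Toy usage with two conflicting task gradients over shared parameters.
g1 = torch.tensor([1.0, 2.0, -1.0])
g2 = torch.tensor([-3.0, 0.5, 0.2])
update = rescale_and_combine(project_conflicting([g1, g2]))
print(update)
```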







Data availability
Data will be made available on request.
Acknowledgements
The authors thank the two anonymous referees for their many valuable and helpful suggestions. This research was supported by the Huzhou Science and Technology Project (Grant No. 2023GZ68) and the Natural Science Foundation of Shandong Province (Grant No. ZR2023MF053).
Author information
Contributions
Fanyun Meng: conceptualization and investigation, methodology, writing-review and editing. Zehao Xiao: software, formal analysis, validation, writing-review and editing. Yuanyuan Zhang and Jinlong Wang: resources, project administration and funding acquisition.
Ethics declarations
Conflicts of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Meng, F., Xiao, Z., Zhang, Y. et al. RI-PCGrad: Optimizing multi-task learning with rescaling and impartial projecting conflict gradients. Appl Intell 54, 12009–12019 (2024). https://doi.org/10.1007/s10489-024-05805-3