Abstract
Numerous computer vision applications, such as image classification, have benefited from multi-task learning. However, the relative weighting between each task's loss is difficult to tune by hand, which makes multi-task learning prohibitively expensive in real applications. In this paper, we present a novel and principled adaptive multi-task learning method that weights multiple loss functions using a Lagrange multiplier strategy. Our method starts from the standard multi-task learning model. Based on the Gaussian likelihood and Lagrange multipliers, we then design an adaptive multi-task learning model that learns a suitable weighting for each task and boosts performance. To validate the feasibility of the proposed method, we conduct automatic art analysis experiments, including art classification and cross-modal art retrieval. Experimental results demonstrate that our method outperforms several state-of-the-art techniques, improving performance by up to 4.2% in art classification and 8.7% in cross-modal art retrieval compared with the latest automatic loss-weight learning method.
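To illustrate how a Lagrange multiplier can produce adaptive task weights, the sketch below derives closed-form weights for a simplified multi-task objective: a weighted sum of task losses plus an entropy regularizer, with the weights constrained to sum to one via a Lagrange multiplier. This is a minimal illustrative formulation, not the paper's exact Gaussian-likelihood model; the function names and the temperature parameter `tau` are assumptions introduced here for exposition.

```python
import math

def adaptive_weights(losses, tau=1.0):
    """Closed-form task weights from a Lagrange-multiplier argument.

    Minimizing  sum_i w_i * L_i + tau * sum_i w_i * log(w_i)
    subject to  sum_i w_i = 1  (enforced with a Lagrange multiplier)
    gives softmax weights  w_i ∝ exp(-L_i / tau):
    tasks with smaller current loss receive larger weight.
    """
    m = max(-l / tau for l in losses)          # shift for numerical stability
    exps = [math.exp(-l / tau - m) for l in losses]
    z = sum(exps)                              # normalizer from the constraint
    return [e / z for e in exps]

def weighted_loss(losses, tau=1.0):
    """Combine per-task losses using the adaptive weights above."""
    weights = adaptive_weights(losses, tau)
    return sum(w * l for w, l in zip(weights, losses))
```

The temperature `tau` controls how sharply the weighting favors the currently easier tasks; as `tau` grows, the weights approach the uniform weighting of standard multi-task learning.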
Acknowledgements
Bing Yang and Xueqin Xiang prepared the manuscript; Wanzeng Kong provided new ideas about automatic art analysis; Yong Peng designed and conducted the experiments; and Jinliang Yao focused on the algorithm implementation. All authors read and approved the manuscript.
This work was supported by the National Natural Science Foundation of China (U1909202), Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province (2020E10010) and Fundamental Research Funds for the Provincial Universities of Zhejiang, China (GK209907299001-008).
Ethics declarations
Competing interests statement
The authors declare that they have no competing financial interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Yang, B., Xiang, X., Kong, W. et al. Adaptive multi-task learning using lagrange multiplier for automatic art analysis. Multimed Tools Appl 81, 3715–3733 (2022). https://doi.org/10.1007/s11042-021-11360-7