Abstract
Numerous computer vision applications, such as image classification, have benefited from multi-task learning. However, the relative weighting between each task's loss is difficult to tune by hand, which makes multi-task learning prohibitively expensive in real applications. In this paper, we present a novel and principled adaptive multi-task learning method that weights multiple loss functions using a Lagrange multiplier strategy. Our method starts from the standard multi-task learning model. Based on the Gaussian likelihood and Lagrange multipliers, we then design an adaptive multi-task learning model that learns a suitable weighting for each task and boosts performance. To validate the feasibility of the proposed method, we conduct automatic art analysis experiments, including art classification and cross-modal art retrieval. Experimental results demonstrate that our method outperforms several state-of-the-art techniques, improving performance by up to 4.2% in art classification and 8.7% in cross-modal art retrieval compared with the latest automatic loss-weight learning method.
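To illustrate how a Lagrange multiplier can produce adaptive task weights, the sketch below derives closed-form weights for a simplified multi-task objective: a weighted sum of task losses plus an entropy regularizer, with the weights constrained to sum to one via a Lagrange multiplier. This is a minimal illustrative formulation, not the paper's exact Gaussian-likelihood model; the function names and the temperature parameter `tau` are assumptions introduced here for exposition.

```python
import math

def adaptive_weights(losses, tau=1.0):
    """Closed-form task weights from a Lagrange-multiplier argument.

    Minimizing  sum_i w_i * L_i + tau * sum_i w_i * log(w_i)
    subject to  sum_i w_i = 1  (enforced with a Lagrange multiplier)
    gives softmax weights  w_i ∝ exp(-L_i / tau):
    tasks with smaller current loss receive larger weight.
    """
    m = max(-l / tau for l in losses)          # shift for numerical stability
    exps = [math.exp(-l / tau - m) for l in losses]
    z = sum(exps)                              # normalizer from the constraint
    return [e / z for e in exps]

def weighted_loss(losses, tau=1.0):
    """Combine per-task losses using the adaptive weights above."""
    weights = adaptive_weights(losses, tau)
    return sum(w * l for w, l in zip(weights, losses))
```

The temperature `tau` controls how sharply the weighting favors the currently easier tasks; as `tau` grows, the weights approach the uniform weighting of standard multi-task learning.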
Acknowledgements
Bing Yang and Xueqin Xiang prepared the manuscript; Wanzeng Kong provided new ideas about automatic art analysis; Yong Peng designed and conducted the experiments; and Jinliang Yao focused on the algorithm implementation. All authors read and approved the manuscript.
This work was supported by the National Natural Science Foundation of China (U1909202), Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province (2020E10010) and Fundamental Research Funds for the Provincial Universities of Zhejiang, China (GK209907299001-008).
Ethics declarations
Competing interests statement
The authors declare that they have no competing financial interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Yang, B., Xiang, X., Kong, W. et al. Adaptive multi-task learning using lagrange multiplier for automatic art analysis. Multimed Tools Appl 81, 3715–3733 (2022). https://doi.org/10.1007/s11042-021-11360-7