
Domain classifier-based transfer learning for visual attention prediction

Published in: World Wide Web

Abstract

Benefiting from machine learning techniques based on deep neural networks, data-driven saliency prediction has achieved significant success over the past few decades. However, existing data-hungry saliency models require large-scale datasets for training. Although some studies based on the transfer learning strategy have managed to acquire sufficient information from the limited samples of the target domain, obtaining saliency maps when transferring from one image category to another remains a challenge. To solve this problem, we propose a domain classifier-based adaptation method for saliency prediction. The method provides sufficient information by classifying the domain from which each data sample originated. Specifically, only a few target-domain samples are used in our few-shot transfer learning paradigm, and the prediction results are compared with those obtained through state-of-the-art methods (such as fine-tuning-based transfer strategies). To the best of our knowledge, the proposed transfer framework is the first to conduct saliency prediction while taking the domain adaptation of different image categories into consideration. Comprehensive experiments are conducted on various source and target domain image category pairs. The experimental results show that our approach achieves a significant performance improvement over conventional transfer learning approaches.
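To make the domain-classifier idea concrete, the following is a minimal PyTorch sketch of how a domain classifier head with gradient reversal (in the style of domain-adversarial training) could be attached to a shared saliency encoder. All module names, layer dimensions, and architectural choices here are illustrative assumptions, not the authors' exact model.

```python
import torch
import torch.nn as nn
from torch.autograd import Function


class GradReverse(Function):
    """Identity in the forward pass; negates and scales gradients backward."""

    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # No gradient for lamb (it is a plain float hyperparameter).
        return -ctx.lamb * grad_output, None


class SaliencyWithDomainClassifier(nn.Module):
    """Shared encoder + saliency decoder + adversarial domain classifier."""

    def __init__(self, feat_dim=64, lamb=1.0):
        super().__init__()
        self.lamb = lamb
        # Stand-in for a pretrained backbone (e.g., a ResNet trunk).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Saliency decoder: per-pixel fixation-probability map.
        self.decoder = nn.Sequential(
            nn.Conv2d(feat_dim, 1, kernel_size=1),
            nn.Sigmoid(),
        )
        # Domain classifier: source image category vs. target category.
        self.domain_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(feat_dim, 2),
        )

    def forward(self, x):
        feats = self.encoder(x)
        saliency = self.decoder(feats)
        # Reversed gradients push the shared encoder toward domain-invariant
        # features, while the head itself still learns to separate domains.
        domain_logits = self.domain_head(GradReverse.apply(feats, self.lamb))
        return saliency, domain_logits


# Example forward pass on a dummy batch.
model = SaliencyWithDomainClassifier()
images = torch.randn(4, 3, 224, 224)
saliency_map, domain_logits = model(images)  # (4, 1, 224, 224), (4, 2)
```

In a setup of this kind, training would typically combine a per-pixel saliency loss (e.g., binary cross-entropy against ground-truth fixation maps) on labeled source images with a domain cross-entropy loss computed on both the source samples and the few available target samples, so that the scarce target data contributes supervisory signal through the domain classification task.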



Acknowledgements

C.F.C.'s work was partially supported by grants PICT 2017-3208, PICT 2020-SERIEA-00457, UBACYT 20020190200305BA and UBACYT 20020170100192BA (Argentina).

Author information


Corresponding author

Correspondence to Zhe Sun.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Zhiwen Zhang and Feng Duan contributed equally to this work.

This article belongs to the Topical Collection: Special Issue on Synthetic Media on the Web

Guest Editors: Huimin Lu, Xing Xu, Jože Guna, and Gautam Srivastava


About this article


Cite this article

Zhang, Z., Duan, F., Caiafa, C.F. et al. Domain classifier-based transfer learning for visual attention prediction. World Wide Web 25, 1685–1701 (2022). https://doi.org/10.1007/s11280-022-01027-0

