Referring image segmentation with attention guided cross modal fusion for semantic oriented languages

Zhou, Qianli; Wang, Rong; Hu, Haimiao; Tan, Quange; Zhang, Wenjin

doi:10.1007/s11704-022-1136-3

Letter
Published: 01 March 2022

Volume 16, article number 166342, (2022)
Frontiers of Computer Science

Qianli Zhou¹, Rong Wang¹, Haimiao Hu², Quange Tan¹ & Wenjin Zhang¹

References

Mogadala A, Kalimuthu M, Klakow D. Trends in integration of vision and language research: a survey of tasks, datasets, and methods. Journal of Artificial Intelligence Research, 2021, 71: 1183–1317
Wu Y, Luo X, Yang Z. Semantic separator learning and its applications in unsupervised Chinese text parsing. Frontiers of Computer Science, 2013, 7(1): 55–68
Margffoy-Tuay E, Pérez J C, Botero E, Arbeláez P. Dynamic multimodal instance segmentation guided by natural language queries. In: Proceedings of the 15th European Conference on Computer Vision. 2018, 656–672
Lei T, Zhang Y, Wang S I, Dai H, Artzi Y. Simple recurrent units for highly parallelizable recurrence. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2018, 4470–4481
Zhang Y, Lei T. Training RNNs as fast as CNNs. See Openreview.net website. 2018
Ye L, Rochan M, Liu Z, Wang Y. Cross-modal self-attention network for referring image segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019, 10494–10503
Jadon S. A survey of loss functions for semantic segmentation. In: Proceedings of the IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB). 2020, 1–7
Yu L, Poirson P, Yang S, Berg A C, Berg T L. Modeling context in referring expressions. In: Proceedings of the 14th European Conference on Computer Vision. 2016, 69–85
Mao J, Huang J, Toshev A, Camburu O, Yuille A, Murphy K. Generation and comprehension of unambiguous object descriptions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016, 11–20
Kazemzadeh S, Ordonez V, Matten M, Berg T. ReferItGame: referring to objects in photographs of natural scenes. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014, 787–798

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant No. 62076246).

Author information

Authors and Affiliations

Police Information Engineering and Network Security College, People’s Public Security University, Beijing, 100038, China
Qianli Zhou, Rong Wang, Quange Tan & Wenjin Zhang
Computer Science and Engineering, Beihang University, Beijing, 100191, China
Haimiao Hu

Corresponding author

Correspondence to Rong Wang.

Additional information

Supporting information

The supporting information is available online at journal.hep.com.cn and link.springer.com.

Electronic Supplementary Material

Referring Image Segmentation with Attention Guided Cross Modal Fusion for Semantic Oriented Languages

About this article

Cite this article

Zhou, Q., Wang, R., Hu, H. et al. Referring image segmentation with attention guided cross modal fusion for semantic oriented languages. Front. Comput. Sci. 16, 166342 (2022). https://doi.org/10.1007/s11704-022-1136-3

Received: 21 March 2021
Accepted: 26 September 2021
Published: 01 March 2022
DOI: https://doi.org/10.1007/s11704-022-1136-3
