Referring image segmentation with attention guided cross modal fusion for semantic oriented languages

  • Letter
Frontiers of Computer Science

References

  1. Mogadala A, Kalimuthu M, Klakow D. Trends in integration of vision and language research: a survey of tasks, datasets, and methods. Journal of Artificial Intelligence Research, 2021, 71: 1183–1317

  2. Wu Y, Luo X, Yang Z. Semantic separator learning and its applications in unsupervised Chinese text parsing. Frontiers of Computer Science, 2013, 7(1): 55–68

  3. Margffoy-Tuay E, Pérez J C, Botero E, Arbeláez P. Dynamic multimodal instance segmentation guided by natural language queries. In: Proceedings of the 15th European Conference on Computer Vision. 2018, 656–672

  4. Lei T, Zhang Y, Wang S I, Dai H, Artzi Y. Simple recurrent units for highly parallelizable recurrence. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2018, 4470–4481

  5. Zhang Y, Lei T. Training RNNs as fast as CNNs. See Openreview.net website. 2018

  6. Ye L, Rochan M, Liu Z, Wang Y. Cross-modal self-attention network for referring image segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019, 10494–10503

  7. Jadon S. A survey of loss functions for semantic segmentation. In: Proceedings of the IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB). 2020, 1–7

  8. Yu L, Poirson P, Yang S, Berg A C, Berg T L. Modeling context in referring expressions. In: Proceedings of the 14th European Conference on Computer Vision. 2016, 69–85

  9. Mao J, Huang J, Toshev A, Camburu O, Yuille A, Murphy K. Generation and comprehension of unambiguous object descriptions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016, 11–20

  10. Kazemzadeh S, Ordonez V, Matten M, Berg T. ReferItGame: referring to objects in photographs of natural scenes. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014, 787–798

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant No. 62076246).

Author information

Corresponding author

Correspondence to Rong Wang.

Supporting information

The supporting information is available online at journal.hep.com.cn and link.springer.com.

About this article

Cite this article

Zhou, Q., Wang, R., Hu, H. et al. Referring image segmentation with attention guided cross modal fusion for semantic oriented languages. Front. Comput. Sci. 16, 166342 (2022). https://doi.org/10.1007/s11704-022-1136-3

  • DOI: https://doi.org/10.1007/s11704-022-1136-3
