References
Mogadala A, Kalimuthu M, Klakow D. Trends in integration of vision and language research: a survey of tasks, datasets, and methods. Journal of Artificial Intelligence Research, 2021, 71: 1183–1317
Wu Y, Luo X, Yang Z. Semantic separator learning and its applications in unsupervised Chinese text parsing. Frontiers of Computer Science, 2013, 7(1): 55–68
Margffoy-Tuay E, Pérez J C, Botero E, Arbeláez P. Dynamic multimodal instance segmentation guided by natural language queries. In: Proceedings of the 15th European Conference on Computer Vision. 2018, 656–672
Lei T, Zhang Y, Wang S I, Dai H, Artzi Y. Simple recurrent units for highly parallelizable recurrence. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2018, 4470–4481
Zhang Y, Lei T. Training RNNs as fast as CNNs. See Openreview.net website. 2018
Ye L, Rochan M, Liu Z, Wang Y. Cross-modal self-attention network for referring image segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019, 10494–10503
Jadon S. A survey of loss functions for semantic segmentation. In: Proceedings of the IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB). 2020, 1–7
Yu L, Poirson P, Yang S, Berg A C, Berg T L. Modeling context in referring expressions. In: Proceedings of the 14th European Conference on Computer Vision. 2016, 69–85
Mao J, Huang J, Toshev A, Camburu O, Yuille A, Murphy K. Generation and comprehension of unambiguous object descriptions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016, 11–20
Kazemzadeh S, Ordonez V, Matten M, Berg T. ReferItGame: referring to objects in photographs of natural scenes. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014, 787–798
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Grant No. 62076246).
Supporting information
The supporting information is available online at journal.hep.com.cn and link.springer.com.
Cite this article
Zhou, Q., Wang, R., Hu, H. et al. Referring image segmentation with attention guided cross modal fusion for semantic oriented languages. Front. Comput. Sci. 16, 166342 (2022). https://doi.org/10.1007/s11704-022-1136-3