Vision And Text Transformer For Predicting Answerability On Visual Question Answering

Abstract:

Answerability on Visual Question Answering is a novel task: predicting an answerability score for an image-question pair in multi-modal data. Existing works often derive Answerability through a binary mapping from the output of visual question answering systems, which does not reflect the essence of the problem. Treating Answerability as a regression task, we propose VT-Transformer, which exploits visual and textual features through a Transformer architecture. Experimental results on the VizWiz 2020 dataset show the effectiveness and robustness of VT-Transformer for Answerability on Visual Question Answering compared with competitive baselines.

Published in: 2021 IEEE International Conference on Image Processing (ICIP)

Date of Conference: 19-22 September 2021

Date Added to IEEE Xplore: 23 August 2021

DOI: 10.1109/ICIP42928.2021.9506796

Conference Location: Anchorage, AK, USA
