Evaluating Captioning Models using Markov Logic Networks

Abstract:

Multimodal problems such as caption generation advance AI as a whole, since they require the integration of several key domains such as computer vision, NLP, and knowledge representation. In this paper, we develop a new approach to evaluating captioning models by verifying them using Markov Logic Networks (MLNs). Specifically, we compile an MLN from training data and perform probabilistic inference to estimate the uncertainty in a generated caption. To reify the caption, we leverage advances in Natural Language Inference (NLI) models and convert the caption into a query for the MLN. Further, we add visual context to the MLN distribution using an attention-based Multiple Instance Learning model and evaluate a caption based on this augmented distribution. We perform experiments using MSCOCO on several state-of-the-art benchmarks and show that our approach can evaluate captioning models just as effectively as methods that require human-generated captions.
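
As a rough illustration of what querying an MLN involves, the sketch below computes the probability of a query under a toy MLN by exact enumeration of possible worlds. It is not the paper's implementation: the atoms, formulas, weights, and function names are all hypothetical, and exact enumeration stands in for whatever inference procedure the authors actually use. It only shows the standard MLN semantics, in which each world is weighted by the exponentiated sum of the weights of the formulas it satisfies.

```python
import itertools
import math

# Hypothetical ground atoms (illustrative only, not taken from the paper).
ATOMS = ["dog", "frisbee", "park"]

# Hypothetical weighted formulas: (weight, predicate over a world).
FORMULAS = [
    (1.5, lambda w: (not w["frisbee"]) or w["dog"]),  # frisbee => dog
    (0.8, lambda w: (not w["dog"]) or w["park"]),     # dog => park
]


def world_weight(world):
    """Unnormalized weight: exp of the summed weights of satisfied formulas."""
    return math.exp(sum(wt for wt, formula in FORMULAS if formula(world)))


def query_probability(query, evidence=None):
    """P(all query atoms are true | evidence), by enumerating every world.

    Exact enumeration is exponential in the number of atoms, so this is
    only feasible for toy examples.
    """
    evidence = evidence or {}
    numerator = 0.0
    denominator = 0.0
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        world = dict(zip(ATOMS, values))
        # Skip worlds that contradict the evidence.
        if any(world[atom] != value for atom, value in evidence.items()):
            continue
        weight = world_weight(world)
        denominator += weight
        if all(world[atom] for atom in query):
            numerator += weight
    return numerator / denominator


if __name__ == "__main__":
    # A "caption" reduced to a query over atoms, with one atom observed.
    print(query_probability(["dog"], evidence={"frisbee": True}))
    print(query_probability(["dog"], evidence={"frisbee": False}))
```

In this toy setting, observing `frisbee` raises the probability of `dog` because the first formula rewards worlds where a frisbee implies a dog. A low query probability would correspond to a caption that the model regards as uncertain.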

Published in: 2022 IEEE International Conference on Big Data (Big Data)

Date of Conference: 17-20 December 2022

Date Added to IEEE Xplore: 26 January 2023

DOI: 10.1109/BigData55660.2022.10020793

Conference Location: Osaka, Japan