Improving Video Retrieval Performance with Query Expansion Using ChatGPT

Authors:
Kazuya Ueki, Meisei University, Japan (ORCID 0009-0005-1691-1858)
Yuma Suzuki, SoftBank Corp., Japan (ORCID 0000-0003-4542-6243)
Hiroki Takushima, SoftBank Corp., Japan (ORCID 0009-0006-2041-8713)
Takayuki Hori, SoftBank Corp., Japan (ORCID 0000-0001-8232-5922)

ICIGP '24: Proceedings of the 2024 7th International Conference on Image and Graphics Processing, January 2024, Pages 431–436. https://doi.org/10.1145/3647649.3647716

Published: 03 May 2024

ABSTRACT

In this study, we investigated whether expanding the input query sentences improves performance on the video retrieval task, so that more relevant videos are retrieved. For query expansion, we used ChatGPT, which can generate rich text, to create multiple query sentences that keep the meaning of the original query but express it differently. We conducted a large-scale video retrieval experiment using the latest pre-trained image-text embedding models and confirmed that the expansion improves accuracy over the baseline.
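
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how scores from the expanded queries are combined, so the averaging used here is an assumption, not the authors' confirmed method. The function names `cosine` and `rank_videos`, the shot identifiers, and the three-dimensional toy vectors are all hypothetical; in practice the vectors would come from a pre-trained image-text embedding model, and the paraphrases from ChatGPT.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_videos(query_embeddings, video_embeddings):
    """Rank videos by their mean similarity to all query variants.

    Mean fusion is an assumption made for illustration only.
    """
    scores = {}
    for video_id, vec in video_embeddings.items():
        sims = [cosine(q, vec) for q in query_embeddings]
        scores[video_id] = sum(sims) / len(sims)
    return sorted(scores, key=scores.get, reverse=True)


# Toy text embeddings (hypothetical values): the original query
# followed by two paraphrases of it.
query_embeddings = [
    [1.0, 0.0, 0.0],  # original query
    [0.8, 0.6, 0.0],  # paraphrase 1
    [0.6, 0.8, 0.0],  # paraphrase 2
]

# Toy video (shot) embeddings in the same space (hypothetical values).
video_embeddings = {
    "shot_a": [0.0, 0.0, 1.0],
    "shot_b": [0.7, 0.7, 0.0],
    "shot_c": [1.0, 0.0, 0.0],
}

baseline_ranking = rank_videos(query_embeddings[:1], video_embeddings)
expanded_ranking = rank_videos(query_embeddings, video_embeddings)
print("original query only:", baseline_ranking)
print("with paraphrases:   ", expanded_ranking)
```

In this toy setup, the original query alone ranks `shot_c` first, while averaging over the paraphrases moves `shot_b` to the top. That shows only how expanded queries can change a ranking; it says nothing about the accuracy gains reported in the paper.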

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Novelty in information retrieval

Published in

ICIGP '24: Proceedings of the 2024 7th International Conference on Image and Graphics Processing
January 2024
480 pages
ISBN:9798400716720
DOI:10.1145/3647649

Copyright © 2024 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
ChatGPT
Query Expansion
TRECVID benchmark
Video Retrieval
Qualifiers
- research-article
- Research
- Refereed limited