ViewsInsight: Enhancing Video Retrieval for VBS 2024 with a User-Friendly Interaction Mechanism

Vuong, Gia-Huy; Ho, Van-Son; Nguyen-Dang, Tien-Thanh; Thai, Xuan-Dang; Le, Tu-Khiem; Pham, Minh-Khoi; Ninh, Van-Tu; Gurrin, Cathal; Tran, Minh-Triet

doi:10.1007/978-3-031-53302-0_38

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14557)

Included in the following conference series: International Conference on Multimedia Modeling

Abstract

ViewsInsight is a video content retrieval system with a suite of AI-powered features that let users locate relevant videos using a variety of query types. Its query description rewriting capability helps match queries to videos more precisely, while the visual example generation feature gives users a tool for refining search results. In addition, the temporal query mechanism lets users pinpoint specific video segments. A chat-based interface integrates these features.

Acknowledgment

This research was funded by Vingroup and supported by Vingroup Innovation Foundation (VINIF) under project code VINIF.2019.DA19.

Author information

Authors and Affiliations

University of Science, VNU-HCM, Ho Chi Minh City, Vietnam
Gia-Huy Vuong, Van-Son Ho, Tien-Thanh Nguyen-Dang, Xuan-Dang Thai & Minh-Triet Tran
Vietnam National University, Ho Chi Minh City, Vietnam
Gia-Huy Vuong, Van-Son Ho, Tien-Thanh Nguyen-Dang, Xuan-Dang Thai & Minh-Triet Tran
Dublin City University, Dublin, Ireland
Tu-Khiem Le, Minh-Khoi Pham, Van-Tu Ninh & Cathal Gurrin

Corresponding author

Correspondence to Minh-Triet Tran.

Editor information

Editors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Stevan Rudinac
Delft University of Technology, Delft, The Netherlands
Alan Hanjalic
Delft University of Technology, Delft, The Netherlands
Cynthia Liem
University of Amsterdam, Amsterdam, The Netherlands
Marcel Worring
Reykjavik University, Reykjavik, Iceland
Björn Þór Jónsson
Microsoft Research Lab – Asia, Beijing, China
Bei Liu
The University of Tokyo, Tokyo, Japan
Yoko Yamakata

Cite this paper

Vuong, GH. et al. (2024). ViewsInsight: Enhancing Video Retrieval for VBS 2024 with a User-Friendly Interaction Mechanism. In: Rudinac, S., et al. MultiMedia Modeling. MMM 2024. Lecture Notes in Computer Science, vol 14557. Springer, Cham. https://doi.org/10.1007/978-3-031-53302-0_38

DOI: https://doi.org/10.1007/978-3-031-53302-0_38
Published: 29 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53301-3
Online ISBN: 978-3-031-53302-0
eBook Packages: Computer Science, Computer Science (R0)
