research-article

BlazeSearch: A multimodal semantic search engine for retrieving in-video information for AI Challenge HCMC 2023

Authors:

Son Ngo Duc Hoang,

Anh Bui Vuong Tam,

Phuoc Phan Hoang,

Giang Tran Thi Cam,

Thinh Nguyen Hung,

Quyen Nguyen Huu,

Van-Hau Pham

SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology

Pages 901 - 907

https://doi.org/10.1145/3628797.3628942

Published: 07 December 2023

Abstract

Searching for information has become a critical part of modern life, and search engines have proven their ability to support the knowledge-seeking process. However, these search engines still focus on retrieving websites or images. The ability to find information inside videos needs further experimentation and study in order to extend the power of search engines. In this study, we investigate the potential of in-video information search by introducing BlazeSearch, a multimodal search engine designed to retrieve video frames from simple text input. By leveraging the OpenCLIP model, which performs strongly on the image-text retrieval task, our search engine aims to deliver reliable and accurate results. Furthermore, we optimize the search speed and provide an easy-to-use, fully functional user interface for BlazeSearch to give users a pleasant experience.
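The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind text-to-frame retrieval with a CLIP-style model: the text query and each video frame are mapped to vectors in a shared space, and frames are ranked by cosine similarity to the query. The function names `normalize` and `top_k_frames` and the toy three-dimensional vectors are assumptions made for illustration; in a real system the vectors would come from the model's text and image encoders, and a brute-force scan like this one would likely be replaced by an approximate nearest-neighbour index for speed.

```python
import math
from typing import List, Sequence, Tuple


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k_frames(
    query_vec: Sequence[float],
    frame_vecs: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (frame_index, cosine_similarity) for the k best-matching frames."""
    q = normalize(query_vec)
    scored = []
    for idx, frame in enumerate(frame_vecs):
        f = normalize(frame)
        scored.append((idx, sum(a * b for a, b in zip(q, f))))
    # Highest similarity first; brute force over every frame.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy stand-ins for embeddings (hypothetical values, not model output).
frame_embeddings = [
    [1.0, 0.0, 0.0],  # frame 0
    [0.0, 1.0, 0.0],  # frame 1
    [0.9, 0.1, 0.0],  # frame 2
]
query_embedding = [1.0, 0.0, 0.0]

results = top_k_frames(query_embedding, frame_embeddings, k=2)
for frame_index, score in results:
    print(f"frame {frame_index}: similarity {score:.4f}")
```

With these toy vectors, frame 0 ranks first and frame 2 second, since both point in nearly the same direction as the query while frame 1 is orthogonal to it.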

Index Terms

1. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interactive systems and tools
2. Information systems
  1. Information retrieval
    1. Users and interactive retrieval
      1. Search interfaces
  2. Information systems applications
    1. Multimedia information systems
      1. Multimedia databases

Published In

SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology

December 2023

1058 pages

ISBN:9798400708916

DOI:10.1145/3628797

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SOICT 2023

SOICT 2023: The 12th International Symposium on Information and Communication Technology

December 7 - 8, 2023

Ho Chi Minh, Vietnam

Acceptance Rates

Overall Acceptance Rate 147 of 318 submissions, 46%
