research-article

Vi-ATISO: An Effective Video Search Engine at AI Challenge HCMC 2023

Authors:

Quang-Tan Nguyen,

Xuan-Quang Nguyen,

Duc-Thang Truong,

Minh-Hoang Le

SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology

Pages 960 - 965

https://doi.org/10.1145/3628797.3628997

Published: 07 December 2023

Abstract

In this paper, we present the first version of Vi-ATISO, a fast and efficient video search engine for medium-scale datasets. The tool provides several search functions based on text-to-image retrieval, text-to-video retrieval, optical character recognition, and object detection. With these diverse algorithms, our system can handle the large volume of data from the AI Challenge HCMC 2023 and achieves good results. We are also confident that the search engine can be applied in practice, because user experience was considered throughout the development process.
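The abstract does not say how the search functions are combined, so the following is only a minimal sketch of one plausible arrangement: keyframes are filtered by OCR text and detected object labels, then ranked by cosine similarity between a text-query embedding and each keyframe embedding. The `Keyframe` record, the `search` function, the toy two-dimensional embeddings, and the frame IDs are all hypothetical and are not taken from the Vi-ATISO system.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Keyframe:
    """Hypothetical index record for one video keyframe (not from the paper)."""
    frame_id: str
    embedding: list                 # assumed output of an image encoder
    ocr_text: str = ""              # assumed output of an OCR model
    objects: set = field(default_factory=set)  # assumed object-detector labels


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, query_embedding, ocr_keyword=None, required_objects=(), top_k=3):
    """Filter by OCR keyword and object labels, then rank by embedding similarity."""
    required = set(required_objects)
    candidates = [
        kf for kf in index
        if (ocr_keyword is None or ocr_keyword.lower() in kf.ocr_text.lower())
        and required <= kf.objects
    ]
    ranked = sorted(
        candidates,
        key=lambda kf: cosine(query_embedding, kf.embedding),
        reverse=True,
    )
    return [kf.frame_id for kf in ranked[:top_k]]


# Toy index with made-up embeddings, OCR strings, and object labels.
index = [
    Keyframe("v1_f010", [1.0, 0.0], ocr_text="Breaking News", objects={"person"}),
    Keyframe("v1_f020", [0.6, 0.8], ocr_text="Weather", objects={"person", "car"}),
    Keyframe("v2_f005", [0.0, 1.0], ocr_text="news tonight", objects={"car"}),
]
query = [0.9, 0.1]  # placeholder for an encoded text query

print(search(index, query))
print(search(index, query, ocr_keyword="news"))
print(search(index, query, required_objects=["car"]))
```

Filtering before ranking keeps the similarity computation restricted to frames that already satisfy the OCR and object constraints; a real system at this scale would more likely use an approximate nearest-neighbor index than the exhaustive sort shown here.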

Index Terms

Vi-ATISO: An Effective Video Search Engine at AI Challenge HCMC 2023
1. Information systems
  1. Information retrieval
    1. Users and interactive retrieval
      1. Search interfaces

Published In

SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology

December 2023

1058 pages

ISBN:9798400708916

DOI:10.1145/3628797

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SOICT 2023

SOICT 2023: The 12th International Symposium on Information and Communication Technology

December 7 - 8, 2023

Ho Chi Minh, Vietnam

Acceptance Rates

Overall Acceptance Rate 147 of 318 submissions, 46%
