Abstract
vitrivr is a general purpose retrieval system that supports a wide range of query modalities. In this paper, we briefly introduce the system and describe the changes and adjustments made for the 2023 iteration of the video browser showdown. These focus primarily on text-based retrieval schemes and corresponding user-feedback mechanisms.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
Including its predecessor, the iMotion system.
- 2.
- 3.
See https://openjdk.org/jeps/338, accessed September 2022.
References
Amato, G., et al.: VISIONE at video browser showdown 2022. In: Þór Jónsson, B., et al. (eds.) MMM 2022. LNCS, vol. 13142, pp. 543–548. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-98355-0_52
Bain, M., Nagrani, A., Varol, G., Zisserman, A.: Frozen in time: a joint video and image encoder for end-to-end retrieval. In: International Conference on Computer Vision, ICCV (2021). https://doi.org/10.1109/ICCV48922.2021.00175
Berns, F., Rossetto, L., Schoeffmann, K., Beecks, C., Awad, G.: V3C1 dataset: an evaluation of content characteristics. In: International Conference on Multimedia Retrieval. ACM (2019). https://doi.org/10.1145/3323873.3325051
Cho, J., Yoon, S., Kale, A., Dernoncourt, F., Bui, T., Bansal, M.: Fine-grained image captioning with CLIP reward. In: Findings of the Association for Computational Linguistics (2022). https://doi.org/10.18653/v1/2022.findings-naacl.39
Gasser, R., Rossetto, L., Heller, S., Schuldt, H.: Cottontail DB: an open source database system for multimedia retrieval and analysis. In: International Conference on Multimedia. ACM (2020). https://doi.org/10.1145/3394171.3414538
Gasser, R., Rossetto, L., Schuldt, H.: Multimodal multimedia retrieval with vitrivr. In: International Conference on Multimedia Retrieval (2019)
Heller, S., et al.: Multi-modal interactive video retrieval with temporal queries. In: Þór Jónsson, B., et al. (eds.) MMM 2022. LNCS, vol. 13142, pp. 493–498. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-98355-0_44
Heller, S., et al.: Towards explainable interactive multi-modal video retrieval with vitrivr. In: Lokoč, J., et al. (eds.) MMM 2021. LNCS, vol. 12573, pp. 435–440. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-67835-7_41
Heller, S., et al.: Interactive video retrieval evaluation at a distance: comparing sixteen interactive video search systems in a remote setting at the 10th Video Browser Showdown. Int. J. Multimedia Inf. Retrieval, 1–18 (2022). https://doi.org/10.1007/s13735-021-00225-2
Hezel, N., Schall, K., Jung, K., Barthel, K.U.: Efficient search and browsing of large-scale video collections with vibro. In: Þór Jónsson, B., et al. (eds.) MMM 2022. LNCS, vol. 13142, pp. 487–492. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-98355-0_43
Liu, Z., Mao, H., Wu, C., Feichtenhofer, C., Darrell, T., Xie, S.: A convnet for the 2020s. CoRR abs/2201.03545 (2022)
Lokoč, J., et al.: A Task category space for user-centric comparative multimedia search evaluations. In: Þór Jónsson, B., et al. (eds.) MMM 2022. LNCS, vol. 13141, pp. 193–204. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-98358-1_16
Lokoč, J., Mejzlík, F., Souček, T., Dokoupil, P., Peška, L.: Video search with context-aware ranker and relevance feedback. In: Þór Jónsson, B., et al. (eds.) MMM 2022. LNCS, vol. 13142, pp. 505–510. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-98355-0_46
Lokoč, J., et al.: Is the reign of interactive search eternal? Findings from the video browser showdown 2020. ACM Trans. Multimedia Computi. Commun. Appl. (2021). https://doi.org/10.1145/3445031
Mokady, R., Hertz, A., Bermano, A.H.: Clipcap: CLIP prefix for image captioning. CoRR abs/2111.09734 (2021)
Radford, A., et al.: Learning transferable visual models from natural language supervision. In: International Conference on Machine Learning (2021)
Rossetto, L.: Multi-modal video retrieval. Ph.D. thesis, University of Basel (2018)
Rossetto, L., Baumgartner, M., Gasser, R., Heitz, L., Wang, R., Bernstein, A.: Exploring graph-querying approaches in LifeGraph. In: Workshop on Lifelog Search Challenge (2021). https://doi.org/10.1145/3463948.3469068
Rossetto, L., et al.: Interactive video retrieval in the age of deep learning – detailed evaluation of VBS 2019. IEEE Trans. Multimedia (2021)
Rossetto, L., Gasser, R., Sauter, L., Bernstein, A., Schuldt, H.: A system for interactive multimedia retrieval evaluations. In: Lokoč, J., et al. (eds.) MMM 2021. LNCS, vol. 12573, pp. 385–390. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-67835-7_33
Rossetto, L., Giangreco, I., Heller, S., Tănase, C., Schuldt, H.: Searching in video collections using sketches and sample images – the Cineast system. In: Tian, Q., Sebe, N., Qi, G.-J., Huet, B., Hong, R., Liu, X. (eds.) MMM 2016. LNCS, vol. 9517, pp. 336–341. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-27674-8_30
Rossetto, L., Giangreco, I., Schuldt, H.: Cineast: a multi-feature sketch-based video retrieval engine. In: International Symposium on Multimedia (2014)
Rossetto, L., Giangreco, I., Tanase, C., Schuldt, H.: vitrivr: A flexible retrieval stack supporting multiple query modes for searching in multimedia collections. In: ACM Conference on Multimedia (2016). https://doi.org/10.1145/2964284.2973797
Rossetto, L., Amiri Parian, M., Gasser, R., Giangreco, I., Heller, S., Schuldt, H.: Deep learning-based concept detection in vitrivr. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11296, pp. 616–621. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05716-9_55
Rossetto, L., Schoeffmann, K., Bernstein, A.: Insights on the V3C2 dataset. CoRR abs/2105.01475 (2021)
Rossetto, L., Schuldt, H., Awad, G., Butt, A.A.: V3C – a research video collection. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11295, pp. 349–360. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05710-7_29
Schoeffmann, K.: Video browser showdown 2012–2019: a review. In: International Conference on Content-Based Multimedia Indexing (2019)
Spiess, F., et al.: Multi-modal video retrieval in virtual reality with vitrivr-VR. In: Þór Jónsson, B., et al. (eds.) MMM 2022. LNCS, vol. 13142, pp. 499–504. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-98355-0_45
Spiess, F., Gasser, R., Heller, S., Rossetto, L., Sauter, L., Schuldt, H.: Competitive interactive video retrieval in virtual reality with vitrivr-VR. In: Lokoč, J., et al. (eds.) MMM 2021. LNCS, vol. 12573, pp. 441–447. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-67835-7_42
Truong, Q.T., et al.: Marine video kit: a new marine video dataset for content-based analysis and retrieval. In: Dang-Nguyen, D., et al.: MMM 2023. LNCS, vol. 13833, pp. 539–550. Springer. Cham (2023)
Yang, Y., et al.: Multilingual universal sentence encoder for semantic retrieval (2019). https://doi.org/10.48550/ARXIV.1907.04307
Acknowledgements
This work was partly supported by the Swiss National Science Foundation through projects “Participatory Knowledge Practices in Analog and Digital Image Archives” (contract no. 193788) and “MediaGraph” (contract no. 202125).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sauter, L. et al. (2023). Exploring Effective Interactive Text-Based Video Search in vitrivr. In: Dang-Nguyen, DT., et al. MultiMedia Modeling. MMM 2023. Lecture Notes in Computer Science, vol 13833. Springer, Cham. https://doi.org/10.1007/978-3-031-27077-2_53
Download citation
DOI: https://doi.org/10.1007/978-3-031-27077-2_53
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-27076-5
Online ISBN: 978-3-031-27077-2
eBook Packages: Computer ScienceComputer Science (R0)