Abstract
Continuously participating since the sixth Video Browser Showdown (VBS2017), diveXplore is a veteran interactive search system that throughout its lifetime has offered and evaluated numerous features. After undergoing major refactoring for the most recent VBS2021, however, the system since version 5.0 is less feature rich, yet, more modern, leaner and faster than the original system. This proved to be a sensible decision as the new system showed increasing performance in VBS2021 when compared to the most recent former competitions. With version 6.0 we reconsider shot segmentation, map search and introduce new features for improving concept as well as temporal context search.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Berns, F., Rossetto, L., Schoeffmann, K., Beecks, C., Awad, G.: V3C1 dataset: an evaluation of content characteristics. In: Proceedings of the 2019 on International Conference on Multimedia Retrieval, pp. 334–338. ACM (2019)
Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: YOLOV4: optimal speed and accuracy of object detection. CoRR abs/2004.10934 (2020). https://arxiv.org/abs/2004.10934
Kay, A.: Tesseract: an open-source optical character recognition engine. Linux J. 2007(159), 2 (2007)
Leibetseder, A., Kletz, S., Schoeffmann, K.: Sketch-based similarity search for collaborative feature maps. In: Schoeffmann, K., et al. (eds.) MMM 2018. LNCS, vol. 10705, pp. 425–430. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73600-6_45
Leibetseder, A., Münzer, B., Primus, J., Kletz, S., Schoeffmann, K.: diveXplore 4.0: the ITEC deep interactive video exploration system at VBS2020. In: Ro, Y.R., et al. (eds.) MMM 2020. LNCS, vol. 11962, pp. 753–759. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-37734-2_65
Leibetseder, A., Schoeffmann, K.: Less is more - diveXplore 5.0 at VBS 2021. In: Lokoč, J., Skopal, T., Schoeffmann, K., Mezaris, V., Li, X., Vrochidis, S., Patras, I. (eds.) MMM 2021. LNCS, vol. 12573, pp. 455–460. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-67835-7_44
Lokoc, J., Bailer, W., Schoeffmann, K., Muenzer, B., Awad, G.: On influential trends in interactive video retrieval: video browser showdown 2015–2017. IEEE Trans. Multimedia 20, 3361–3376 (2018). https://doi.org/10.1109/TMM.2018.2830110
Lokoč, J., et al.: Interactive search or sequential browsing? A detailed analysis of the video browser showdown 2018. ACM Trans. Multimedia Comput. Commun. Appl. 15(1), 29:1–29:18 (2019). https://doi.org/10.1145/3295663
Monfort, M., et al.: Moments in time dataset: one million videos for event understanding. IEEE Trans. Pattern Anal. Mach. Intell. 42(2), 502–508 (2020). https://doi.org/10.1109/TPAMI.2019.2901464
Povey, D., et al.: The Kaldi speech recognition toolkit. In: IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, No. CONF. IEEE Signal Processing Society (2011)
Primus, M.J., Münzer, B., Leibetseder, A., Schoeffmann, K.: The ITEC collaborative video search system at the video browser showdown 2018. In: Schoeffmann, K., et al. (eds.) MMM 2018. LNCS, vol. 10705, pp. 438–443. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73600-6_47
Rossetto, L., et al.: Interactive video retrieval in the age of deep learning - detailed evaluation of VBS 2019. IEEE Trans. Multimedia 23, 243–256 (2021). https://doi.org/10.1109/TMM.2020.2980944
Rossetto, L., Schoeffmann, K., Bernstein, A.: Insights on the V3C2 dataset. arXiv preprint arXiv:2105.01475 (2021)
Schoeffmann, K.: A user-centric media retrieval competition: the video browser showdown 2012–2014. IEEE MultiMedia 21(4), 8–13 (2014). https://doi.org/10.1109/MMUL.2014.56
Schoeffmann, K., Münzer, B., Leibetseder, A., Primus, J., Kletz, S.: Autopiloting feature maps: the deep interactive video exploration (diveXplore) system at VBS2019. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11296, pp. 585–590. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05716-9_50
Schoeffmann, K., et al.: Collaborative feature maps for interactive video search. In: Amsaleg, L., Guðmundsson, G.Þ, Gurrin, C., Jónsson, B.Þ, Satoh, S. (eds.) MMM 2017. LNCS, vol. 10133, pp. 457–462. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-51814-5_41
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Conference on Computer Vision and Pattern Recognition, pp. 2818–2826. IEEE (2016). https://doi.org/10.1109/CVPR.2016.308
Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: a 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40(6), 1452–1464 (2018). https://doi.org/10.1109/TPAMI.2017.2723009
Acknowledgments
This work was funded by the FWF Austrian Science Fund under grant P 32010-N38.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Leibetseder, A., Schoeffmann, K. (2022). diveXplore 6.0: ITEC’s Interactive Video Exploration System at VBS 2022. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_56
Download citation
DOI: https://doi.org/10.1007/978-3-030-98355-0_56
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98354-3
Online ISBN: 978-3-030-98355-0
eBook Packages: Computer ScienceComputer Science (R0)