Abstract
The results of the last Video Browser Showdown, held in Bangkok in 2018, show that multimodal search with interactive query reformulation is a competitive search strategy for all evaluated task categories. We therefore aim to improve the effectiveness of the underlying retrieval models by using the most recent deep network architectures in the new version of our interactive video retrieval tool, VIRET. Specifically, we apply the NasNet deep convolutional neural network architecture to automatic annotation and similarity search over the set of frames selected from the provided video collection. In addition, we implement temporal sequence queries and subimage similarity search to give users greater flexibility in query formulation.
Notes
1. The representative was selected as a mean descriptor of images in one category. The original GoogLeNet was used to extract descriptors.
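The note describes a category representative as the mean of the descriptors of that category's images. A minimal sketch of that averaging step in plain Python follows. The function name `category_representatives`, the `(label, descriptor)` input layout, and the toy two-dimensional vectors are assumptions made for illustration; the actual descriptors would come from GoogLeNet and have many more dimensions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def category_representatives(
    samples: Sequence[Tuple[str, Sequence[float]]],
) -> Dict[str, List[float]]:
    """Return the element-wise mean descriptor for each category label.

    `samples` is a hypothetical layout: one (label, descriptor) pair per image.
    """
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for label, descriptor in samples:
        if label not in sums:
            # First image of this category: start a zero accumulator.
            sums[label] = [0.0] * len(descriptor)
        elif len(descriptor) != len(sums[label]):
            raise ValueError("descriptors in one category must share a dimension")
        for i, value in enumerate(descriptor):
            sums[label][i] += value
        counts[label] += 1
    # Divide each accumulated vector by the number of images in its category.
    return {
        label: [total / counts[label] for total in vector]
        for label, vector in sums.items()
    }


# Toy stand-ins for extracted descriptors (assumed values, not real data).
toy = [("dog", [1.0, 3.0]), ("dog", [3.0, 5.0]), ("cat", [0.0, 2.0])]
reps = category_representatives(toy)
```

With the toy input, each category maps to the average of its vectors, so a category with a single image is represented by that image's descriptor.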
Acknowledgments
This paper has been supported in part by the Czech Science Foundation (GAČR) project No. 17-22224S and by Charles University grant SVV-260451.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Lokoč, J., Kovalčík, G., Souček, T., Moravec, J., Bodnár, J., Čech, P. (2019). VIRET Tool Meets NasNet. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, WH., Vrochidis, S. (eds) MultiMedia Modeling. MMM 2019. Lecture Notes in Computer Science, vol 11296. Springer, Cham. https://doi.org/10.1007/978-3-030-05716-9_52
DOI: https://doi.org/10.1007/978-3-030-05716-9_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05715-2
Online ISBN: 978-3-030-05716-9
eBook Packages: Computer Science, Computer Science (R0)