Abstract
This paper addresses the challenge of video search with only a handful of query exemplars by proposing a method based on a triplet ranking network. In a typical video search scenario, a user begins the query process by using a metadata-based text-to-video search module to find an initial set of videos of interest in the video repository. Because bridging the semantic gap between text and video is very challenging, usually only a handful of relevant videos appear in the initial results. The user can then use a video-to-video search module to train a new classifier and retrieve more relevant videos. However, since statistically fewer than five of the initially retrieved videos are typically relevant, training a complex event classifier from such a handful of examples is extremely difficult. It is therefore necessary to develop a video retrieval method that works with only a handful of positive training example videos. The proposed triplet ranking network is designed for this situation and has the following properties: (1) it learns an offline, event-independent similarity-matching projection from previous video search tasks or datasets, so that even with a single query video we can retrieve related videos; as more relevant videos are retrieved, the method transfers this prior knowledge to the specific retrieval task to further improve performance; (2) it casts the video search task as a ranking problem and can exploit partial ordering information in the dataset; (3) owing to these two merits, the method is well suited to the case where only a handful of positive examples exist. Experimental results demonstrate the effectiveness of the proposed method for video retrieval with only a handful of positive exemplars.
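The core mechanism named in the abstract, a triplet ranking objective, can be illustrated with a minimal sketch. The code below implements a standard hinge-style triplet ranking loss with squared-Euclidean distances, which pushes an anchor video's embedding closer to a positive (relevant) video than to a negative one by at least a margin. This is a generic formulation for illustration; the paper's exact loss, distance function, and learned projection may differ.

```python
import numpy as np

def triplet_ranking_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet ranking loss (generic sketch, not the paper's
    exact formulation). Each argument is an embedding vector; the loss is
    zero once the anchor-positive distance is smaller than the
    anchor-negative distance by at least `margin`."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)  # squared distance to relevant video
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)  # squared distance to irrelevant video
    return np.maximum(0.0, margin + d_pos - d_neg)

# Illustrative usage with toy 4-d embeddings: the positive coincides with
# the anchor, the negative is far away, so the ranking constraint is
# satisfied and the loss is zero.
a = np.zeros(4)
p = np.zeros(4)
n = np.ones(4)
print(triplet_ranking_loss(a, p, n))  # 0.0 (constraint satisfied)
print(triplet_ranking_loss(a, n, p))  # 5.0 (constraint violated: 1 + 4 - 0)
```

During training, triplets would be formed by pairing the few positive exemplars with background or previously retrieved videos as negatives; because the loss depends only on relative ordering, it can exploit partial ordering information even when labeled positives are scarce.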
Acknowledgement
This work was supported by the National Basic Research Program of China (Grant No. 2015CB351705) and the State Key Program of the National Natural Science Foundation of China (Grant No. 61332018).
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Cheng, D., Jiang, L., Gong, Y., Zheng, N., Hauptmann, A.G. (2017). Video Search via Ranking Network with Very Few Query Exemplars. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science(), vol 10133. Springer, Cham. https://doi.org/10.1007/978-3-319-51814-5_35
Print ISBN: 978-3-319-51813-8
Online ISBN: 978-3-319-51814-5