ABSTRACT
How to estimate cross-media relevance between a given query and an unlabeled image is a key question in the MSR-Bing Image Retrieval Challenge. We address this question by proposing cross-media relevance fusion, a conceptually simple framework that exploits the power of individual methods for cross-media relevance estimation. Four base cross-media relevance functions are investigated and then combined using weights optimized on the development set. With a DCG@25 of 0.5200 on the test dataset, the proposed image retrieval system took first place in the evaluation.
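The fusion step described above can be illustrated with a minimal sketch. It assumes a weighted linear combination of four base scores per image and a coarse grid search for the weights that maximize mean DCG@25 on a development set. The abstract does not state which optimizer or which exact DCG variant the authors used, so the grid search, the generic DCG formula (without any challenge-specific normalizer), the gain values 3/2/0, and all names and toy data below are assumptions made for illustration only.

```python
import itertools
import math


def dcg_at_k(gains, k=25):
    """Generic DCG: sum of (2^gain - 1) / log2(rank + 1) over the top k."""
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def fuse(scores, weights):
    """Weighted sum of the base relevance scores for one image."""
    return sum(w * s for w, s in zip(weights, scores))


def mean_dcg(queries, weights, k=25):
    """Rank each query's images by fused score and average DCG@k."""
    total = 0.0
    for images in queries:  # images: list of (base_scores, gain)
        ranked = sorted(images, key=lambda img: fuse(img[0], weights), reverse=True)
        total += dcg_at_k([gain for _, gain in ranked], k)
    return total / len(queries)


def grid_search_weights(queries, n_funcs=4, steps=4, k=25):
    """Try every weight vector on a coarse grid whose entries sum to 1."""
    best_w, best_score = None, -1.0
    for combo in itertools.product(range(steps + 1), repeat=n_funcs):
        if sum(combo) != steps:
            continue
        w = tuple(c / steps for c in combo)
        score = mean_dcg(queries, w, k)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score


# Hypothetical development set: two queries, each image carries four base
# relevance scores and an assumed gain (3 = excellent, 2 = good, 0 = bad).
dev_queries = [
    [([0.9, 0.1, 0.2, 0.5], 0), ([0.1, 0.8, 0.9, 0.5], 3), ([0.5, 0.5, 0.5, 0.5], 2)],
    [([0.3, 0.9, 0.1, 0.2], 0), ([0.6, 0.2, 0.7, 0.4], 2)],
]

best_w, best_score = grid_search_weights(dev_queries)
print("weights:", best_w, "mean DCG@25:", round(best_score, 4))
```

A grid search is used here only because it is easy to verify by hand; with more base functions or finer weight resolution, a coordinate-wise or gradient-free optimizer would scale better.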
Index Terms
- Image Retrieval by Cross-Media Relevance Fusion