Abstract
As a key element of social media, social images play an increasingly important role in daily life. Hashing has recently emerged as a promising approach to support fast social image search. Leveraging semantic labels has proven effective for hashing; however, such labels tend to be limited in both quantity and quality. In this paper, we propose Multi-Task Multi-modal Semantic Hashing (MTMSH) to index large-scale social image collections with limited supervision. MTMSH improves search accuracy by enriching semantic information in two ways. First, the latent multi-modal structure among labeled and unlabeled data is explored by Multiple Anchor Graph Learning (MAGL) to increase the quantity of semantic information. Second, multi-task based Share Hash Space Learning (SHSL) is proposed to improve semantic quality. MAGL and SHSL are then integrated into a joint framework in which the semantic functions and hash functions mutually reinforce each other, and an alternating optimization algorithm, whose time complexity is linear in the size of the training data, is developed. Experimental results on two large-scale real-world image datasets demonstrate the effectiveness and efficiency of MTMSH.
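The abstract leaves the details of MAGL to the paper body. As a rough illustration only, the sketch below builds a truncated anchor graph in the general spirit of anchor-graph methods (each sample connects to its few nearest anchors with normalized Gaussian weights) and binarizes linear projections into hash codes. The function names (`anchor_graph`, `hash_codes`) and parameters (`s`, `sigma`, projection matrix `W`) are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def anchor_graph(X, anchors, s=3, sigma=1.0):
    """Truncated similarity matrix Z (n x m): each sample is linked to its
    s nearest anchors with Gaussian weights; each row sums to 1.
    NOTE: illustrative sketch, not MTMSH's exact MAGL construction."""
    # squared Euclidean distances between samples and anchors
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    Z = np.zeros_like(d2)
    nearest = np.argsort(d2, axis=1)[:, :s]  # indices of s closest anchors
    for i in range(X.shape[0]):
        w = np.exp(-d2[i, nearest[i]] / (2 * sigma ** 2))
        Z[i, nearest[i]] = w / w.sum()       # normalize into a probability row
    return Z

def hash_codes(X, W):
    """Binarize linear projections into +/-1 hash codes (assumed hash form)."""
    return np.sign(X @ W)
```

Because each row of `Z` has only `s` nonzeros, graph operations scale linearly in the number of samples, which is consistent with the linear-time training claim above.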
© 2017 Springer International Publishing AG
Cite this paper
Xie, L., Zhu, L., Cheng, Z. (2017). Multi-Task Multi-modal Semantic Hashing for Web Image Retrieval with Limited Supervision. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol. 10132. Springer, Cham. https://doi.org/10.1007/978-3-319-51811-4_38
Print ISBN: 978-3-319-51810-7
Online ISBN: 978-3-319-51811-4