Multi-Task Multi-modal Semantic Hashing for Web Image Retrieval with Limited Supervision

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10132)

Abstract

As an important element of social media, social images are becoming increasingly important in daily life. Recently, smart hashing schemes have been emerging as a promising approach to fast social image search, and leveraging semantic labels has been shown to make hashing more effective. However, semantic labels tend to be limited in both quantity and quality. In this paper, we propose Multi-Task Multi-modal Semantic Hashing (MTMSH) to index large-scale social image collections with limited supervision. MTMSH improves search accuracy by exploiting more semantic information in two ways. First, the latent multi-modal structure among labeled and unlabeled data is explored by Multiple Anchor Graph Learning (MAGL) to increase the quantity of semantic information. Second, multi-task Share Hash Space Learning (SHSL) is proposed to improve semantic quality. MAGL and SHSL are then integrated in a joint framework in which the semantic function and the hash functions mutually reinforce each other. An alternating algorithm whose time complexity is linear in the size of the training data is also proposed. Experimental results on two large-scale real-world image datasets demonstrate the effectiveness and efficiency of MTMSH.
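
For readers unfamiliar with hashing-based retrieval, the general idea the abstract builds on can be illustrated with a minimal sketch. This is not MTMSH: the hash functions below are random projections rather than functions learned from labels or anchor graphs, and every name (`make_projections`, `hash_code`, `hamming`, `search`) is hypothetical. It only shows how feature vectors are mapped to short binary codes and how a query is then ranked by Hamming distance.

```python
import random


def make_projections(dim, n_bits, seed=0):
    """Draw one random Gaussian direction per hash bit (stand-in for learned hash functions)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(x, projections):
    """Pack the signs of the projections of x into an integer binary code."""
    code = 0
    for w in projections:
        dot = sum(wi * xi for wi, xi in zip(w, x))
        bit = 1 if dot >= 0 else 0
        code = (code << 1) | bit
    return code


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def search(query, database_codes, projections, k=3):
    """Return the indices of the k database codes closest to the query in Hamming distance."""
    q = hash_code(query, projections)
    ranked = sorted(range(len(database_codes)),
                    key=lambda i: hamming(q, database_codes[i]))
    return ranked[:k]


if __name__ == "__main__":
    # Toy 2-D "image features"; real systems would use high-dimensional descriptors.
    features = [[-1.0, -2.0], [1.0, 2.0], [0.9, 2.1]]
    projections = make_projections(dim=2, n_bits=16)
    codes = [hash_code(f, projections) for f in features]
    print(search([1.0, 2.0], codes, projections, k=2))
```

Comparing integer codes with XOR and a bit count is what makes the search fast; a supervised method such as the one proposed here would replace the random directions with hash functions learned from the data.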

Author information

Corresponding author

Correspondence to Lei Zhu.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Xie, L., Zhu, L., Cheng, Z. (2017). Multi-Task Multi-modal Semantic Hashing for Web Image Retrieval with Limited Supervision. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10132. Springer, Cham. https://doi.org/10.1007/978-3-319-51811-4_38

  • DOI: https://doi.org/10.1007/978-3-319-51811-4_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51810-7

  • Online ISBN: 978-3-319-51811-4

  • eBook Packages: Computer Science, Computer Science (R0)
