Abstract
Cross-modal hashing is an effective and practical technique for large-scale multimedia retrieval. Among cross-modal hashing methods, unsupervised hashing has attracted particular attention because unlabeled data are easy to collect. However, despite a rich line of work in this area, existing methods share a common limitation: the training data must be organized in pairs that connect the modalities (e.g., an image and a text that carry the same semantic information), so learning cannot proceed when no pair-wise information is available. To overcome this limitation, we design a Completely Unsupervised Cross-Modal Hashing (CUCMH) approach that relies on nothing but numeric features, i.e., neither class labels nor pair-wise information. To the best of our knowledge, this is the first work to address this setting, for which we propose a novel dual-branch generative adversarial network. We further introduce the idea that the representation of multimedia data can be separated into content and style components, and we employ modality representation codes to improve the effectiveness of the generative adversarial learning. Extensive experiments demonstrate that CUCMH outperforms existing methods on completely unsupervised cross-modal hashing tasks, and confirm the effectiveness of integrating modality representation with semantic information in representation learning.
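To make the retrieval setting concrete, the sketch below illustrates the general cross-modal hashing pipeline the abstract builds on (not the paper's CUCMH model): features from two modalities are projected into a shared Hamming space and binarized, so that retrieval reduces to fast binary-code comparison. The projection matrices, feature dimensions, and code length here are hypothetical placeholders; in practice they would be learned, e.g., by an adversarial network as proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def hash_codes(features, projection):
    """Binarize projected features into {0, 1} hash codes via the sign."""
    return (features @ projection > 0).astype(np.uint8)

def hamming_distance(a, b):
    """Pairwise Hamming distances between rows of two code matrices."""
    return np.count_nonzero(a[:, None, :] != b[None, :, :], axis=-1)

# Hypothetical setup: 512-d image features, 300-d text features, 32-bit codes.
# Random projections stand in for the learned hash functions.
W_img = rng.standard_normal((512, 32))
W_txt = rng.standard_normal((300, 32))

img_feats = rng.standard_normal((5, 512))   # database images
txt_feats = rng.standard_normal((3, 300))   # text queries

img_codes = hash_codes(img_feats, W_img)
txt_codes = hash_codes(txt_feats, W_txt)

# Retrieval: for each text query, rank database images by Hamming distance.
dist = hamming_distance(txt_codes, img_codes)   # shape (3, 5)
ranking = np.argsort(dist, axis=1)
```

The point of the binary codes is efficiency: comparing 32-bit codes is far cheaper than comparing the original dense feature vectors, which is what makes hashing attractive at multimedia scale.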
Acknowledgement
This work was partially supported by Australian Research Council Discovery Project (ARC DP190102353).
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Duan, J., Zhang, P., Huang, Z. (2020). Completely Unsupervised Cross-Modal Hashing. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science(), vol 12112. Springer, Cham. https://doi.org/10.1007/978-3-030-59410-7_11
Print ISBN: 978-3-030-59409-1
Online ISBN: 978-3-030-59410-7