research-article

Local Graph Convolutional Networks for Cross-Modal Hashing

Authors:

Zi Huang

MM '21: Proceedings of the 29th ACM International Conference on Multimedia

Pages 1921 - 1928

https://doi.org/10.1145/3474085.3475346

Published: 17 October 2021

Abstract

Cross-modal hashing aims to map data from different modalities into a common binary space to accelerate retrieval. Recently, deep cross-modal hashing methods have shown promising performance by applying deep neural networks to facilitate feature learning. However, existing supervised deep methods rely mainly on the label information of datasets, which is insufficient to characterize the latent structures that exist among different modalities. To mitigate this problem, we propose to use Graph Convolutional Networks (GCNs) to exploit the local structure information of datasets for cross-modal hash learning. Specifically, a local graph is constructed according to the neighborhood relationships between samples in deep feature spaces and fed into GCNs to generate graph embeddings. Then, a within-modality loss is designed to measure the inner products between deep features and graph embeddings so that the hashing networks and GCNs can be jointly optimized. By using GCNs to assist model training, the performance of the hashing networks can be improved. Extensive experiments on benchmarks verify the effectiveness of the proposed method.
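
The pipeline described in the abstract (local graph from feature neighborhoods, GCN embeddings, inner-product loss) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the similarity metric, the layer form, or the exact loss, so the cosine k-nearest-neighbour graph, the single Kipf-and-Welling-style propagation step with a tanh activation, and the pairwise negative log-likelihood below are all assumptions, and the function names `knn_adjacency`, `gcn_layer`, and `within_modality_loss` are hypothetical.

```python
import numpy as np


def knn_adjacency(features, k):
    """Symmetric k-nearest-neighbour graph (cosine similarity is an assumption)."""
    normed = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)            # never pick a sample as its own neighbour
    idx = np.argsort(-sim, axis=1)[:, :k]     # indices of the k most similar samples
    n = features.shape[0]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)             # make the graph undirected


def gcn_layer(adj, features, weight):
    """One propagation step: tanh(D^-1/2 (A + I) D^-1/2 X W).

    The normalisation follows Kipf and Welling; tanh is an assumption made
    here so that the embeddings lie in (-1, 1) like relaxed hash codes.
    """
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.tanh(a_norm @ features @ weight)


def within_modality_loss(hash_feats, graph_emb, adj):
    """Hypothetical pairwise loss on inner products of features and embeddings.

    Neighbouring pairs (and each sample with itself) are pushed towards large
    inner products, all other pairs towards small ones.
    """
    theta = 0.5 * hash_feats @ graph_emb.T
    target = adj + np.eye(adj.shape[0])
    # Negative log-likelihood of a sigmoid model: log(1 + e^theta) - target * theta
    return np.mean(np.logaddexp(0.0, theta) - target * theta)


# Toy usage with random stand-ins for deep features of one modality.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
adj = knn_adjacency(feats, k=3)
hash_feats = np.tanh(feats @ rng.normal(scale=0.1, size=(16, 4)))
graph_emb = gcn_layer(adj, feats, rng.normal(scale=0.1, size=(16, 4)))
print(within_modality_loss(hash_feats, graph_emb, adj))
```

In a real training loop both weight matrices would be learned by backpropagating this loss, which is what allows the hashing network and the GCN to be optimized jointly; the sketch only evaluates the forward pass.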

Index Terms

Local Graph Convolutional Networks for Cross-Modal Hashing
1. Information systems
  1. Information retrieval
    1. Specialized information retrieval

Published In

MM '21: Proceedings of the 29th ACM International Conference on Multimedia

October 2021

5796 pages

ISBN: 9781450386517

DOI: 10.1145/3474085

General Chairs:
Heng Tao Shen (University of Electronic Science and Technology of China, China),
Yueting Zhuang (Zhejiang University, China),
John R. Smith (IBM, USA)

Program Chairs:
Yang Yang (University of Electronic Science and Technology of China, China),
Pablo Cesar (CWI & TU Delft, The Netherlands),
Florian Metze (Facebook, Inc., USA),
Balakrishnan Prabhakaran (University of Texas at Dallas, USA)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM '21: ACM Multimedia Conference

Sponsor: SIGMM

October 20 - 24, 2021

Virtual Event, China

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics

Total Citations: 19
Total Downloads: 524
Downloads (Last 12 months): 74
Downloads (Last 6 weeks): 7

Reflects downloads up to 05 Mar 2025

Cited By

B. Lu, T. Zhao, G. Liang, J. Li, and X. Duan. 2025. Adversarial Graph Convolutional Network Hashing for Cross-Modal Retrieval. In Web and Big Data. APWeb-WAIM 2024 International Workshops. 69--80. Online publication date: 31-Jan-2025.
https://doi.org/10.1007/978-981-96-0055-7_6

K. Han, Y. Liu, R. Wei, K. Zhou, J. Xu, and K. Long. 2024. Supervised Hierarchical Online Hashing for Cross-modal Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications, Vol. 20, 4 (2024), 1--23. Online publication date: 11-Jan-2024.
https://dl.acm.org/doi/10.1145/3632527

Q. Qin, Y. Huo, L. Huang, J. Dai, H. Zhang, and W. Zhang. 2024. Deep Neighborhood-Preserving Hashing With Quadratic Spherical Mutual Information for Cross-Modal Retrieval. IEEE Transactions on Multimedia, Vol. 26 (2024), 6361--6374. Online publication date: 2024.
https://doi.org/10.1109/TMM.2023.3349075

M. Liang, Y. Li, Y. Yu, X. Cao, Z. Xue, A. Li, and K. Lu. 2024. Structures Aware Fine-Grained Contrastive Adversarial Hashing for Cross-Media Retrieval. IEEE Transactions on Knowledge and Data Engineering, Vol. 36, 7 (2024), 3514--3528. Online publication date: Jul-2024.
https://doi.org/10.1109/TKDE.2024.3356258

J. Lu, J. Zhou, Y. Chen, W. Pedrycz, and K. Hung. 2024. Asymmetric Transfer Hashing With Adaptive Bipartite Graph Learning. IEEE Transactions on Cybernetics, Vol. 54, 1 (2024), 533--545. Online publication date: Jan-2024.
https://doi.org/10.1109/TCYB.2022.3232787

T. Wang, F. Li, L. Zhu, J. Li, Z. Zhang, and H. Shen. 2024. Cross-Modal Retrieval: A Systematic Review of Methods and Future Directions. Proceedings of the IEEE, Vol. 112, 11 (2024), 1716--1754. Online publication date: Nov-2024.
https://doi.org/10.1109/JPROC.2024.3525147

Z. Zhang and Y. Chen. 2024. Cross-Modal Semantic Embedding Hashing for Unsupervised Retrieval. In 2024 International Joint Conference on Neural Networks (IJCNN). 1--7. Online publication date: 30-Jun-2024.
https://doi.org/10.1109/IJCNN60899.2024.10651304

Y. Yu, M. Liang, M. Yin, K. Lu, J. Du, and Z. Xue. 2024. Unsupervised Multimodal Graph Contrastive Semantic Anchor Space Dynamic Knowledge Distillation Network for Cross-Media Hash Retrieval. In 2024 IEEE 40th International Conference on Data Engineering (ICDE). 4699--4708. Online publication date: 13-May-2024.
https://doi.org/10.1109/ICDE60146.2024.00357

K. Li, Y. Zhang, F. Wang, G. Liu, and X. Wei. 2024. Deep Feature-Based Neighbor Similarity Hashing With Adversarial Learning for Cross-Modal Retrieval. IEEE Access, Vol. 12 (2024), 128559--128569. Online publication date: 2024.
https://doi.org/10.1109/ACCESS.2024.3413186

G. Song, H. Su, K. Huang, F. Song, and M. Yang. 2024. Deep self-enhancement hashing for robust multi-label cross-modal retrieval. Pattern Recognition, Vol. 147, C (2024). Online publication date: 4-Mar-2024.
https://dl.acm.org/doi/10.1016/j.patcog.2023.110079
