DOI: 10.1145/3474085.3475346
research-article

Local Graph Convolutional Networks for Cross-Modal Hashing

Published: 17 October 2021

Abstract

Cross-modal hashing maps data from different modalities into a common binary space to accelerate retrieval. Recently, deep cross-modal hashing methods have shown promising performance by using deep neural networks for feature learning. However, existing supervised deep methods rely mainly on the label information of datasets, which is insufficient to characterize the latent structures that exist among different modalities. To mitigate this problem, we propose to use Graph Convolutional Networks (GCNs) to exploit the local structure information of datasets for cross-modal hash learning. Specifically, a local graph is constructed according to the neighborhood relationships between samples in deep feature spaces and fed into GCNs to generate graph embeddings. Then, a within-modality loss is designed to measure the inner products between deep features and graph embeddings so that the hashing networks and GCNs can be jointly optimized. Using GCNs to assist training improves the performance of the hashing networks. Extensive experiments on benchmarks verify the effectiveness of the proposed method.
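
The pipeline outlined in the abstract (neighborhood graph in feature space, GCN embedding, inner-product loss) can be illustrated with a rough NumPy sketch for a single modality. The abstract does not give the exact formulation, so everything specific here is an assumption for illustration: the cosine k-nearest-neighbour graph, the single Kipf-Welling style propagation step with a tanh activation, the pairwise logistic form of the loss, the label-derived similarity matrix, and all function names.

```python
import numpy as np


def knn_adjacency(features, k):
    """Symmetric k-nearest-neighbour adjacency from cosine similarity (assumed graph rule)."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)               # a sample is never its own neighbour
    neighbours = np.argsort(-sim, axis=1)[:, :k]  # indices of the k most similar samples
    adj = np.zeros_like(sim)
    rows = np.repeat(np.arange(len(features)), k)
    adj[rows, neighbours.ravel()] = 1.0
    return np.maximum(adj, adj.T)                 # make the graph undirected


def gcn_layer(adj, features, weight):
    """One propagation step: tanh(D^-1/2 (A + I) D^-1/2 X W). The tanh is an assumption."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.tanh(a_norm @ features @ weight)


def within_modality_loss(features, embeddings, sim):
    """Pairwise logistic loss on feature/embedding inner products (assumed form)."""
    theta = 0.5 * features @ embeddings.T
    # log(1 + exp(theta)) - sim * theta, with the log term computed stably
    return np.mean(np.logaddexp(0.0, theta) - sim * theta)


# Toy data: 8 samples, 16-dimensional deep features, 3 binary labels each.
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 16))
labels = rng.integers(0, 2, size=(8, 3))
sim = (labels @ labels.T > 0).astype(float)      # 1 if two samples share a label

adj = knn_adjacency(feats, k=3)
emb = gcn_layer(adj, feats, 0.1 * rng.standard_normal((16, 16)))
loss = within_modality_loss(feats, emb, sim)
print(f"within-modality loss: {loss:.4f}")
```

In a trainable version the features would come from the hashing network and the loss would be backpropagated into both the network and the GCN weights, which is what joint optimization refers to; the sketch only evaluates the forward pass.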

    Published In

    MM '21: Proceedings of the 29th ACM International Conference on Multimedia
    October 2021
    5796 pages
    ISBN:9781450386517
    DOI:10.1145/3474085

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cross-modal retrieval
    2. neighborhood relationship
    3. supervised deep hashing

    Qualifiers

    • Research-article

    Conference

    MM '21
    Sponsor:
    MM '21: ACM Multimedia Conference
    October 20 - 24, 2021
    Virtual Event, China

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
