DOI: 10.1145/3240508.3240684

Dense Auto-Encoder Hashing for Robust Cross-Modality Retrieval

Published: 15 October 2018

Abstract

Cross-modality retrieval, which aims to search images in response to text queries or vice versa, has been widely studied. When faced with large-scale datasets, cross-modality hashing serves as an efficient and effective solution that learns binary codes to approximate cross-modality similarity in the Hamming space. Most recent cross-modality hashing schemes focus on learning hash functions from data instances with complete modalities. However, learning robust binary codes under incomplete modalities (i.e., with one modality missing or only partially observed), which occurs widely in real-world applications, remains unexplored. In this paper, we propose a novel cross-modality hashing scheme, termed Dense Auto-encoder Hashing (DAH), which explicitly imputes the missing modality and produces robust binary codes by leveraging the relatedness among different modalities. To that end, we propose a novel Dense Auto-encoder Network (DAN) to impute the missing modalities, which densely connects each layer to every other layer in a feed-forward fashion. In each layer, a noisy auto-encoder block is designed to calculate the residue between the current prediction and the original data. Finally, a hash layer is appended to the end of DAN, serving as a special binary encoder that handles incomplete modality inputs. Quantitative experiments on three cross-modality visual search benchmarks, i.e., Wiki, NUS-WIDE, and FLICKR-25K, show that the proposed DAH outperforms state-of-the-art approaches.
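The imputation-then-hashing pipeline described in the abstract can be sketched as follows. This is a minimal NumPy illustration under assumed layer sizes and random, untrained weights: each block sees the concatenation of the input and every earlier block's output (dense connectivity), predicts a residual correction to the current estimate of the missing modality, and a final sign-based layer produces binary codes. It illustrates the data flow only, not the authors' actual trained model.

```python
# Minimal sketch of the Dense Auto-encoder Hashing idea: densely connected
# auto-encoder blocks impute a missing modality via residual refinement,
# then a sign-based hash layer emits a +/-1 binary code. All sizes and
# weights below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def block(x_concat, d_out):
    """One 'noisy auto-encoder' block: a random linear map + tanh standing
    in for an encoder-decoder that predicts a residual correction."""
    W = rng.standard_normal((x_concat.shape[-1], d_out)) * 0.1
    return np.tanh(x_concat @ W)

def dense_impute(x_observed, d_missing, n_blocks=3):
    """Impute a missing modality: each block receives the concatenation of
    the observed input and every earlier block's output (dense links)."""
    feats = [x_observed]
    estimate = np.zeros(d_missing)
    for _ in range(n_blocks):
        concat = np.concatenate(feats)
        residual = block(concat, d_missing)  # correction to current estimate
        estimate = estimate + residual       # refine the prediction
        feats.append(residual)               # densely forward to later blocks
    return estimate

def hash_layer(x, n_bits=16):
    """Binary encoder: linear projection followed by the sign function."""
    W = rng.standard_normal((x.shape[-1], n_bits)) * 0.1
    return np.sign(x @ W + 1e-12)

text_feat = rng.standard_normal(64)                  # observed modality (e.g. text)
image_est = dense_impute(text_feat, d_missing=128)   # imputed missing modality
code = hash_layer(np.concatenate([text_feat, image_est]))
print(code.shape, set(np.unique(code)))
```

In the paper's actual model the blocks are trained denoising auto-encoders and the hash layer is learned end-to-end; the sign nonlinearity here simply shows how continuous features become Hamming-space codes.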




    Published In

MM '18: Proceedings of the 26th ACM International Conference on Multimedia
    October 2018
    2167 pages
    ISBN:9781450356657
    DOI:10.1145/3240508
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. binary code learning
    2. cross-modal search
    3. hash learning
4. large-scale image retrieval

    Qualifiers

    • Research-article

    Funding Sources

    • National Key R&D Program
    • Post Doctoral Innovative Talent Support Program
• Natural Science Foundation of Fujian Province, China
    • Scientific Research Project of National Language Committee of China
    • China Post-Doctoral Science Foundation
• Natural Science Foundation of China

    Conference

    MM '18
    Sponsor:
    MM '18: ACM Multimedia Conference
    October 22 - 26, 2018
    Seoul, Republic of Korea

    Acceptance Rates

    MM '18 Paper Acceptance Rate 209 of 757 submissions, 28%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

• Downloads (last 12 months): 46
• Downloads (last 6 weeks): 0
Reflects downloads up to 16 Feb 2025


    Cited By

• (2024) FedCAFE: Federated Cross-Modal Hashing with Adaptive Feature Enhancement. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 9670-9679. DOI: 10.1145/3664647.3681319. Online publication date: 28 Oct 2024.
• (2024) Multi-Modal Hashing for Efficient Multimedia Retrieval: A Survey. IEEE Transactions on Knowledge and Data Engineering, 36(1), pp. 239-260. DOI: 10.1109/TKDE.2023.3282921. Online publication date: Jan 2024.
• (2024) Semantic Reconstruction Guided Missing Cross-modal Hashing. 2024 International Joint Conference on Neural Networks (IJCNN), pp. 1-8. DOI: 10.1109/IJCNN60899.2024.10650222. Online publication date: 30 Jun 2024.
• (2023) Incomplete Cross-Modal Retrieval with Deep Correlation Transfer. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3637442. Online publication date: 13 Dec 2023.
• (2023) Graph Convolutional Incomplete Multi-modal Hashing. Proceedings of the 31st ACM International Conference on Multimedia, pp. 7029-7037. DOI: 10.1145/3581783.3612282. Online publication date: 26 Oct 2023.
• (2023) Partial Multi-Modal Hashing via Neighbor-Aware Completion Learning. IEEE Transactions on Multimedia, 25, pp. 8499-8510. DOI: 10.1109/TMM.2023.3238308. Online publication date: 2023.
• (2023) Teacher-Student Learning: Efficient Hierarchical Message Aggregation Hashing for Cross-Modal Retrieval. IEEE Transactions on Multimedia, 25, pp. 4520-4532. DOI: 10.1109/TMM.2022.3177901. Online publication date: 2023.
• (2023) CLIP-based fusion-modal reconstructing hashing for large-scale unsupervised cross-modal retrieval. International Journal of Multimedia Information Retrieval, 12(1). DOI: 10.1007/s13735-023-00268-7. Online publication date: 22 Feb 2023.
• (2023) Pseudo-label driven deep hashing for unsupervised cross-modal retrieval. International Journal of Machine Learning and Cybernetics, 14(10), pp. 3437-3456. DOI: 10.1007/s13042-023-01842-5. Online publication date: 11 May 2023.
• (2022) Deep Adaptively-Enhanced Hashing With Discriminative Similarity Guidance for Unsupervised Cross-Modal Retrieval. IEEE Transactions on Circuits and Systems for Video Technology, 32(10), pp. 7255-7268. DOI: 10.1109/TCSVT.2022.3172716. Online publication date: Oct 2022.
