research-article

Supervised Robust Discrete Multimodal Hashing for Cross-Media Retrieval

Authors:
Ting-Kun Yan, Shandong University, Jinan, China
Xin-Shun Xu, Shandong University, Jinan, China
Shanqing Guo, Shandong University, Jinan, China
Zi Huang, The University of Queensland, Queensland, Australia
Xiao-Lin Wang, Shandong University, Jinan, China

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, October 2016, Pages 1271–1280, https://doi.org/10.1145/2983323.2983743

Published: 24 October 2016

ABSTRACT

Recently, multimodal hashing techniques have received considerable attention due to their low storage cost and fast query speed for multimodal data retrieval. Many methods have been proposed; however, several problems still need further consideration. For example, some methods use only a similarity matrix to learn hash functions, which discards useful information contained in the original data; some relax the binary constraints, or separate the learning of hash functions and binary codes into two independent stages, to bypass the difficulty of handling discrete constraints during optimization, which may produce large quantization error; and some are not robust to noise. All of these problems may degrade the performance of a model. To address them, in this paper we propose a novel supervised hashing framework for cross-modal retrieval, Supervised Robust Discrete Multimodal Hashing (SRDMH). Specifically, SRDMH tries to make the final binary codes preserve the same label information as the original data, so that it can leverage more label information to supervise the learning of binary codes. In addition, it learns hash functions and binary codes directly instead of relaxing the binary constraints, so as to avoid large quantization error. Moreover, to make the model robust and easy to solve, we further integrate a flexible $\ell_{2,p}$ loss with nonlinear kernel embedding and an intermediate representation of each instance. Finally, an alternating algorithm is proposed to solve the optimization problem in SRDMH. Extensive experiments are conducted on three benchmark data sets. The results demonstrate that SRDMH outperforms or is comparable to several state-of-the-art methods on the cross-modal retrieval task.
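
The abstract names three building blocks without giving formulas: an $\ell_{2,p}$ loss, a nonlinear kernel embedding, and binary codes compared by Hamming distance. The sketch below illustrates generic textbook versions of these pieces only; it is not the SRDMH objective or its alternating solver. The function names, the choice of an RBF kernel with anchor points, the rows-as-samples convention, and the rule that zero maps to +1 are all assumptions made for illustration.

```python
import math

def l2p_loss(residual, p=1.0):
    """Sum over rows of (Euclidean norm of the row) ** p.

    Assumes each row is one sample's residual. With p = 2 this is the
    squared Frobenius norm; smaller p down-weights large (outlier) rows.
    """
    return sum(math.sqrt(sum(v * v for v in row)) ** p for row in residual)

def rbf_embed(x, anchors, sigma=1.0):
    """Map x to its RBF similarities to a list of anchor points (assumed kernel)."""
    return [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, anchor)) / (2 * sigma ** 2))
        for anchor in anchors
    ]

def sign_code(values):
    """Binarize real values to {-1, +1}; zero maps to +1 by convention here."""
    return [1 if v >= 0 else -1 for v in values]

def hamming(code_a, code_b):
    """Number of positions where two equal-length codes differ."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)

# Residual rows with norms 5, 0 and 50; the last row plays the role of a noisy sample.
residual = [[3, 4], [0, 0], [30, 40]]
loss_p2 = l2p_loss(residual, p=2)  # 25 + 0 + 2500
loss_p1 = l2p_loss(residual, p=1)  # 5 + 0 + 50
```

In this toy residual the noisy row accounts for 2500 of 2525 under p = 2 but only 50 of 55 under p = 1, which is the usual intuition for why a loss with p below 2 is considered more robust to noise.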

Index Terms

Supervised Robust Discrete Multimodal Hashing for Cross-Media Retrieval
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Visual content-based indexing and retrieval
  2. Machine learning
    1. Learning paradigms
      1. Supervised learning

Published in
CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
October 2016
2566 pages
ISBN: 9781450340731
DOI: 10.1145/2983323
General Chairs: Snehasis Mukhopadhyay (Indiana University Purdue University Indianapolis, USA), ChengXiang Zhai (University of Illinois at Urbana-Champaign, USA)
Program Chairs: Elisa Bertino (Purdue University), Fabio Crestani (University of Lugano), Javed Mostafa (University of North Carolina), Jie Tang (Tsinghua University), Luo Si (Alibaba Group Inc & Purdue University), Xiaofang Zhou (University of Queensland), Yi Chang (Yahoo Research), Yunyao Li (IBM Research - Almaden), Parikshit Sondhi (WalmartLabs)
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
approximate nearest neighbor search
cross-media retrieval
discrete hashing
learning to hash
multimodal hashing
Acceptance Rates
CIKM '16 Paper Acceptance Rate: 160 of 701 submissions, 23%. Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%.