DOI: 10.1145/2983323.2983743
research-article

Supervised Robust Discrete Multimodal Hashing for Cross-Media Retrieval

Published: 24 October 2016

ABSTRACT

Recently, multimodal hashing techniques have received considerable attention due to their low storage cost and fast query speed for multimodal data retrieval. Many methods have been proposed; however, several problems still need further consideration. For example, some methods use only a similarity matrix to learn hash functions, which discards useful information contained in the original data; some relax the binary constraints, or separate the learning of hash functions and binary codes into two independent stages, to bypass the difficulty of handling discrete constraints during optimization, which may produce large quantization error; and some are not robust to noise. All of these problems may degrade the performance of a model. To address them, in this paper we propose a novel supervised hashing framework for cross-modal retrieval, Supervised Robust Discrete Multimodal Hashing (SRDMH). Specifically, SRDMH tries to make the final binary codes preserve the same label information as the original data, so that it can leverage more label information to supervise binary code learning. In addition, it learns hash functions and binary codes directly instead of relaxing the binary constraints, so as to avoid large quantization error. Moreover, to make it robust and easy to solve, we further integrate a flexible l2,p loss with nonlinear kernel embedding and an intermediate representation of each instance. Finally, an alternating algorithm is proposed to solve the optimization problem in SRDMH. Extensive experiments are conducted on three benchmark data sets. The results demonstrate that the proposed method (SRDMH) outperforms or is comparable to several state-of-the-art methods on the cross-modal retrieval task.
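
To make two of the ingredients named in the abstract concrete, the following is a minimal sketch and not the SRDMH implementation. It assumes the l2,p loss is taken as the sum over rows of a residual matrix of each row's Euclidean norm raised to the power p (row-wise versus column-wise is a convention chosen here), and it assumes binary codes are stored as Python integers so that retrieval reduces to ranking by Hamming distance. The function names l2p_loss, hamming_distance, and rank_by_hamming are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def l2p_loss(residual_rows: Sequence[Sequence[float]], p: float = 1.0) -> float:
    """Sum of row-wise Euclidean norms, each raised to the power p.

    Assumption: one row per instance. With p = 2 this equals the squared
    Frobenius norm; with p < 2 a large residual row contributes less than
    it would under a squared loss.
    """
    return sum(math.sqrt(sum(v * v for v in row)) ** p for row in residual_rows)


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two integer-encoded codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, database_codes: Sequence[int]) -> List[int]:
    """Database indices ordered from nearest to farthest code (stable on ties)."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


if __name__ == "__main__":
    # Hypothetical 4-bit codes: the query could come from one modality
    # (for example text) and the database from another (for example images).
    print(l2p_loss([[3.0, 4.0], [0.0, 0.0]], p=1.0))
    print(rank_by_hamming(0b1011, [0b1011, 0b0100, 0b1010]))
```

Storing codes as integers keeps the distance computation to a single XOR followed by a bit count, which illustrates why binary codes give the low storage cost and fast query speed mentioned above.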


      • Published in

        CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
        October 2016
        2566 pages
        ISBN: 9781450340731
        DOI: 10.1145/2983323

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        CIKM '16 Paper Acceptance Rate: 160 of 701 submissions, 23%
        Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
