DOI: 10.1145/1937728.1937730
research-article

Error-correcting output hashing in fast similarity search

Published: 30 December 2010

Abstract

Fast similarity search is a key technique in many large-scale learning and data mining applications. Recently, hashing-based methods, which create compact and efficient codes that preserve the data distribution, have received considerable attention because of their promising theoretical and empirical results. An ideal hashing method 1) extends naturally to out-of-sample data; 2) has very low computational complexity; and 3) significantly improves on linear search in the original space in terms of accuracy. However, most existing hashing methods fail to satisfy all three requirements. In this paper, we propose a new method called Error-correcting Output Hashing (ECOH), which meets all three. ECOH first groups the samples into clusters using a conventional clustering algorithm. Each cluster is assigned an Error-Correcting Output Code (ECOC), and the linear mappings from the sample vectors to the ECOC are then learned using linear regression models. In this way, ECOH learns both the binary code for each sample and the function that links an input vector to its output code. Experimental results on real-world data sets demonstrate the effectiveness and efficiency of the proposed approach.
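
The pipeline described in the abstract (cluster, assign one code per cluster, regress from inputs to codes, then hash by thresholding) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say which clustering algorithm, which code construction, or which thresholding rule is used, so k-means, a Hadamard-based codebook, and a sign threshold are all assumptions here, and every function name (`kmeans`, `hadamard_codebook`, `fit_ecoh`, `hash_codes`, `hamming_search`) is hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation (assumed clusterer)."""
    centers = [X[0]]
    for _ in range(k - 1):
        # distance of every sample to its closest already-chosen center
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def hadamard_codebook(k):
    """One +/-1 codeword per cluster (k >= 2), taken from a Sylvester Hadamard matrix.

    The all-ones first column is dropped; any two codewords then differ in
    exactly n/2 positions, where n is the Hadamard order. Assumed construction.
    """
    H = np.array([[1]])
    while H.shape[0] < k:
        H = np.kron(H, np.array([[1, 1], [1, -1]]))
    return H[:k, 1:].astype(float)

def add_bias(X):
    return np.hstack([X, np.ones((len(X), 1))])

def fit_ecoh(X, k):
    """Cluster, assign a codeword per cluster, fit a least-squares linear map."""
    labels = kmeans(X, k)
    Y = hadamard_codebook(k)[labels]  # regression target: codeword of each sample's cluster
    W, *_ = np.linalg.lstsq(add_bias(X), Y, rcond=None)
    return W

def hash_codes(X, W):
    """Out-of-sample hashing: apply the linear map and threshold at zero."""
    return (add_bias(X) @ W > 0).astype(np.uint8)

def hamming_search(query_code, db_codes, top=5):
    """Indices of the `top` database codes closest to the query in Hamming distance."""
    dist = (db_codes != query_code).sum(axis=1)
    return np.argsort(dist, kind="stable")[:top]

# Synthetic demo data (not from the paper): four tight blobs in 8 dimensions.
rng = np.random.default_rng(0)
blob = np.repeat(np.arange(4), 50)
X = 10.0 * np.eye(8)[blob] + 0.1 * rng.standard_normal((200, 8))
W = fit_ecoh(X, k=4)
codes = hash_codes(X, W)
print(codes[:3])
print(hamming_search(codes[0], codes, top=5))
```

Because hashing a new vector is a single matrix product followed by a threshold, the same `W` applies to unseen samples, which is what the out-of-sample requirement in the abstract asks for. On real data the regression will not reproduce the cluster codewords exactly, and the separation between codewords is what gives nearby samples a chance of still landing on nearby codes.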

Cited By

  • (2014) Supervised hashing with error correcting codes. Proceedings of the 22nd ACM international conference on Multimedia, pages 785-788. DOI: 10.1145/2647868.2654963. Online publication date: 3-Nov-2014
  • (2014) Fast and Accurate Hashing Via Iterative Nearest Neighbors Expansion. IEEE Transactions on Cybernetics 44:11, pages 2167-2177. DOI: 10.1109/TCYB.2014.2302018. Online publication date: Nov-2014
  • (2014) Density Sensitive Hashing. IEEE Transactions on Cybernetics 44:8, pages 1362-1371. DOI: 10.1109/TCYB.2013.2283497. Online publication date: Aug-2014

      Published In

      ICIMCS '10: Proceedings of the Second International Conference on Internet Multimedia Computing and Service
      December 2010
      218 pages
      ISBN: 9781450304603
      DOI: 10.1145/1937728
      General Chairs: Yong Rui, Klara Nahrstedt, Xiaofei Xu
      Program Chairs: Hongxun Yao, Shuqiang Jiang, Jian Cheng
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. error-correcting output code
      2. semantic hashing
      3. similarity search

      Qualifiers

      • Research-article

      Conference

      ICIMCS '10

      Acceptance Rates

      Overall Acceptance Rate 163 of 456 submissions, 36%

      Cited By

      • (2013) Semi-Supervised Nonlinear Hashing Using Bootstrap Sequential Projection Learning. IEEE Transactions on Knowledge and Data Engineering 25:6, pages 1380-1393. DOI: 10.1109/TKDE.2012.76. Online publication date: 1-Jun-2013
