Abstract
Algorithms to rapidly search massive image or video collections are critical for many vision applications, including visual search, content-based retrieval, and non-parametric models for object recognition. Recent work shows that learned binary projections are a powerful way to index large collections according to their content. The basic idea is to formulate the projections so as to approximately preserve a given similarity function of interest. Having done so, one can then search the data efficiently using hash tables, or by exploring the Hamming ball volume around a novel query. Both enable sub-linear time retrieval with respect to the database size. Further, depending on the design of the projections, in some cases it is possible to bound the number of database examples that must be searched in order to achieve a given level of accuracy.
This chapter overviews data structures for fast search with binary codes, and then describes several supervised and unsupervised strategies for generating the codes. In particular, we review supervised methods that integrate metric learning, boosting, and neural networks into the hash key construction, and unsupervised methods based on spectral analysis or kernelized random projections that compute affinity-preserving binary codes. Whether learning from explicit semantic supervision or exploiting the structure among unlabeled data, these methods make scalable retrieval possible for a variety of robust visual similarity measures. We focus on defining the algorithms, and illustrate the main points with results using millions of images.
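The indexing idea summarized in the abstract (binary projections, then lookup in a hash table or within a Hamming ball around the query) can be illustrated with a minimal sketch. It is not one of the chapter's learned methods: it assumes plain random hyperplane projections, an unlearned baseline that approximately preserves cosine similarity, and all function names (`make_hyperplanes`, `encode`, `build_table`, `hamming_ball_lookup`) are hypothetical, chosen only for this example.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes, one per code bit (assumed baseline)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def encode(x, hyperplanes):
    """Binary code packed into an int: bit i is 1 if x is on the positive side of plane i."""
    code = 0
    for i, w in enumerate(hyperplanes):
        if sum(wi * xi for wi, xi in zip(w, x)) >= 0.0:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def build_table(database, hyperplanes):
    """Hash table mapping each binary code to the indices of the items that share it."""
    table = defaultdict(list)
    for idx, x in enumerate(database):
        table[encode(x, hyperplanes)].append(idx)
    return table


def hamming_ball_lookup(query, table, hyperplanes, radius):
    """Return indices of items whose code is within `radius` bit flips of the query code."""
    q = encode(query, hyperplanes)
    num_bits = len(hyperplanes)
    hits = []
    for r in range(radius + 1):
        # Enumerate every code at Hamming distance exactly r from the query code.
        for flips in combinations(range(num_bits), r):
            probe = q
            for bit in flips:
                probe ^= 1 << bit
            hits.extend(table.get(probe, []))
    return hits


if __name__ == "__main__":
    rng = random.Random(1)
    planes = make_hyperplanes(num_bits=16, dim=8)
    db = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(1000)]
    table = build_table(db, planes)
    candidates = hamming_ball_lookup(db[0], table, planes, radius=2)
    print(len(candidates), "candidates examined out of", len(db))
```

The lookup touches only the buckets inside the Hamming ball, so the number of probes depends on the code length and radius rather than on the database size. A learned scheme of the kind the chapter reviews would replace `make_hyperplanes` and `encode` while leaving the table and lookup unchanged.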
© 2013 Springer Berlin Heidelberg
About this chapter
Cite this chapter
Grauman, K., Fergus, R. (2013). Learning Binary Hash Codes for Large-Scale Image Search. In: Cipolla, R., Battiato, S., Farinella, G. (eds) Machine Learning for Computer Vision. Studies in Computational Intelligence, vol 411. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28661-2_3
DOI: https://doi.org/10.1007/978-3-642-28661-2_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28660-5
Online ISBN: 978-3-642-28661-2
eBook Packages: Engineering, Engineering (R0)