Abstract
Approximate near neighbor search plays a critical role in various kinds of multimedia applications. The vocabulary-based hashing scheme uses vocabularies, i.e. selected sets of feature points, to define a hash function family. The function family can be employed to build an approximate near neighbor search index. The critical problem in vocabulary-based hashing is the criteria of choosing vocabularies. This paper proposes a approach to greedily choosing vocabularies via Adaboost. An index quality criterion is designed for the AdaBoost approach to adjust the weight of the training data. We also describe the parallelized version of the index for large scale applications. The promising results of the near-duplicate image detection experiments show the efficiency of the new vocabulary construction algorithm and desired qualities of the parallelized vocabulary-based hashing for large scale applications.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Flickr, http://www.flickr.com
Holidays dataset, http://lear.inrialpes.fr/people/jegou/data.php
Imagemagick, http://www.imagemagick.org
Arya, S., Mount, D., Netanyahu, N., Silverman, R., Wu, A.: An optimal algorithm for approximate nearest neighbor searching fixed dimensions. J. ACM (1998)
Datar, M., Immorlica, N., Indyk, P., Mirrokni, V.S.: Locality-sensitive hashing scheme based on p-stable distributions. In: SCG (2004)
Douze, M., Jégou, H., Singh, H., Amsaleg, L., Schmid, C.: Evaluation of gist descriptors for web-scale image search. In: CIVR. ACM, New York (2009)
Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. of Computer and System Sciences (1997)
Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing. In: VLDB (1999)
Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: STOC (1998)
Jégou, H., Douze, M., Schmid, C.: Hamming embedding and weak geometric consistency for large scale image search. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 304–317. Springer, Heidelberg (2008)
Ke, Y., Sukthankar, R., Huston, L.: Efficient near-duplicate detection and sub-image retrieval. In: MM (2004)
Liang, Y., Li, J., Zhang, B.: Vocabulary-based hashing for image search. In: MM (to appear, 2009)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. In: IJCV (2004)
Meng, Y., Chang, E., Li, B.: Enhancing dpf for near-replica image recognition. In: Proceedings of IEEE Computer Vision and Pattern Recognition (2003)
Nister, D., Stewenius, H.: Scalable recognition with a vocabulary tree. In: CVPR (2006)
Poullot, S., Buisson, O., Crucianu, M.: Z-grid-based probabilistic retrieval for scaling up content-based copy detection. In: CIVR (2007)
Sivic, J., Zisserman, A.: Video Google: A text retrieval approach to object matching in videos. In: ICCV (2003)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liang, Y., Li, J., Zhang, B. (2010). Learning Vocabulary-Based Hashing with AdaBoost. In: Boll, S., Tian, Q., Zhang, L., Zhang, Z., Chen, YP.P. (eds) Advances in Multimedia Modeling. MMM 2010. Lecture Notes in Computer Science, vol 5916. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11301-7_54
Download citation
DOI: https://doi.org/10.1007/978-3-642-11301-7_54
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11300-0
Online ISBN: 978-3-642-11301-7
eBook Packages: Computer ScienceComputer Science (R0)