Learning Vocabulary-Based Hashing with AdaBoost

Liang, Yingyu; Li, Jianmin; Zhang, Bo

doi:10.1007/978-3-642-11301-7_54

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5916)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Approximate near neighbor search plays a critical role in many multimedia applications. The vocabulary-based hashing scheme uses vocabularies, i.e., selected sets of feature points, to define a hash function family, which can be employed to build an approximate near neighbor search index. The critical problem in vocabulary-based hashing is the criterion for choosing vocabularies. This paper proposes an approach that greedily chooses vocabularies via AdaBoost. An index quality criterion is designed for the AdaBoost approach to adjust the weights of the training data. We also describe a parallelized version of the index for large-scale applications. The promising results of the near-duplicate image detection experiments show the efficiency of the new vocabulary construction algorithm and the desired qualities of the parallelized vocabulary-based hashing for large-scale applications.
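
The abstract does not define the hash functions or the index quality criterion, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes that a vocabulary hashes a point to the index of its nearest word, that training data are labeled point pairs (near or far), and that a pair is handled correctly when near pairs collide and far pairs do not. The reweighting is the standard discrete AdaBoost update, standing in for the paper's index quality criterion. All function names (`nearest_word`, `hash_code`, `select_vocabularies`) are hypothetical.

```python
import math


def nearest_word(point, vocabulary):
    """Index of the word in `vocabulary` closest to `point` (squared Euclidean)."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, vocabulary[i])),
    )


def hash_code(point, vocabularies):
    """Assumed hash: one nearest-word index per vocabulary, concatenated."""
    return tuple(nearest_word(point, v) for v in vocabularies)


def handles_pair(vocabulary, pair):
    """True if near pairs collide and far pairs are separated (assumed criterion)."""
    p, q, is_near = pair
    collide = nearest_word(p, vocabulary) == nearest_word(q, vocabulary)
    return collide == is_near


def weighted_error(vocabulary, pairs, weights):
    """Total weight of the training pairs the vocabulary handles incorrectly."""
    return sum(w for w, pair in zip(weights, pairs)
               if not handles_pair(vocabulary, pair))


def select_vocabularies(pairs, candidates, rounds):
    """Greedily pick candidate indices, reweighting pairs as in discrete AdaBoost."""
    weights = [1.0 / len(pairs)] * len(pairs)
    remaining = list(range(len(candidates)))
    chosen = []
    for _ in range(min(rounds, len(candidates))):
        # Greedy step: take the candidate with the lowest weighted error.
        best = min(remaining,
                   key=lambda i: weighted_error(candidates[i], pairs, weights))
        err = weighted_error(candidates[best], pairs, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        # Increase the weight of mishandled pairs, decrease the rest, renormalize.
        weights = [
            w * math.exp(-alpha if handles_pair(candidates[best], pair) else alpha)
            for w, pair in zip(weights, pairs)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Because each round raises the weight of the pairs that the latest vocabulary mishandled, the next greedy pick is pushed toward a vocabulary that separates or merges exactly those pairs. Removing a chosen candidate from the pool is a design choice of this sketch so that the index never contains the same vocabulary twice.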

Author information

Authors and Affiliations

State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
Yingyu Liang, Jianmin Li & Bo Zhang

Editor information

Editors and Affiliations

University of Oldenburg, Germany
Susanne Boll
University of Texas at San Antonio, San Antonio, TX, USA
Qi Tian
Microsoft Research Asia, Beijing, P.R. China
Lei Zhang
Southwest University, Beibei, Chongqing, China
Zili Zhang
School of Engineering and Information Technology, Deakin University, 221 Burwood Highway, Vic, 3125, Australia
Yi-Ping Phoebe Chen

About this paper

Cite this paper

Liang, Y., Li, J., Zhang, B. (2010). Learning Vocabulary-Based Hashing with AdaBoost. In: Boll, S., Tian, Q., Zhang, L., Zhang, Z., Chen, YP.P. (eds) Advances in Multimedia Modeling. MMM 2010. Lecture Notes in Computer Science, vol 5916. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11301-7_54

DOI: https://doi.org/10.1007/978-3-642-11301-7_54
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11300-0
Online ISBN: 978-3-642-11301-7
eBook Packages: Computer Science, Computer Science (R0)
