Classemes: A Compact Image Descriptor for Efficient Novel-Class Recognition and Search

Torresani, Lorenzo; Szummer, Martin; Fitzgibbon, Andrew

doi:10.1007/978-3-642-44907-9_5


Part of the book series: Studies in Computational Intelligence (SCI, volume 532)


Abstract

In this chapter we review the problem of object class recognition in large image collections. We focus specifically on scenarios where the classes to be recognized are not known in advance. The motivating application is “object-class search by example”, where a user provides at query time a small set of training images defining an arbitrary novel category and the system must retrieve images belonging to this class from a large database. This setting poses challenging requirements on the system design: the object classifier must be learned efficiently at query time from few examples; recognition must have low computational cost with respect to the database size; and compact image descriptors must be used to allow storage of large collections in memory. We review a method that addresses these requirements by learning a compact image descriptor, called classemes, that yields good categorization accuracy even with efficient linear classifiers. We also study how data structures and methods from text retrieval can be adapted to enable efficient search for an object class in collections of several million images.
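The query-time workflow summarized in the abstract (train a linear classifier from a few examples, then score every database descriptor with a dot product) can be illustrated with a minimal sketch. This is not the authors' implementation: the 4-bit descriptors, the toy data, and the helper names `train_perceptron` and `top_k` are hypothetical, and a plain perceptron stands in for whichever linear classifier the chapter actually uses. How the classemes descriptor itself is learned is not shown.

```python
import heapq
from typing import List, Sequence, Tuple

Descriptor = Sequence[int]  # hypothetical compact binary descriptor


def train_perceptron(
    examples: List[Tuple[Descriptor, int]], epochs: int = 20
) -> Tuple[List[float], float]:
    """Fit a linear classifier (weights, bias) with the perceptron rule.

    Labels are +1 (query class) or -1 (background). The perceptron is an
    assumed stand-in for an efficient linear classifier.
    """
    dim = len(examples[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in examples:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified: shift the hyperplane
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b


def top_k(database: List[Descriptor], w: List[float], b: float, k: int) -> List[int]:
    """Return indices of the k highest-scoring descriptors (one dot product each)."""
    scores = [sum(wi * xi for wi, xi in zip(w, x)) + b for x in database]
    return heapq.nlargest(k, range(len(database)), key=scores.__getitem__)


# Toy data (made up): the first two bits fire for the query class.
train = [
    ([1, 1, 0, 0], +1),
    ([1, 1, 0, 1], +1),
    ([0, 0, 1, 1], -1),
    ([0, 0, 1, 0], -1),
]
database = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 1, 0], [1, 1, 1, 0]]

w, b = train_perceptron(train)
ranked = top_k(database, w, b, k=2)
print(ranked)  # [1, 3]
```

The exhaustive scan above costs one dot product per database image; the text-retrieval data structures mentioned in the abstract aim to reduce that cost further and are not modeled here.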



Author information

Authors and Affiliations

6211 Sudikoff Lab, Dartmouth College, Hanover, NH, 03755, U.S.A.
Lorenzo Torresani
Microsoft Research, 7 JJ Thomson Avenue, Cambridge, CB3 0FB, U.K.
Martin Szummer & Andrew Fitzgibbon

Corresponding author

Correspondence to Lorenzo Torresani.

Editor information

Editors and Affiliations

University of Cambridge Department of Engineering, Cambridge, United Kingdom
Roberto Cipolla
Università di Catania Dipartimento di Matematica e Informatica, Catania, Italy
Sebastiano Battiato
Università di Catania Dipartimento di Matematica e Informatica, Catania, Italy
Giovanni Maria Farinella


About this chapter

Cite this chapter

Torresani, L., Szummer, M., Fitzgibbon, A. (2014). Classemes: A Compact Image Descriptor for Efficient Novel-Class Recognition and Search. In: Cipolla, R., Battiato, S., Farinella, G. (eds) Registration and Recognition in Images and Videos. Studies in Computational Intelligence, vol 532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-44907-9_5


DOI: https://doi.org/10.1007/978-3-642-44907-9_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-44906-2
Online ISBN: 978-3-642-44907-9
eBook Packages: Engineering, Engineering (R0)
