short-paper

Refining local descriptors by embedding semantic information for visual categorization

Authors:

Xiangyang Xue

MM '11: Proceedings of the 19th ACM international conference on Multimedia

Pages 1381 - 1384

https://doi.org/10.1145/2072298.2072020

Published: 28 November 2011

Abstract

Local descriptor extraction and vector quantization are key components of the widely used Bag-of-Features (BoF) model for visual categorization. This paper proposes a simple and efficient approach that refines the local descriptors prior to vector quantization by embedding semantic information. The original local descriptors are integrated through a sequence of category-independent and category-dependent bases. In particular, the category-dependent basis is learned by minimizing a joint loss over local descriptors from different categories under a shared regularization penalty, which can be formulated as a linear programming problem. The transferred descriptors are then quantized and aggregated into the visual vocabulary. Experiments on the PASCAL VOC 2007 benchmark, together with quantitative comparisons against several state-of-the-art approaches, demonstrate the effectiveness of the proposed approach.
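The pipeline described in the abstract (refine descriptors, quantize, aggregate) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of concatenated linear projections, the random matrices standing in for the two bases, and the toy codebook are all assumptions made for illustration. In particular, the category-dependent basis would be learned by the linear program mentioned above, which is not reproduced here.

```python
import numpy as np

def refine_descriptors(X, B_shared, B_category):
    """Project descriptors onto both bases and concatenate the results.
    Concatenation is an assumption; the paper may combine them differently."""
    return np.hstack([X @ B_shared, X @ B_category])

def quantize(Z, codebook):
    """Hard-assign each refined descriptor to its nearest visual word."""
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((Z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def bof_histogram(words, vocab_size):
    """Aggregate word assignments into an L1-normalized BoF histogram."""
    hist = np.bincount(words, minlength=vocab_size).astype(float)
    return hist / max(hist.sum(), 1.0)

rng = np.random.default_rng(0)
X = rng.random((200, 128))                  # stand-in for 200 SIFT-like descriptors
B_shared = rng.standard_normal((128, 16))   # hypothetical category-independent basis
B_category = rng.standard_normal((128, 8))  # hypothetical category-dependent basis (random here, not learned)
Z = refine_descriptors(X, B_shared, B_category)
codebook = Z[rng.choice(len(Z), size=32, replace=False)]  # toy vocabulary sampled from the data
hist = bof_histogram(quantize(Z, codebook), vocab_size=32)
```

In a real system the codebook would typically come from clustering refined descriptors over a training set, and the resulting histograms would be fed to a classifier.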

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
2. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing

Published In

MM '11: Proceedings of the 19th ACM international conference on Multimedia

November 2011

944 pages

ISBN:9781450306164

DOI:10.1145/2072298

General Chairs: K. Selçuk Candan (Arizona State University, USA), Sethuraman Panchanathan (Arizona State University, USA), Balakrishnan Prabhakaran (University of Texas at Dallas, USA)

Program Chairs: Hari Sundaram (Arizona State University, USA), Wu-Chi Feng (Portland State University, USA), Nicu Sebe (University of Trento, Italy)
Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Conference

MM '11: ACM Multimedia Conference

November 28 - December 1, 2011

Scottsdale, Arizona, USA

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
