poster

Modeling local descriptors with multivariate gaussians for object and scene recognition

Authors:
Giuseppe Serra, Costantino Grana, Marco Manfredi, Rita Cucchiara

Università degli Studi di Modena e Reggio Emilia, Modena, Italy

MM '13: Proceedings of the 21st ACM international conference on Multimedia, October 2013, Pages 709–712
https://doi.org/10.1145/2502081.2502185

Published: 21 October 2013

ABSTRACT

Common techniques represent images by quantizing local descriptors and summarizing their distribution in a histogram. In this paper we propose a parametric description instead and compare its capabilities with those of histogram-based approaches. We model the SIFT descriptors, extracted by dense sampling over a spatial pyramid, with a multivariate Gaussian distribution. Each distribution is converted to a high-dimensional descriptor by concatenating the mean vector with the projection of the covariance matrix onto the Euclidean space tangent to the Riemannian manifold of symmetric positive definite matrices. Experiments on Caltech-101 and ImageCLEF2011 are performed using a Stochastic Gradient Descent solver, which makes it possible to handle large-scale datasets and high-dimensional feature spaces.
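
The descriptor construction summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the tangent space is taken at the identity matrix (so the projection is the matrix logarithm), that the symmetric result is vectorized from its upper triangle with off-diagonal entries scaled by sqrt(2), and that a small ridge `eps` is added to keep the covariance positive definite. The function name `gaussian_descriptor` and the parameter `eps` are hypothetical.

```python
import numpy as np

def gaussian_descriptor(X, eps=1e-6):
    """Encode a set of local descriptors as [mean, vec(log(cov))].

    X   : (n, d) array, one local descriptor (e.g. SIFT) per row, n >= 2.
    eps : assumed ridge added to the covariance so its logarithm exists.
    Returns a vector of length d + d * (d + 1) / 2.
    """
    X = np.asarray(X, dtype=float)
    d = X.shape[1]

    # Parameters of the multivariate Gaussian fitted to the descriptors.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + eps * np.eye(d)

    # Matrix logarithm of a symmetric positive definite matrix through its
    # eigendecomposition: log(C) = V diag(log w) V^T.
    # Assumption: the tangent space is taken at the identity.
    w, V = np.linalg.eigh(cov)
    log_cov = (V * np.log(w)) @ V.T

    # Keep the upper triangle only; scaling off-diagonal entries by sqrt(2)
    # preserves the Frobenius norm of the symmetric matrix.
    rows, cols = np.triu_indices(d)
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return np.concatenate([mean, scale * log_cov[rows, cols]])
```

Under these assumptions, a region described by 128-dimensional SIFT vectors would yield 128 + 128 * 129 / 2 = 8384 values, and the vectors of all spatial pyramid regions would be concatenated before being passed to a linear classifier.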

Index Terms

1. Information systems
  1. Information retrieval
    1. Document representation

Published in
MM '13: Proceedings of the 21st ACM international conference on Multimedia
October 2013
1166 pages
ISBN: 9781450324045
DOI: 10.1145/2502081
General Chairs: Alejandro (Alex) Jaimes (Yahoo!, Spain), Nicu Sebe (University of Trento, Italy), Nozha Boujemaa (INRIA, France)
Program Chairs: Daniel Gatica-Perez (IDIAP & EPFL, Switzerland), David A. Shamma (Yahoo!, USA), Marcel Worring (University of Amsterdam, The Netherlands), Roger Zimmermann (National University of Singapore, Singapore)
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
image understanding
local features
object recognition
stochastic gradient descent
Qualifiers
- poster
Acceptance Rates
MM '13 Paper Acceptance Rate: 47 of 235 submissions, 20%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%
Article Metrics
- Total Citations: 5
- Total Downloads: 164
- Downloads (last 12 months): 1
- Downloads (last 6 weeks): 0