DOI: 10.1145/2502081.2502185
poster

Modeling local descriptors with multivariate gaussians for object and scene recognition

Published: 21 October 2013

ABSTRACT

Common techniques represent images by quantizing local descriptors and summarizing their distribution in a histogram. In this paper we propose a parametric description instead and compare its capabilities to histogram-based approaches. We model the SIFT descriptors, extracted with dense sampling on a spatial pyramid, with a multivariate Gaussian distribution. Every distribution is converted to a high-dimensional descriptor by concatenating the mean vector and the projection of the covariance matrix onto the Euclidean space tangent to the Riemannian manifold. Experiments on Caltech-101 and ImageCLEF2011 are performed using a Stochastic Gradient Descent solver, which makes it possible to handle large-scale datasets and high-dimensional feature spaces.
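
The descriptor construction described in the abstract can be illustrated with a minimal sketch. The abstract does not state the tangent point or the vectorization scheme, so the code below assumes the tangent space at the identity matrix (where the projection reduces to the matrix logarithm of the covariance) and a half-vectorization with off-diagonal entries scaled by sqrt(2). The function name gaussian_descriptor and the regularization constant eps are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_descriptor(descriptors, eps=1e-6):
    """Map a set of local descriptors (n x d, n >= 2, d >= 2) to one vector.

    Assumptions (not from the paper): the covariance is projected onto the
    tangent space at the identity via the matrix logarithm, and eps is added
    to the diagonal so the covariance is strictly positive definite.
    """
    X = np.asarray(descriptors, dtype=np.float64)
    n, d = X.shape

    # Parameters of the multivariate Gaussian fitted to the descriptors.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + eps * np.eye(d)

    # Matrix logarithm of a symmetric positive-definite matrix,
    # computed through its eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(cov)
    log_cov = (eigvecs * np.log(np.maximum(eigvals, eps))) @ eigvecs.T

    # The log-covariance is symmetric, so the upper triangle suffices.
    # Off-diagonal entries are scaled by sqrt(2) so that the Euclidean norm
    # of the vector equals the Frobenius norm of the matrix.
    rows, cols = np.triu_indices(d)
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    cov_part = log_cov[rows, cols] * scale

    # Final descriptor: mean followed by the vectorized log-covariance,
    # of length d + d * (d + 1) / 2.
    return np.concatenate([mean, cov_part])
```

Under these assumptions, a set of 128-dimensional SIFT descriptors from one spatial pyramid region yields a vector of length 128 + 128 * 129 / 2 = 8384. Concatenating such vectors over all pyramid regions gives a single image-level feature that a linear classifier can be trained on.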

    • Published in

      MM '13: Proceedings of the 21st ACM international conference on Multimedia
      October 2013
      1166 pages
      ISBN:9781450324045
      DOI:10.1145/2502081

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      MM '13 Paper Acceptance Rate: 47 of 235 submissions, 20%
      Overall Acceptance Rate: 995 of 4,171 submissions, 24%
