Discriminative Cellets Discovery for Fine-Grained Image Categories Retrieval

Authors:

Roger Zimmermann

ICMR '14: Proceedings of International Conference on Multimedia Retrieval

Pages 57 - 64

https://doi.org/10.1145/2578726.2578736

Published: 01 April 2014

Abstract

Fine-grained image category recognition is a challenging task that aims to distinguish objects belonging to the same basic-level category, such as leaves or mushrooms. It is useful for applications such as species recognition and face verification. Most existing methods have difficulty detecting discriminative object components automatically. In this paper, we propose a new fine-grained image categorization model that can be regarded as an improved version of spatial pyramid matching (SPM). Unlike conventional SPM, which enumeratively conducts cell-to-cell matching between images, the proposed model combines multiple cells into cellets that are highly responsive to fine-grained object categories. In particular, we describe object components by cellets that connect spatially adjacent cells from the same pyramid level. Image categorization can then be cast as matching between the cellets extracted from a pair of images. To make this matching effective, we derive a hierarchical sparse coding algorithm that represents each cellet as a linear combination of basis cellets. Further, a linear discriminant analysis (LDA)-like scheme is employed to select highly discriminative cellets. Using the feature vector built from the selected cellets, fine-grained image categorization is conducted by training a linear SVM. Experimental results on the Caltech-UCSD Birds, Leeds Butterflies, and COSMIC insects data sets demonstrate that our model outperforms the state of the art. In addition, the visualized cellets show that discriminative object parts are localized accurately.
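
The idea of a cellet as a set of spatially adjacent cells, and of ranking cellets by how well they separate classes, can be illustrated with a minimal sketch. This is not the authors' implementation: the 4-connected neighbourhood, the `max_size` limit, the function names, and the use of a scalar Fisher ratio as a stand-in for the "LDA-like scheme" are all assumptions made for illustration, and the hierarchical sparse coding step is omitted.

```python
from itertools import product


def grid_neighbors(cell, n):
    """Return the 4-connected neighbours of `cell` inside an n x n grid (assumed adjacency)."""
    r, c = cell
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < n and 0 <= j < n]


def enumerate_cellets(n, max_size):
    """Enumerate every connected set of 1..max_size cells on one n x n pyramid level.

    Each cellet is a frozenset of (row, col) cells. Larger cellets are grown
    from smaller ones by adding one adjacent cell at a time.
    """
    current = {frozenset([cell]) for cell in product(range(n), repeat=2)}
    cellets = set(current)
    for _ in range(max_size - 1):
        grown = set()
        for cellet in current:
            for cell in cellet:
                for nb in grid_neighbors(cell, n):
                    if nb not in cellet:
                        grown.add(cellet | {nb})
        cellets |= grown
        current = grown
    return cellets


def fisher_score(values_by_class):
    """Between-class over within-class scatter for one scalar cellet response.

    `values_by_class` maps a class label to the list of responses of a single
    cellet on the training images of that class (hypothetical input format).
    """
    all_values = [v for vals in values_by_class.values() for v in vals]
    overall_mean = sum(all_values) / len(all_values)
    between = 0.0
    within = 0.0
    for vals in values_by_class.values():
        mean = sum(vals) / len(vals)
        between += len(vals) * (mean - overall_mean) ** 2
        within += sum((v - mean) ** 2 for v in vals)
    # Small constant avoids division by zero when a class has no spread.
    return between / (within + 1e-12)
```

On a 2 x 2 grid with `max_size=3`, `enumerate_cellets` yields 12 cellets: 4 single cells, 4 adjacent pairs, and 4 L-shaped triples. Cellets with a higher `fisher_score` would be the ones retained before training the linear SVM under these assumptions.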


Cited By

Xiang X, Zhang Y, Jin L, Li Z, Tang J (2022). Sub-Region Localized Hashing for Fine-Grained Image Retrieval. IEEE Transactions on Image Processing, 31, pages 314-326. Online publication date: 2022.
https://doi.org/10.1109/TIP.2021.3131042
Zhang L, Xu M, Yin J, Zhang C, Shao L (2020). Weakly Supervised Complets Ranking for Deep Image Quality Modeling. IEEE Transactions on Neural Networks and Learning Systems, 31(12), pages 5041-5054. Online publication date: Dec-2020.
https://doi.org/10.1109/TNNLS.2019.2962548
Luo C, Meng Z, Feng J, Ni B, Wang M (2018). Annotation modification for fine-grained visual recognition. Neurocomputing, 274(C), pages 58-65. Online publication date: 24-Jan-2018.
https://dl.acm.org/doi/10.1016/j.neucom.2016.05.089

Index Terms

Discriminative Cellets Discovery for Fine-Grained Image Categories Retrieval
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding
2. Information systems
  1. Information retrieval
  2. Information storage systems

Information & Contributors

Information

Published In

ICMR '14: Proceedings of International Conference on Multimedia Retrieval

April 2014

564 pages

ISBN:9781450327824

DOI:10.1145/2578726

Conference Chairs:
Mohan Kankanhalli (National University of Singapore)
Stefan Rueger (The Open University, UK)
R. Manmatha (A9.com, USA)
General Chairs:
Joemon Jose (University of Glasgow, UK)
Keith van Rijsbergen (University of Glasgow, UK)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Tutorial
Research
Refereed limited

Conference

ICMR '14

ICMR '14: International Conference on Multimedia Retrieval

April 1 - 4, 2014

Glasgow, United Kingdom

Acceptance Rates

ICMR '14 Paper Acceptance Rate 21 of 111 submissions, 19%;

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics

Total Citations: 20
Total Downloads: 212
Downloads (Last 12 months): 1
Downloads (Last 6 weeks): 0

Reflects downloads up to 01 Jan 2025
