DOI: 10.1145/2578726.2578736
tutorial

Discriminative Cellets Discovery for Fine-Grained Image Categories Retrieval

Published: 01 April 2014

Abstract

Fine-grained image category recognition is a challenging task that aims to distinguish objects belonging to the same basic-level category, such as leaves or mushrooms. It is a useful technique that can be applied to species recognition, face verification, and similar problems. Most existing methods have difficulty automatically detecting discriminative object components. In this paper, we propose a new fine-grained image categorization model that can be regarded as an improved version of spatial pyramid matching (SPM). Unlike conventional SPM, which enumeratively conducts cell-to-cell matching between images, the proposed model combines multiple cells into cellets that are highly responsive to fine-grained object categories. In particular, we describe object components by cellets that connect spatially adjacent cells from the same pyramid level. Image categorization can then be cast as matching between the cellets extracted from a pair of images. For an effective matching process, we derive a hierarchical sparse coding algorithm that represents each cellet as a linear combination of basis cellets. Further, a linear discriminant analysis (LDA)-like scheme is employed to select the most discriminative cellets. Using the feature vector built from the selected cellets, fine-grained image categorization is conducted by training a linear SVM. Experimental results on the Caltech-UCSD Birds, Leeds Butterflies, and COSMIC insects data sets demonstrate that our model outperforms the state of the art. In addition, the visualized cellets show that discriminative object parts are localized accurately.
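
As a rough illustration of the cellet construction and selection steps described above, the following Python sketch enumerates connected groups of spatially adjacent cells at one pyramid level and scores a scalar feature with a Fisher-style ratio. It is a minimal sketch under stated assumptions, not the paper's implementation: the 2^l x 2^l grid per level, 4-connectivity, the size cap `max_size`, and the function names `enumerate_cellets` and `fisher_score` are all illustrative choices that the abstract does not specify, and the hierarchical sparse coding step is omitted.

```python
from itertools import product


def neighbors(cell, n):
    """Yield the 4-connected neighbours of `cell` inside an n x n grid."""
    r, c = cell
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < n and 0 <= cc < n:
            yield (rr, cc)


def enumerate_cellets(level, max_size):
    """Return all connected sets of 1..max_size cells at one pyramid level.

    Assumption: level `level` is a 2**level x 2**level grid of (row, col)
    cells, and "spatially adjacent" means sharing an edge.
    """
    n = 2 ** level
    current = {frozenset([cell]) for cell in product(range(n), repeat=2)}
    found = set(current)
    for _ in range(max_size - 1):
        grown = set()
        # Grow every cellet by one adjacent cell; sets remove duplicates.
        for cellet in current:
            for cell in cellet:
                for nb in neighbors(cell, n):
                    if nb not in cellet:
                        grown.add(cellet | {nb})
        found |= grown
        current = grown
    return found


def fisher_score(values, labels):
    """Between-class over within-class scatter for one scalar feature.

    Stand-in for the "LDA-like" selection criterion (an assumption):
    a higher score means the feature separates the classes better.
    """
    mean_all = sum(values) / len(values)
    between = within = 0.0
    for k in set(labels):
        vk = [v for v, y in zip(values, labels) if y == k]
        mk = sum(vk) / len(vk)
        between += len(vk) * (mk - mean_all) ** 2
        within += sum((v - mk) ** 2 for v in vk)
    return between / (within + 1e-12)  # small constant avoids division by zero
```

For example, `enumerate_cellets(level=1, max_size=2)` yields the four single cells of the 2 x 2 grid plus its four adjacent pairs. In a full pipeline, each cellet's coded response would be scored with something like `fisher_score`, and only the top-scoring cellets would feed the linear SVM.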

Published In

ICMR '14: Proceedings of International Conference on Multimedia Retrieval
April 2014
564 pages
ISBN:9781450327824
DOI:10.1145/2578726

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Categories retrieval
  2. Cellets
  3. Fine-grained
  4. Sparse coding
  5. Spatial pyramid

Qualifiers

  • Tutorial
  • Research
  • Refereed limited

Conference

ICMR '14
ICMR '14: International Conference on Multimedia Retrieval
April 1 - 4, 2014
Glasgow, United Kingdom

Acceptance Rates

ICMR '14 Paper Acceptance Rate 21 of 111 submissions, 19%;
Overall Acceptance Rate 254 of 830 submissions, 31%

Article Metrics

  • Downloads (Last 12 months): 1
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 01 Jan 2025

Cited By

  • (2022) Sub-Region Localized Hashing for Fine-Grained Image Retrieval. IEEE Transactions on Image Processing 31, 314-326. DOI: 10.1109/TIP.2021.3131042. Online publication date: 2022
  • (2020) Weakly Supervised Complets Ranking for Deep Image Quality Modeling. IEEE Transactions on Neural Networks and Learning Systems 31:12, 5041-5054. DOI: 10.1109/TNNLS.2019.2962548. Online publication date: Dec-2020
  • (2018) Annotation modification for fine-grained visual recognition. Neurocomputing 274:C, 58-65. DOI: 10.1016/j.neucom.2016.05.089. Online publication date: 24-Jan-2018
  • (2018) RETRACTED ARTICLE. Multimedia Tools and Applications 77:12, 16001-16001. DOI: 10.1007/s11042-017-5604-y. Online publication date: 1-Jun-2018
  • (2018) RETRACTED ARTICLE: Medical image encryption technique in big media environment. Multimedia Tools and Applications 79:13-14, 9655-9655. DOI: 10.1007/s11042-017-5598-5. Online publication date: 20-Feb-2018
  • (2018) RETRACTED ARTICLE: Large-scale image-based fog detection based on cloud platform. Multimedia Tools and Applications 79:13-14, 9663-9663. DOI: 10.1007/s11042-017-5597-6. Online publication date: 5-Jan-2018
  • (2018) RETRACTED ARTICLE: Internet-scale secret sharing algorithm with multimedia applications. Multimedia Tools and Applications 79:13-14, 9669-9669. DOI: 10.1007/s11042-017-5558-0. Online publication date: 22-Jan-2018
  • (2018) RETRACTED ARTICLE: Image steganography using cosine transform with large-scale multimedia applications. Multimedia Tools and Applications 79:13-14, 9665-9665. DOI: 10.1007/s11042-017-5557-1. Online publication date: 24-Feb-2018
  • (2018) RETRACTED ARTICLE: Cross-camera multi-person tracking by leveraging fast graph mining algorithm. Multimedia Tools and Applications 79:13-14, 9675-9676. DOI: 10.1007/s11042-017-5403-5. Online publication date: 2-Jan-2018
  • (2018) RETRACTED ARTICLE: Visualized image segmentation for multi-object tracking by weak clustering technique. Multimedia Tools and Applications 79:13-14, 9667-9667. DOI: 10.1007/s11042-017-5392-4. Online publication date: 21-Feb-2018