research-article

Integrating hierarchical feature selection and classifier training for multi-label image annotation

Authors:
Cheng Jin, Fudan University, Shanghai, China
Chunlei Yang, UNC-Charlotte, Charlotte, USA

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, July 2011, Pages 515–524
https://doi.org/10.1145/2009916.2009987

Published: 24 July 2011

ABSTRACT

It is well accepted that using high-dimensional multi-modal visual features for image content representation and classifier training can characterize the diverse visual properties of images more fully and thus yield classifiers with higher discrimination power. However, training classifiers in a high-dimensional multi-modal feature space requires a large number of labeled training images and otherwise suffers from the curse of dimensionality. To tackle this problem, a hierarchical feature subset selection algorithm is proposed to enable more accurate image classification, in which feature selection and classifier training are seamlessly integrated in a single framework. First, a feature hierarchy (i.e., a concept tree for automatic feature space partition and organization) is used to automatically partition the high-dimensional heterogeneous multi-modal visual features into multiple low-dimensional homogeneous single-modal feature subsets according to their physical meanings, so that each subset characterizes one type of visual property of the images. Second, principal component analysis (PCA) is performed on each homogeneous single-modal feature subset to select the most representative feature dimensions, and a weak classifier is learned simultaneously. Once the weak classifiers and their representative feature dimensions are available for all of the homogeneous single-modal feature subsets, they are combined to generate an ensemble image classifier and thereby achieve hierarchical feature subset selection. Our experiments on a specific domain of natural images have obtained very positive results.
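The pipeline described in the abstract (partition the features into single-modal subsets, run PCA on each subset, learn one weak classifier per subset, then combine them) can be illustrated with a minimal sketch. This is not the authors' implementation. The subset names (`color`, `texture`, `shape`), the nearest-class-mean weak learner, the accuracy-weighted vote, and the synthetic data are all assumptions chosen to keep the example small; the paper's author tags suggest that SVM classifiers and boosting are used instead.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return (mean, components) for PCA computed via SVD of the centered data."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

class SubsetWeakClassifier:
    """PCA on one feature subset, then a nearest-class-mean rule (stand-in weak learner)."""

    def __init__(self, columns, n_components):
        self.columns = columns
        self.n_components = n_components

    def _project(self, X):
        return (X[:, self.columns] - self.mean_) @ self.components_.T

    def fit(self, X, y):
        self.mean_, self.components_ = pca_fit(X[:, self.columns], self.n_components)
        Z = self._project(X)
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([Z[y == c].mean(axis=0) for c in self.classes_])
        # Training accuracy as a crude vote weight (an assumption, not the paper's rule).
        self.weight_ = float(np.mean(self.predict(X) == y))
        return self

    def predict(self, X):
        Z = self._project(X)
        dist = np.linalg.norm(Z[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.classes_[dist.argmin(axis=1)]

def fit_ensemble(X, y, feature_subsets, n_components=2):
    """Train one PCA + weak classifier per single-modal feature subset."""
    return [SubsetWeakClassifier(cols, n_components).fit(X, y)
            for cols in feature_subsets.values()]

def predict_ensemble(models, X):
    """Combine the weak classifiers by a weighted vote."""
    classes = models[0].classes_
    votes = np.zeros((X.shape[0], classes.size))
    for m in models:
        pred = m.predict(X)
        for k, c in enumerate(classes):
            votes[:, k] += m.weight_ * (pred == c)
    return classes[votes.argmax(axis=1)]

# Synthetic stand-in data: 12 features split into three hypothetical modalities.
rng = np.random.default_rng(0)

def make_data(n):
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 12))
    X[:, 0:4] += 3.0 * y[:, None]   # "color" block is strongly informative
    X[:, 4:8] += 2.0 * y[:, None]   # "texture" block is moderately informative
    return X, y                      # "shape" block (columns 8-11) is pure noise

feature_subsets = {"color": [0, 1, 2, 3], "texture": [4, 5, 6, 7], "shape": [8, 9, 10, 11]}
X_train, y_train = make_data(200)
X_test, y_test = make_data(100)
models = fit_ensemble(X_train, y_train, feature_subsets)
acc = float(np.mean(predict_ensemble(models, X_test) == y_test))
```

For multi-label annotation, one such ensemble would be trained per label, each treating its label as a binary target. The vote weights also act as a coarse subset-level selection: an uninformative subset, such as the noise block here, receives a weight near chance and contributes little to the final decision.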

Index Terms

Integrating hierarchical feature selection and classifier training for multi-label image annotation
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
2. Information systems
  1. Information systems applications
    1. Multimedia information systems
      1. Multimedia databases

Published in
SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
July 2011
1374 pages
ISBN: 9781450307574
DOI: 10.1145/2009916
General Chairs: Wei-Ying Ma (Microsoft Research Asia, China), Jian-Yun Nie (University of Montreal, Canada)
Program Chairs: Ricardo Baeza-Yates (Yahoo! Research, Spain), Tat-Seng Chua (National University of Singapore), W. Bruce Croft (University of Massachusetts, Amherst, USA)
Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
boosting
feature hierarchy
hierarchical feature selection
svm image classifier
Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%