research-article

Image categorization combining neighborhood methods and boosting

Author:
Matthew Cooper

FX Palo Alto Laboratory, Palo Alto, CA, USA


LS-MMRM '09: Proceedings of the First ACM workshop on Large-scale multimedia retrieval and mining, October 2009, pages 11–18. https://doi.org/10.1145/1631058.1631063

Published: 23 October 2009

ABSTRACT

We describe an efficient and scalable system for automatic image categorization. Our approach seeks to marry scalable "model-free" neighborhood-based annotation with accurate boosting-based per-tag modeling. For accelerated neighborhood-based classification, we use a set of spatial data structures as weak classifiers for an arbitrary number of categories. We employ standard edge and color features and an approximation scheme that scales to large training sets. The weak classifier outputs are combined in a tag-dependent fashion via boosting to improve accuracy. The method performs competitively with standard SVM-based per-tag classification with substantially reduced computational requirements. We present multi-label image annotation experiments using data sets of more than two million photos.
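
The combination described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: brute-force search stands in for the spatial data structures, the two feature "views" are toy one-dimensional values rather than real edge and color features, discrete AdaBoost is assumed as the boosting variant, and the weak classifier outputs are weighted on a small held-out set. All function and variable names are hypothetical.

```python
import math

def knn_vote(train_feats, train_tags, query, k=3):
    """Fraction of the k nearest training points that carry the tag."""
    # Brute-force search; a spatial index would replace this at scale.
    order = sorted(range(len(train_feats)),
                   key=lambda i: math.dist(train_feats[i], query))
    return sum(train_tags[i] for i in order[:k]) / k

def weak_predictions(train_views, train_tags, query_views, k=3):
    """One row of +1/-1 outputs per feature view (one weak classifier each)."""
    return [[1 if knn_vote(tv, train_tags, q, k) >= 0.5 else -1 for q in qv]
            for tv, qv in zip(train_views, query_views)]

def adaboost(weak_preds, y, rounds=5):
    """Discrete AdaBoost over a fixed pool of weak classifier outputs.

    weak_preds[j][i] is classifier j's +1/-1 output on example i and
    y[i] is the +1/-1 label. Returns one weight per weak classifier.
    """
    n = len(y)
    w = [1.0 / n] * n
    alphas = [0.0] * len(weak_preds)
    for _ in range(rounds):
        # Weighted error of every weak classifier under current weights.
        errs = [sum(wi for wi, p, t in zip(w, row, y) if p != t)
                for row in weak_preds]
        j = min(range(len(errs)), key=errs.__getitem__)
        err = min(max(errs[j], 1e-10), 1 - 1e-10)  # clamp to keep log finite
        if err >= 0.5:
            break  # no weak classifier is better than chance
        alpha = 0.5 * math.log((1 - err) / err)
        alphas[j] += alpha
        # Up-weight the examples the chosen classifier got wrong.
        w = [wi * math.exp(-alpha * p * t)
             for wi, p, t in zip(w, weak_preds[j], y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return alphas

# Toy data for a single tag: the first view separates the tag well,
# the second view is close to uninformative.
train_views = [
    [(0.0,), (0.1,), (0.2,), (0.9,), (1.0,), (1.1,)],  # hypothetical "color"
    [(0.0,), (1.0,), (0.1,), (0.9,), (0.2,), (1.1,)],  # hypothetical "edge"
]
train_tags = [0, 0, 0, 1, 1, 1]
held_views = [
    [(0.05,), (0.15,), (0.95,), (1.05,)],
    [(0.05,), (0.95,), (1.0,), (0.15,)],
]
held_y = [-1, -1, 1, 1]

weak = weak_predictions(train_views, train_tags, held_views)
alphas = adaboost(weak, held_y)  # tag-dependent weights
combined = [1 if sum(a * row[i] for a, row in zip(alphas, weak)) >= 0 else -1
            for i in range(len(held_y))]
```

In this sketch the neighbor votes are computed once per view and could be shared across tags, while only the boosting weights are learned separately for each tag. Here the uninformative view receives zero weight for the toy tag.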

Index Terms

1. Computing methodologies
  1. Machine learning
2. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing

Published in
LS-MMRM '09: Proceedings of the First ACM workshop on Large-scale multimedia retrieval and mining
October 2009
144 pages
ISBN: 9781605587561
DOI: 10.1145/1631058
General Chairs:
Rong Yan, IBM TJ Watson Research Center
Qi Tian, Microsoft Research Asia and University of Texas, San Antonio
John R. Smith, IBM TJ Watson Research Center
Rahul Sukthankar, Intel Research and Carnegie Mellon
Copyright © 2009 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
boosting
image categorization
nearest neighbors