DOI: 10.1145/1631058.1631063
research-article

Image categorization combining neighborhood methods and boosting

Published: 23 October 2009

ABSTRACT

We describe an efficient and scalable system for automatic image categorization. Our approach seeks to marry scalable "model-free" neighborhood-based annotation with accurate boosting-based per-tag modeling. For accelerated neighborhood-based classification, we use a set of spatial data structures as weak classifiers for an arbitrary number of categories. We employ standard edge and color features and an approximation scheme that scales to large training sets. The weak classifier outputs are combined in a tag-dependent fashion via boosting to improve accuracy. The method performs competitively with standard SVM-based per-tag classification with substantially reduced computational requirements. We present multi-label image annotation experiments using data sets of more than two million photos.
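
The combination described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: brute-force k-nearest-neighbor search stands in for the spatial data structures, discrete AdaBoost is assumed as the boosting variant (the abstract does not name one), and every function name, feature-space name, and data value below is hypothetical toy material. Each feature space yields one neighborhood-vote weak classifier for a tag, and boosting learns per-tag weights for those weak outputs on held-out images.

```python
import math

def knn_tag_score(query, index_vecs, index_tags, k=3):
    """Fraction of the k nearest indexed images that carry the tag (brute force)."""
    nearest = sorted(range(len(index_vecs)),
                     key=lambda i: math.dist(query, index_vecs[i]))[:k]
    return sum(index_tags[i] for i in nearest) / k

def weak_output(score, threshold=0.5):
    """Turn a neighborhood vote into a +1/-1 weak decision."""
    return 1 if score >= threshold else -1

def adaboost(weak_outputs, labels, rounds):
    """Discrete AdaBoost over precomputed weak outputs.

    weak_outputs[j][i] is the +1/-1 output of weak classifier j on example i.
    Returns a list of (classifier index, weight alpha) pairs.
    """
    n = len(labels)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Weighted error of every weak classifier under the current weights.
        errs = [sum(w[i] for i in range(n) if h[i] != labels[i])
                for h in weak_outputs]
        j = min(range(len(errs)), key=errs.__getitem__)
        if errs[j] >= 0.5:
            break  # no weak classifier is better than chance
        err = max(errs[j], 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((j, alpha))
        # Raise the weight of misclassified examples, lower the rest.
        w = [w[i] * math.exp(-alpha * labels[i] * weak_outputs[j][i])
             for i in range(n)]
        total = sum(w)
        w = [x / total for x in w]
    return ensemble

# Toy index: six images described by one number per feature space.
# The same values are reused for all three hypothetical spaces for brevity.
toy_index = [(0.9,), (0.8,), (0.7,), (0.1,), (0.2,), (0.3,)]
index_tags = [1, 1, 1, 0, 0, 0]  # 1 = image carries the tag

# Held-out images used to train the per-tag boosted combination.
spaces = ["edge", "color_a", "color_b"]
held_out = {
    "edge":    [(0.25,), (0.85,), (0.75,), (0.15,), (0.05,)],
    "color_a": [(0.90,), (0.20,), (0.80,), (0.10,), (0.25,)],
    "color_b": [(0.80,), (0.90,), (0.75,), (0.85,), (0.10,)],
}
labels = [1, 1, 1, -1, -1]

# One weak classifier per feature space: a thresholded neighborhood vote.
weak_outputs = [
    [weak_output(knn_tag_score(q, toy_index, index_tags)) for q in held_out[s]]
    for s in spaces
]

ensemble = adaboost(weak_outputs, labels, rounds=3)

# Boosted per-tag score: weighted sum of the weak decisions.
boosted = [sum(alpha * weak_outputs[j][i] for j, alpha in ensemble)
           for i in range(len(labels))]
predictions = [1 if s >= 0 else -1 for s in boosted]
```

In this toy setup each weak classifier misclassifies a different held-out image, while the weighted combination classifies all five correctly. Because the neighborhood votes depend only on the index and not on the tag being modeled, one set of neighbor lookups could in principle be shared across many tags, with only the boosting weights learned per tag.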

Published in

LS-MMRM '09: Proceedings of the First ACM workshop on Large-scale multimedia retrieval and mining
October 2009
144 pages
ISBN: 9781605587561
DOI: 10.1145/1631058

Copyright © 2009 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery
New York, NY, United States
