Object class detection: A survey

Authors:

Chao Gao

ACM Computing Surveys (CSUR), Volume 46, Issue 1

Article No.: 10, Pages 1 - 53

https://doi.org/10.1145/2522968.2522978

Published: 11 July 2013

Abstract

Object class detection, also known as category-level object detection, has become one of the most active research areas in computer vision in the new century. This article attempts to provide a comprehensive survey of recent technical achievements in this area. The survey covers more than 270 major publications and addresses four aspects of the research: (i) problem description: key tasks and challenges; (ii) core techniques: appearance modeling, localization strategies, and supervised classification methods; (iii) evaluation issues: approaches, metrics, standard datasets, and state-of-the-art results; and (iv) new developments: particularly new approaches and applications motivated by the recent boom of social images. Finally, in retrospect of what has been achieved so far, the survey discusses what the future may hold for object class detection research.

Supplementary Material

a10-zhang-apndx.pdf (zhang.zip)

Supplemental movie, appendix, image, and software files for "Object class detection: A survey"

File size: 23.07 KB

References

[1]

Aggarwal, J. K. and Ryoo, M. S. 2011. Human activity analysis: A review. ACM Comput. Surv. 43, 1--43.

Digital Library

[2]

Alexe, B., Deselaers, T., and Ferrari, V. 2010. What is an object&quest; In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[3]

An, S. J., Peursum, P., Liu, W. Q., and Venkatesh, S. 2009. Efficient algorithms for subwindow search in object detection and localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[4]

Andriluka, M., Roth, S., and Schiele, B. 2009. Pictorial structures revisited: People detection and articulated pose estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[5]

Arbelaez, P., Maire, M., Fowlkes, C., and Malik, J. 2009. From contours to regions: An empirical evaluation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[6]

Atkins, C. B. 2008. Blocked recursive image composition. In Proceedings of the ACM International Conference on Multimedia (ACM/MM'08).

Digital Library

[7]

Aytar, Y. and Zisserman, A. 2011. Tabula rasa: Model transfer for object category detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'11).

Digital Library

[8]

Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L. 2008. Speeded-up robust features (surf). Comput Vis. Image Understand. 110, 346--359.

Digital Library

[9]

Bay, H., Tuytelaars, T., and Van Gool, L. 2006. SURF: Speeded up robust features. In Proceedings of the European Conference on Computer Vision (ECCV'06).

Digital Library

[10]

Belongie, S., Malik, J., and Puzicha, J. 2001. Matching shapes. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'01).

[11]

Belongie, S., Malik, J., and Puzicha, J. 2002. Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24, 509--522.

Digital Library

[12]

Bentley, J. 1984. Programming pearls: Algorithm design techniques. Comm. ACM 27, 865--873.

Digital Library

[13]

Biederman, I., Mezzanotte, R., and Rabinowitz, J. 1982. Scene perception: Detecting and judging objects undergoing relational violations. Cogn. Psychol. 14, 143--177.

[14]

Blei, D. M., Ng, A. Y., and Jordan, M. I. 2003. Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993--1022.

[15]

Boiman, O., Shechtman, E., and Irani, M. 2008. In defense of nearest-neighbor based image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[16]

Borenstein, E., Sharon, E., and Ullman, S. 2004. Combining top-down and bottom-up segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04).

Digital Library

[17]

Borenstein, E. and Ullman, S. 2008. Combined top-down/bottom-up segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 30, 2109--2125.

Digital Library

[18]

Bosch, A., Zisserman, A., and Munoz, X. 2007a. Representing shape with a spatial pyramid kernel. In Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR'07).

Digital Library

[19]

Bosch, A., Zisserman, A., and Muoz, X. 2007b. Image classification using random forests and ferns. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).

[20]

Bouchard, G. and Triggs, B. 2005. Hierarchical part-based visual object categorization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).

Digital Library

[21]

Boureau, Y. L., Bach, F., Lecun, Y., and Ponce, J. 2010. Learning mid-level features for recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[22]

Bray, M., Kohli, P., and Torr, P. 2006. PoseCut: Simultaneous segmentation and 3d pose estimation of humans using dynamic graph-cuts. In Proceedings of the European Conference on Computer Vision (ECCV'06).

Digital Library

[23]

Cai, H. P., Yan, F., and Mikolajczyk, K. 2010. Learning weights for codebook in image classification and retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[24]

Cao, Y., Wang, C. H., Li, Z. W., Zhang, L. Q., and Zhang, L. 2010. Spatial-bag-of-features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[25]

Carneiro, G. and Lowe, D. 2006. Sparse flexible models of local features. In Proceedings of the European Conference on Computer Vision (ECCV'06).

Digital Library

[26]

Carreira, J., Li, F., and Sminchisescu, C. 2011. Object recognition by sequential figure-ground ranking. Int. J. Comput. Vis. 98, 3, 243--262.

Digital Library

[27]

Carreira, J. and Sminchisescu, C. 2010. Constrained parametric min-cuts for automatic object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[28]

Chen, T., Cheng, M.-M., Tan, P., Shamir, A., and Hu, S.-M. 2009. Sketch2Photo: Internet image montage. In Proceedings of the ACM SIGGRAPH Asia Papers.

Digital Library

[29]

Chen, Y., Zhu, L. L., Li, C. L., Yuille, A., and Zhang, H. 2007. Rapid inference on a novel and/or graph for object detection, segmentation and parsing. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'07).

[30]

Chia, A. Y. S., Rahardja, S., Rajan, D., and Leung, M. K. H. 2009. Structural descriptors for category level object detection. IEEE Trans. Multimedia 11, 1407--1421.

Digital Library

[31]

Christoudias, C. M., Urtasun, R., and Darrell, T. 2008. Unsupervised feature selection via distributed coding for multi-view object recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[32]

Crandall, D., Felzenszwalb, P., and Huttenlocher, D. 2005. Spatial priors for part-based recognition using statistical models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).

Digital Library

[33]

Csurka, G., Dance, C., Fan, L., Willamowski, J., and Bray, C. 2004. Visual categorization with bags of keypoints. In Proceedings of the ECCV Workshop on Statistical Learning in Computer Vision (ECCVW'04).

[34]

Csurka, G., Dance, C., Perronnin, F., and Willamowski, J. 2006. Generic visual categorization using weak geometry. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 207--224.

[35]

Dalal, N. 2006. Finding people in images and videos. Tech. rep., Institut National Polytechnique de Grenoble.

[36]

Dalal, N. and Triggs, B. 2005. Histograms of oriented gradients for human detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).

Digital Library

[37]

Dalal, N., Triggs, B., and Schmid, C. 2006. Human detection using oriented histograms of flow and appearance. In Proceedings of the European Conference on Computer Vision (ECCV'06).

Digital Library

[38]

Datta, R., Joshi, D., Li, J., and Wang, J. Z. 2008. Image retrieval: Ideas, influences, and trends of the new age. ACM Comput. Surv. 40, 1--60.

Digital Library

[39]

Deselaers, T. and Ferrari, V. 2010. Global and efficient self-similarity for object classification and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[40]

Dickinson, S. 2009. The evolution of object categorization and the challenge of image abstraction. In Object Categorization: Computer and Human Vision Perspectives, A. L. S. Dickinson, B. Schiele, and M. Tarr, Eds., Cambridge University Press, 1--37.

[41]

Divvala, S. K., Hoiem, D., Hays, J. H., Efros, A. A., and Hebert, M. 2009. An empirical study of context in object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[42]

Dollar, P., Belongie, S., and Perona, P. 2010. The fastest pedestrian detector in the west. In Proceedings of the British Machine Vision Conference (BMVC'10). BMVA Press.

[43]

Dollar, P., Tu, Z., Perona, P., and Belongie, S. 2009. Integral channel features. In Proceedings of the British Machine Vision Conference (BMVC'09).

[44]

Dollar, P., Wojek, C., Schiele, B., and Perona, P. 2011. Pedestrian detection: An evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 34, 4, 743--761.

Digital Library

[45]

Endres, I. and Hoiem, D. 2010. Category independent object proposals. In Proceedings of the European Conference on Computer Vision (ECCV'10).

Digital Library

[46]

Enzweiler, M. and Gavrila, D. M. 2008. A mixed generative-discriminative framework for pedestrian classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[47]

Everingham, M., Van Gool, L., Williams, C., Winn, J., and Zisserman, A. 2010. The pascal visual object classes (voc) challenge. Int. J. Comput. Vis. 88, 2, 303--338.

Digital Library

[48]

Fan, J. P., Shen, Y., Zhou, N., and Gao, Y. L. 2010. Harvesting large-scale weakly-tagged image databases from the web. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[49]

Fei-Fei, L., Fergus, R., and Perona, P. 2004. Learning generative visual models from few training examples: An incremental bayesian approach tested on 101 object categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04).

Digital Library

[50]

Fei-Fei, L. and Perona, P. 2005. A Bayesian hierarchical model for learning natural scene categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04).

Digital Library

[51]

Fei-Fei, L., Vanrullen, R., Koch, C., and Perona, P. 2002. Rapid natural scene categorization in the near absence of attention. Proc. Nat. Acad. Sci. 2, 9596--9601.

[52]

Fei-Fei, L., Fergus, R., and Torralba, A. 2005. Recognizing and learning object categories. In International Conference on Computer Vision Short Course (ICCV'05). MIT.

[53]

Fei-Fei, L., Fergus, R., and Torralba, A. 2007. Recognizing and learning object categories. In Computer Vision and Pattern Recognition Short Course (CVPR'07).

[54]

Fei-Fei, L., Fergus, R., and Torralba, A. 2009. Recognizing and learning object categories. In International Conference on Computer Vision Short Course (ICCV'09).

[55]

Felleman, D. J. and Van Essen, D. C. 1991. Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex 1, 1--47.

[56]

Felzenszwalb, P., Mcallester, D., and Ramanan, D. 2008. A discriminatively trained, multiscale, deformable part model. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[57]

Felzenszwalb, P. F., Girshick, R. B., and Mcallester, D. 2010a. Cascade object detection with deformable part models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[58]

Felzenszwalb, P. F., Girshick, R. B., Mcallester, D., and Ramanan, D. 2010b. Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1627--1645.

Digital Library

[59]

Felzenszwalb, P. F. and Huttenlocher, D. P. 2005. Pictorial structures for object recognition. Int. J. Comput. Vis. 61, 55--79.

Digital Library

[60]

Felzenszwalb, P. F. and Veksler, O. 2010. Tiered scene labeling with dynamic programming. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[61]

Ferencz, A., Learned-Miller, E., and Malik, J. 2008. Learning to locate informative features for visual identification. Int. J. Comput. Vis. 77, 3--24.

Digital Library

[62]

Fergus, R., Li, F.-F., Perona, P., and Zisserman, A. 2010. Learning object categories from internet image searches. Proc. IEEE. 98, 1453--1466.

[63]

Fergus, R., Perona, P., and Zisserman, A. 2003. Object class recognition by unsupervised scale-invariant learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'03).

[64]

Fergus, R., Perona, P., and Zisserman, A. 2005. A sparse object category model for efficient learning and exhaustive recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).

Digital Library

[65]

Fergus, R., Perona, P., and Zisserman, A. 2007. Weakly supervised scale-invariant learning of models for visual recognition. Int. J. Comput. Vis. 71, 273--303.

Digital Library

[66]

Ferrari, V., Fevrier, L., Jurie, F., and Schmid, C. 2008. Groups of adjacent contour segments for object detection. IEEE Trans. Pattern Anal. Mach. Intell. 30, 36--51.

Digital Library

[67]

Fischler, M. A. and Elschlager, R. A. 1973. The representation and matching of pictorial structures. IEEE Trans. Comput. C-22, 67--92.

Digital Library

[68]

Fleuret, F. and Geman, D. 2001. Coarse-to-fine face detection. Int. J. Comput. Vis. 41, 85--107.

Digital Library

[69]

Fulkerson, B., Vedaldi, A., and Soatto, S. 2009. Class segmentation and object localization with superpixel neighborhoods. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).

[70]

Gall, J. and Lempitsky, V. 2009. Class-specific hough forests for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[71]

Gallagher, A., Neustaedter, C., Cao, L., Luo, J., and Chen, T. 2008. Image annotation using personal calendars as context. In Proceedings of the ACM International Conference on Multimedia (ACM/MM'08).

Digital Library

[72]

Gallagher, A. C. and Chen, T. 2008. Estimating age, gender, and identity using first name priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[73]

Galleguillos, C. and Belongie, S. 2010. Context based object categorization: A critical survey. Comput Vis. Image Understand. 114, 712--722.

Digital Library

[74]

Galleguillos, C., Mcfee, B., Belongie, S., and Lanckriet, G. 2010. Multi-class object localization by combining local contextual interactions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[75]

Gehler, P. and Nowozin, S. 2009. On feature combination for multiclass object classification. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).

[76]

Girshick, R. B., Felzenszwalb, P. F., and Mcallester, D. 2011. Object detection with grammar models. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'11).

[77]

Gonfaus, J. M., Boix, X., Van de Weijer, J., Bagdanov, A. D., Serrat, J., and Gonzalez, J. 2010. Harmony potentials for joint classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[78]

Gould, S., Fulton, R., and Koller, D. 2009a. Decomposing a scene into geometric and semantically consistent regions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[79]

Gould, S., Gao, T. S., and Koller, D. 2009b. Region-based segmentation and object detection. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'09).

[80]

Grabner, H., Roth, P. M., and Bischof, H. 2007. Eigenboosting: Combining discriminative and generative information. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).

[81]

Grauman, K. and Leibe, B. 2011. Visual object recognition. Synthesis Lectures Artif. Intell. Mach. Learn. 5, 1--181.

[82]

Griffin, G., Holub, A., and Perona, P. 2007. Caltech-256 object category dataset. Tech. rep., California Institute of Technology, 1-20. http://authors.library.caltech.edu/7694/1/CNS-TR-2007-001.pdf.

[83]

Gu, C. H., Lim, J. J., Arbelaez, P., and Malik, J. 2009. Recognition using regions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[84]

Guillaumin, M., Verbeek, J., and Schmid, C. 2010. Multimodal semi-supervised learning for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[85]

Hays, J. and Efros, A. A. 2008. IM2GPS: Estimating geographic information from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[86]

He, X. M., Zemel, R., and Ray, D. 2006. Learning and incorporating top-down cues in image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV'06).

Digital Library

[87]

He, X. M., Zemel, R. S., and Carreira-Perpinan, M. A. 2004. Multiscale conditional random fields for image labeling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04).

Digital Library

[88]

Heitz, G., Elidan, G., Packer, B., and Koller, D. 2009. Shape-based object localization for descriptive classification. Int. J. Comput. Vis. 84, 40--62.

Digital Library

[89]

Hochstein, S. and Ahissar, M. 2002. View from the top: Hierarchies and reverse hierarchies in the visual system. Neuron 36, 791--804.

[90]

Hofmann, T. 2001. Unsupervised learning by probabilistic latent semantic analysis. Mach. Learn. 42, 177--196.

Digital Library

[91]

Hoiem, D., Efros, A., and Hebert, M. 2008. Putting objects in perspective. Int. J. Comput. Vis. 80, 3--15.

Digital Library

[92]

Hoiem, D., Rother, C., and Winn, J. 2007a. 3D layoutcrf for multi-view object class recognition and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).

[93]

Hoiem, D., Stein, A., Efros, A., and Hebert, M. 2007b. Recovering occlusion boundaries from a single image. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).

[94]

Huang, Y. Z., Huang, K. Q., Wang, L. S., Tao, D. C., Tan, T. N., and Li, X. L. 2008. Enhanced biologically inspired model. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[95]

Hwang, S. J. and Grauman, K. 2010. Reading between the lines: Object localization using implicit cues from image tags. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[96]

Jain, A., Gupta, A., and Davis, L. 2010. Learning what and how of contextual models for scene labeling. In Proceedings of the European Conference on Computer Vision (ECCV'10).

Digital Library

[97]

Jhuang, H., Serre, T., Wolf, L., and Poggio, T. 2007. A biologically inspired system for action recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).

[98]

Ji, R. R., Yao, H. X., Sun, X. S., Zhong, B. N., and Gao, W. 2010. Towards semantic embedding in visual vocabulary. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[99]

Jiang, Y.-G., Yang, J., Ngo, C.-W., and Hauptmann, A. G. 2010. Representations of keypoint-based semantic concept detection: A comprehensive study. IEEE Trans. Multimedia 12, 42--53.

Digital Library

[100]

Joachims, T. 1997. A probabilistic analysis of the rocchio algorithm with tfidf for text categorization. In Proceedings of the International Conference on Machine Learning (ICML'97).

Digital Library

[101]

Joachims, T. 1998. Making large-scale support vector machine learning practical. In Advances in Kernel Methods: Support Vector Machines, B. Scholkopf, J. C. Burges, and A. J. Smola, Eds. MIT Press, Cambridge, MA, 169--184.

Digital Library

[102]

Jones, J. P. and Palmer, L. A. 1987. An evaluation of the two-dimensional gabor filter model of simple receptive fields in cat striate cortex. J. Neurophys. 58, 1233--1258.

[103]

Jurie, F. and Triggs, B. 2005. Creating efficient codebooks for visual recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05).

Digital Library

[104]

Karlinsky, L., Dinerstein, M., Harari, D., and Ullman, S. 2010. The chains model for detecting parts by their context. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[105]

Ke, Y. and Sukthankar, R. 2004. PCA-sift: A more distinctive representation for local image descriptors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04).

Digital Library

[106]

Knopp, J., Prasad, M., and Gool, L. V. 2011. Scene cut: Class-specific object detection and segmentation in 3d scenes. In Proceedings of the International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT'11).

Digital Library

[107]

Koh, K., Kim, S.-J., and Boyd, S. 2007. An interior-point method for large-scale l1-regularized logistic regression. J. Mach. Learn. Res. 8, 1519--1555.

Digital Library

[108]

Kohli, P., Ladicky, L., and Torr, P. 2008. Robust higher order potentials for enforcing label consistency. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[109]

Kotsiantis, S. B. 2007. Supervised machine learning: A review of classification techniques. Informatica 31, 249--268.

[110]

Krüger, V., Kragic, D., Ude, A., and Geib, C. 2007. The meaning of action: A review on action recognition and mapping. Advan. Robot. 21, 1473--1501.

[111]

Kuettel, D. and Ferrari, V. 2012. Figure-ground segmentation by transferring window masks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'12).

Digital Library

[112]

Kumar, M. P., Ton, P. H. S., and Zisserman, A. 2005. OBJCUT. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).

[113]

Kumar, M. P., Torr, P. H. S., and Zisserman, A. 2010. OBJCUT: Efficient segmentation using top-down and bottom-up cues. IEEE Trans. Pattern Anal. Mach. Intell. 32, 530--545.

Digital Library

[114]

Kumar, N., Belhumeur, P., and Nayar, S. 2008. FaceTracer: A search engine for large collections of images with faces. In Proceedings of the European Conference on Computer Vision (ECCV'08).

Digital Library

[115]

Ladicky, L., Sturgess, P., Alahari, K., Russell, C., and Torr, P. 2010. What, where and how many&quest; Combining object detectors and crfs. In Proceedings of the European Conference on Computer Vision (ECCV'10).

Digital Library

[116]

Lalonde, J.-F., Hoiem, D., Efros, A. A., Rother, C., Winn, J., and Criminisi, A. 2007. Photo clip art. In Proceedings of the International Conference and Exhibition on Computer Graphics and Interactive Techniques (ACM/SIGGRAPH'07).

Digital Library

[117]

Lalonde, J.-F., Narasimhan, S. G., and Efros, A. A. 2010. What do the sun and the sky tell us about the camera&quest; Int. J. Comput. Vis. 88, 24--51.

Digital Library

[118]

Lampert, C. H., Blaschko, M. B., and Hofmann, T. 2008. Beyond sliding windows: Object localization by efficient sub-window search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[119]

Laptev, I. 2006. Improvements of object detection using boosted histograms. In Proceedings of the British Machine Vision Conference (BMVC'06).

[120]

Laptev, I. 2009. Improving object detection with boosted histograms. Image Vis. Comput. 27, 535--544.

Digital Library

[121]

Larlus, D. and Jurie, F. 2008. Combining appearance models and markov random fields for category level object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[122]

Larlus, D., Verbeek, J., and Jurie, F. 2010. Category level object segmentation by combining bag-of-words models with dirichlet processes and random fields. Int. J. Comput. Vision 88, 238--253.

Digital Library

[123]

Lasserre, J. A., Bishop, C. M., and Minka, T. P. 2006. Principled hybrids of generative and discriminative models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).

Digital Library

[124]

Lazebnik, S., Schmid, C., and Ponce, J. 2006. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).

Digital Library

[125]

Lee, H., Battle, A., Raina, R., and Ng, A. Y. 2006. Efficient sparse coding algorithms. Adv. Neural Inf. Process. Syst. 19, 2007.

[126]

Lee, Y. J. and Grauman, K. 2010. Object-graphs for context-aware category discovery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[127]

Leibe, B., Leonardis, A., and Schiele, B. 2004. Combined object categorization and segmentation with an implicit shape model. In Proceedings of the ECCV Workshop on Statistical Learning in Computer Vision (ECCVW'04).

[128]

Leibe, B., Leonardis, A., and Schiele, B. 2006. An implicit shape model for combined object categorization and segmentation. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 508--524.

[129]

Leibe, B., Leonardis, A., and Schiele, B. 2008. Robust object detection with interleaved categorization and segmentation. Int. J. Comput. Vis. 77, 259--289.

Digital Library

[130]

Leibe, B., Seemann, E., and Schiele, B. 2005. Pedestrian detection in crowded scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).

Digital Library

[131]

Lempitsky, V., Kohli, P., Rother, C., and Sharp, T. 2009. Image segmentation with a bounding box prior. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).

[132]

Levin, A. and Weiss, Y. 2006. Learning to combine bottom-up and top-down segmentation. In Proceedings of the European Conference on Computer Vision (ECCV'06).

Digital Library

[133]

Li, L.-J. and Fei-Fei, L. 2007. What, where and who&quest; Classifying events by scene and object recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).

[134]

Li, L.-J. and Fei-Fei, L. 2010. OPTIMOL: Automatic online picture collection via incremental model learning. Int. J. Comput. Vis. 88, 147--168.

Digital Library

[135]

Liang, P. and Jordan, M. I. 2008. An asymptotic analysis of generative, discriminative, and pseudolikelihood estimators. In Proceedings of the International Conference on Machine Learning (ICML'08).

Digital Library

[136]

Liebelt, J. and Schmid, C. 2010. Multi-view object class detection with a 3d geometric model. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[137]

Liebelt, J., Schmid, C., and Schertler, K. 2008. Viewpoint-independent object class detection using 3d feature maps. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[138]

Lin, D., Kapoor, A., Hua, G., and Baker, S. 2010. Joint people, event, and location recognition in personal photo collections using cross-domain context. In Proceedings of the European Conference on Computer Vision (ECCV'10).

Digital Library

[139]

Lin, Z. 2009. Modeling shape, appearance and motion for human movement analysis. Tech. rep., Department of Electrical and Computer Engineering, University of Maryland, College Park, Md. http://hdl.handle.net/1903/9279.

[140]

Lin, Z. and Davis, L. S. 2010. Shape-based human detection and segmentation via hierarchical part-template matching. IEEE Trans. Pattern Anal. Mach. Intell. 32, 604--618.

Digital Library

[141]

Lin, Z., Davis, L. S., Doermann, D., and Dementhon, D. 2007. Hierarchical part-template matching for human detection and segmentation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).

[142]

Liu, C., Yuen, J., and Torralba, A. 2009a. Nonparametric scene parsing: Label transfer via dense scene alignment. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[143]

Liu, C., Yuen, J., Torralba, A., Sivic, J., and Freeman, W. 2008. SIFT flow: Dense correspondence across different scenes. In Proceedings of the European Conference on Computer Vision (ECCV'08).

Digital Library

[144]

Liu, T., Wang, J. D., Sun, J., Zheng, N. N., Tang, X. O., and Shum, H. Y. 2009b. Picture collage. IEEE Trans. Multimedia 11, 1225--1239.

Digital Library

[145]

Lowe, D. G. 2004. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60, 91--110.

Digital Library

[146]

Lu, Z. W. and Ip, H. H. S. 2009. Image categorization with spatial mismatch kernels. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[147]

Luo, J., Boutell, M., and Brown, C. 2006. Pictures are not taken in a vacuum. IEEE Signal Process. Mag. 23, 101--114.

[148]

Maire, M., Yu, S. X., and Perona, P. 2011. Object detection and segmentation from joint embedding of parts and pixels. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'11).

Digital Library

[149]

Maji, S., Berg, A. C., and Malik, J. 2008. Classification using intersection kernel support vector machines is efficient. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[150]

Malisiewicz, T. and Efros, A. A. 2008. Recognition by association via learning per-exemplar distances. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[151]

Marszałek, M. and Schmid, C. 2007. Accurate object localization with shape masks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).

[152]

Mikolajczyk, K. and Schmid, C. 2005. A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27, 1615--1630.

Digital Library

[153]

Moosmann, F., Nowak, E., and Jurie, F. 2008. Randomized clustering forests for image classification. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1632--1646.

Digital Library

[154]

Mu, Y., Yan, S., Liu, Y., Huang, T., and Zhou, B. 2008. Discriminative local binary patterns for human detection in personal album. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[155]

Mutch, J. and Lowe, D. 2008. Object class recognition and localization using sparse features with limited receptive fields. Int. J. Comput. Vis. 80, 45--57.

Digital Library

[156]

Mutch, J. and Lowe, D. G. 2006. Multiclass object recognition with sparse, localized features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).

Digital Library

[157]

Nakayama, H., Harada, T., and Kuniyoshi, Y. 2010. Global gaussian approach for scene categorization using information geometry. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[158]

Narasimhan, S. and Nayar, S. 2002. Vision and the atmosphere. Int. J. Comput. Vis. 48, 233--254.

[159]

Ng, A. and Jordan, M. 2002. On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'02).

[160]

Ni, B. B., Yan, S. C., and Kassim, A. 2009. Contextualizing histogram. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[161]

Nister, D. and Stewenius, H. 2006. Scalable recognition with a vocabulary tree. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).

[162]

Nowak, E., Jurie, F., and Triggs, B. 2006. Sampling strategies for bag-of-features image classification. In Proceedings of the European Conference on Computer Vision (ECCV'06).

[163]

Ojala, T., Pietikainen, M., and Maenpaa, T. 2002. Multi-resolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24, 971--987.

[164]

Oliva, A. and Torralba, A. 2001. Modeling the shape of the scene: A holistic representation of the spatial envelope. Int. J. Comput. Vis. 42, 145--175.

[165]

Oliva, A. and Torralba, A. 2006. Building the gist of a scene: The role of global image features in recognition. Progress Brain Res. 155, 23--36.

[166]

Oliva, A. and Torralba, A. 2007. The role of context in object recognition. Trends Cogn. Sci. 11, 520--527.

[167]

Opelt, A., Pinz, A., and Zisserman, A. 2006. A boundary-fragment-model for object detection. In Proceedings of the European Conference on Computer Vision (ECCV'06).

[168]

Palmese, M. and Trucco, A. 2008. From 3-d sonar images to augmented reality models for objects buried on the seafloor. IEEE Trans. Instrument. Measure. 57, 820--828.

[169]

Parikh, D. and Zitnick, C. L. 2010. The role of features, algorithms and data in visual recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[170]

Park, D., Ramanan, D., and Fowlkes, C. 2010. Multiresolution models for object detection. In Proceedings of the European Conference on Computer Vision (ECCV'10).

[171]

Pedersoli, M., Vedaldi, A., and Gonzalez, J. 2011. A coarse-to-fine approach for fast deformable object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).

[172]

Perronnin, F. 2008. Universal and adapted vocabularies for generic visual categorization. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1243--1256.

[173]

Perrotton, X., Sturzel, M., and Roux, M. 2010. Implicit hierarchical boosting for multi-view object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[174]

Pinz, A. 2005. Object categorization. Foundat. Trends Comput. Graph. Vis. 1, 4, 255--353.

[175]

Ponce, J., Berg, T. L., Everingham, M., Forsyth, D. A., Hebert, M., Lazebnik, S., Marszałek, M., Schmid, C., Russell, B. C., Torralba, A., Williams, C. K. I., Zhang, J., and Zisserman, A. 2006a. Dataset issues in object recognition. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman Eds., Springer, 29--48.

[176]

Ponce, J., Hebert, M., Schmid, C., and Zisserman, A. 2006b. Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol. 4170, Springer.

[177]

Porikli, F. 2005. Integral histogram: A fast way to extract histograms in cartesian spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).

[178]

Rabinovich, A., Vedaldi, A., Galleguillos, C., Wiewiora, E., and Belongie, S. 2007. Objects in context. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).

[179]

Ravishankar, S., Jain, A., and Mittal, A. 2008. Multi-stage contour based detection of deformable objects. In Proceedings of the European Conference on Computer Vision (ECCV'08).

[180]

Razavi, N., Gall, J., and Van Gool, L. 2010. Backprojection revisited: Scalable multi-view object detection and similarity metrics for detections. In Proceedings of the European Conference on Computer Vision (ECCV'10).

[181]

Ren, X. and Malik, J. 2003. Learning a classification model for segmentation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'03).

[182]

Riesenhuber, M. and Poggio, T. 1999. Hierarchical models of object recognition in cortex. Nature Neurosci. 2, 1019--1025.

[183]

Rother, C., Bordeaux, L., Hamadi, Y., and Blake, A. 2006. AutoCollage. ACM Trans. Graph. 25, 3, 847--852.

[184]

Rother, C., Kolmogorov, V., and Blake, A. 2004. “GrabCut”: Interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. 23, 3, 309--314.

[185]

Rubinstein, D. and Hastie, T. 1997. Discriminative vs informative learning. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDDM'97).

[186]

Rubner, Y., Tomasi, C., and Guibas, L. J. 2000. The earth mover's distance as a metric for image retrieval. Int. J. Comput. Vis. 40, 99--121.

[187]

Rui, X., Li, M., Li, Z., Ma, W.-Y., and Yu, N. 2007. Bipartite graph reinforcement model for web image annotation. In Proceedings of the ACM International Conference on Multimedia (ACM/MM'07).

[188]

Russell, B., Torralba, A., Liu, C., Fergus, R., and Freeman, W. 2007. Object recognition by scene alignment. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'07).

[189]

Russell, B., Torralba, A., Murphy, K., and Freeman, W. 2008. LabelMe: A database and web-based tool for image annotation. Int. J. Comput. Vis. 77, 157--173.

[190]

Sabzmeydani, P. and Mori, G. 2007. Detecting pedestrians by learning shapelet features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).

[191]

Saffari, A., Godec, M., Pock, T., Leistner, C., and Bischof, H. 2010. Online multi-class lpboost. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[192]

Salakhutdinov, R., Torralba, A., and Tenenbaum, J. 2011. Learning to share visual appearance for multiclass object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).

[193]

Salzmann, M. and Urtasun, R. 2010. Combining discriminative and generative methods for 3d deformable surface and articulated pose reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[194]

Savarese, S. and Li, F.-F. 2007. 3D generic object categorization, localization and pose estimation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).

[195]

Savarese, S., Winn, J., and Criminisi, A. 2006. Discriminative object class models of appearance and shape by correlatons. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).

[196]

Schindler, K., Van Gool, L., and De Gelder, B. 2008. Recognizing emotions expressed by body pose: A biologically inspired neural model. Neural Netw. 21, 1238--1246.

[197]

Schroff, F. 2009. Semantic image segmentation and web-supervised visual learning. Tech. rep., Robotics Research Group, Department of Engineering Science, University of Oxford, Oxford, UK. http://www.robots.ox.ac.uk/~vgg/publications/papers/schroff09.pdf.

[198]

Seemann, E., Leibe, B., and Schiele, B. 2006. Multi-aspect detection of articulated objects. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).

[199]

Serre, T., Oliva, A., and Poggio, T. 2007a. A feed-forward architecture accounts for rapid categorization. Proc. National Acad. Sci. 104, 6424--6429.

[200]

Serre, T., Wolf, L., Bileschi, S., Riesenhuber, M., and Poggio, T. 2007b. Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell. 29, 411--426.

[201]

Serre, T., Wolf, L., and Poggio, T. 2005. Object recognition with features inspired by visual cortex. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).

[202]

Shechtman, E. and Irani, M. 2007. Matching local self-similarities across images and videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).

[203]

Shin, Y., Kim, Y., and Kim, E. Y. 2010. Automatic textile image annotation by predicting emotional concepts from visual features. Image Vis. Comput. 28, 526--537.

[204]

Shotton, J., Blake, A., and Cipolla, R. 2005. Contour-based learning for object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05).

[205]

Shotton, J., Blake, A., and Cipolla, R. 2008a. Multiscale categorical object recognition using contour fragments. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1270--1281.

[206]

Shotton, J., Johnson, M., and Cipolla, R. 2008b. Semantic texton forests for image categorization and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[207]

Shotton, J., Winn, J., Rother, C., and Criminisi, A. 2006. TextonBoost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation. In Proceedings of the European Conference on Computer Vision (ECCV'06).

[208]

Shotton, J., Winn, J., Rother, C., and Criminisi, A. 2009. TextonBoost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. Int. J. Comput. Vis. 81, 2--23.

[209]

Simon, I. and Seitz, S. 2008. Scene segmentation using the wisdom of crowds. In Proceedings of the European Conference on Computer Vision (ECCV'08).

[210]

Sivic, J., Russell, B. C., Efros, A. A., Zisserman, A., and Freeman, W. T. 2005. Discovering objects and their location in images. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05).

[211]

Sivic, J. and Zisserman, A. 2003. Video google: Text retrieval approach to object matching in videos. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'03).

[212]

Snavely, N., Simon, I., Goesele, M., Szeliski, R., and Seitz, S. M. 2010. Scene reconstruction and visualization from community photo collections. Proc. IEEE 98, 1370--1390.

[213]

Song, D. J. and Tao, D. C. 2010. Biologically inspired feature manifold for scene classification. IEEE Trans. Image Process. 19, 174--184.

[214]

Song, Z., Chen, Q., Huang, Z., Hua, Y., and Yan, S. 2011. Contextualizing object detection and classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).

[215]

Sonnenburg, S., Rätsch, G., Schäfer, C., and Schölkopf, B. 2006. Large scale multiple kernel learning. J. Mach. Learn. Res. 7, 1531--1565.

[216]

Strat, T. 1993. Employing contextual information in computer vision. In Proceedings of the ARPA Image Understanding Workshop. 217--229.

[217]

Sutton, C. and McCallum, A. 2006. An introduction to conditional random fields for relational learning. In Introduction to Statistical Relational Learning, L. Getoor and B. Taskar, Eds., MIT Press. http://people.cs.umass.edu/~mccallum/papers/crf-tutorial.pdf.

[218]

Szeliski, R. 2010. Computer Vision: Algorithms and Applications. Springer.

[219]

Tao, L., Yuan, L., and Sun, J. 2009. SkyFinder: Attribute-based sky image search. In ACM SIGGRAPH Papers.

[220]

Thomas, A., Ferrari, V., Leibe, B., Tuytelaars, T., Schiele, B., and Van Gool, L. 2006. Towards multi-view object class detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).

[221]

Torralba, A. 2003. Contextual priming for object detection. Int. J. Comput. Vis. 53, 169--191.

[222]

Torralba, A., Fergus, R., and Freeman, W. T. 2008. 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1958--1970.

[223]

Torralba, A., Murphy, K., and Freeman, W. 2006. Shared features for multiclass object detection. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 345--361.

[224]

Torralba, A., Murphy, K. P., and Freeman, W. T. 2004. Sharing features: Efficient boosting procedures for multiclass object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04).

[225]

Torralba, A., Murphy, K. P., Freeman, W. T., and Rubin, M. A. 2003. Context-based vision system for place and object recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'03).

[226]

Tu, Z. W. 2007. Learning generative models via discriminative approaches. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).

[227]

Ulusoy, I. and Bishop, C. 2006. Comparison of generative and discriminative techniques for object detection and classification. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 173--195.

[228]

Ulusoy, I. and Bishop, C. M. 2005. Generative versus discriminative methods for object recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).

[229]

Van De Sande, K., Gevers, T., and Snoek, C. 2008. Evaluation of color descriptors for object and scene recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[230]

Van De Sande, K., Gevers, T., and Snoek, C. 2010. Evaluating color descriptors for object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1582--1596.

[231]

Van De Sande, K., Uijlings, J., Gevers, T., and Smeulders, A. 2011. Segmentation as selective search for object recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'11).

[232]

Van Gemert, J. C., Veenman, C. J., Smeulders, A. W. M., and Geusebroek, J. M. 2010. Visual word ambiguity. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1271--1283.

[233]

Vapnik, V. N. 1998. Statistical Learning Theory. A Wiley-Interscience Publication, New York.

[234]

Varma, M. and Ray, D. 2007. Learning the discriminative power-invariance trade-off. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).

[235]

Vedaldi, A., Gulshan, V., Varma, M., and Zisserman, A. 2009. Multiple kernels for object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).

[236]

Verbeek, J. and Triggs, B. 2007a. Region classification with markov field aspect models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).

[237]

Verbeek, J. and Triggs, B. 2007b. Scene segmentation with conditional random fields learned from partially labeled images. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'07).

[238]

Vijayanarasimhan, S. and Grauman, K. 2011. Efficient region search for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).

[239]

Viola, P. and Jones, M. 2001. Rapid object detection using a boosted cascade of simple features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'01).

[240]

Walk, S., Majer, N., Schindler, K., and Schiele, B. 2010. New features and insights for pedestrian detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[241]

Wang, G., Gallagher, A., Luo, J., and Forsyth, D. 2010a. Seeing people in social context: Recognizing people and social relationships. In Proceedings of the European Conference on Computer Vision (ECCV'10).

[242]

Wang, G., Hoiem, D., and Forsyth, D. 2009a. Building text features for object image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[243]

Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., and Gong, Y. 2010b. Locality-constrained linear coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[244]

Wang, X. and Grimson, E. 2007. Spatial latent dirichlet allocation. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'07).

[245]

Wang, X., Han, T. X., and Yan, S. 2009b. An hog-lbp human detector with partial occlusion handling. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).

[246]

Wang, Y. and Mori, G. 2009. Max-margin hidden conditional random fields for human action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[247]

Wang, Y. and Mori, G. 2010. Hidden part models for human action recognition: Probabilistic vs. max-margin. IEEE Trans. Pattern Anal. Mach. Intell. 33, 7, 1310--1323.

[248]

Wang, Z., Hu, Y., and Chia, L.-T. 2010c. Image-to-class distance metric learning for image classification. In Proceedings of the European Conference on Computer Vision (ECCV'10).

[249]

Watanabe, T., Ito, S., and Yokoi, K. 2009. Co-occurrence histograms of oriented gradients for pedestrian detection. In Proceedings of the Pacific-Rim Symposium on Image and Video Technology (PSIVT'09).

[250]

Wei, Y. C. and Tao, L. T. 2010. Efficient histogram-based sliding window. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[251]

Winn, J., Criminisi, A., and Minka, T. 2005. Object categorization by learned universal visual dictionary. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05).

[252]

Wnuk, K. and Soatto, S. 2008. Filtering internet image search results towards keyword based category recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[253]

Wojek, C. and Schiele, B. 2008. A performance evaluation of single and multi-feature people detection. In Proceedings of the German Association for Pattern Recognition (DAGM'08).

[254]

Wojek, C., Walk, S., and Schiele, B. 2009. Multi-cue onboard pedestrian detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[255]

Wright, J., Yi, M., Mairal, J., Sapiro, G., Huang, T. S., and Shuicheng, Y. 2010. Sparse representation for computer vision and pattern recognition. Proc. IEEE 98, 1031--1044.

[256]

Wu, B. and Nevatia, R. 2005. Detection of multiple, partially occluded humans in a single image by bayesian combination of edgelet part detectors. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05).

[257]

Wu, B. and Nevatia, R. 2007a. Cluster boosted tree classifier for multi-view, multi-pose object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).

[258]

Wu, B. and Nevatia, R. 2007b. Improving part based object detection by unsupervised, online boosting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).

[259]

Wu, B. and Nevatia, R. 2007c. Simultaneous object detection and segmentation by boosting local shape feature based classifier. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).

[260]

Wu, Z., Ke, Q. F., Isard, M., and Sun, J. 2009. Bundling features for large scale partial-duplicate web image search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[261]

Xiang, Y., Zhou, X. D., Liu, Z. T., Chua, T. S., and Ngo, C.-W. 2010. Semantic context modeling with maximal margin conditional random fields for automatic image annotation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[262]

Xu, H., Zhou, X., Wang, M., Xiang, Y., and Shi, B. 2009. Exploring flickr's related tags for semantic annotation of web images. In Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR'09).

[263]

Xu, Z., Chen, H., Zhu, S.-C., and Luo, J. 2008. A hierarchical compositional model for face representation and sketching. IEEE Trans. Pattern Anal. Mach. Intell. 30, 955--969.

[264]

Xue, J.-H. 2008. Aspects of generative and discriminative classifiers. Tech. rep., Information and Mathematical Sciences, Department of Statistics, University of Glasgow.

[265]

Xue, J.-H. and Titterington, D. 2008. Comment on “on discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes”. Neural Process. Lett. 28, 169--187.

[266]

Xue, J.-H. and Titterington, D. M. 2010. On the generative-discriminative tradeoff approach: Interpretation, asymptotic efficiency and classification performance. Comput. Statist. Data Anal. 54, 438--451.

[267]

Yan, P. K., Khan, S. M., and Shah, M. 2007. 3D model based object class detection in an arbitrary view. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).

[268]

Yang, B., Mei, T., Sun, L.-F., Yang, S.-Q., and Hua, X.-S. 2008a. Free-shaped video collage. In Proceedings of the 14th International Conference on Advances in Multimedia Modeling.

[269]

Yang, J. C., Yu, K., Gong, Y. H., and Huang, T. 2009. Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[270]

Yang, L., Jin, R., Sukthankar, R., and Jurie, F. 2008b. Unifying discriminative visual codebook generation with classifier training for object category recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).

[271]

Yang, Y. and Ramanan, D. 2011. Articulated pose estimation with flexible mixtures-of-parts. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).

[272]

Yao, B. Z., Yang, X., Lin, L., Lee, M. W., and Zhu, S. C. 2010. I2T: Image parsing to text description. Proc. IEEE 98, 1485--1508.

[273]

Yeh, T., Lee, J. J., and Darrell, T. 2009. Fast concurrent object localization and recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).

[274]

Yu, C. N. J. and Joachims, T. 2009. Learning structural svms with latent variables. In Proceedings of the International Conference on Machine Learning (ICML'09).

[275]

Zhang, C., Liu, J., Tian, Q., Xu, C., Lu, H., and Ma, S. 2011a. Image classification by non-negative sparse coding, low-rank and sparse decomposition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).

[276]

Zhang, D. Q. and Chang, S. F. 2006. A generative-discriminative hybrid method for multi-view object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).

[277]

Zhang, J., Marszałek, M., Lazebnik, S., and Schmid, C. 2007. Local features and kernels for classification of texture and object categories: A comprehensive study. Int. J. Comput. Vis. 73, 213--238.

[278]

Zhang, J. G., Huang, K. Q., Yu, Y. N., and Tan, T. N. 2011b. Boosted local structured hog-lbp for object localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).

[279]

Zhang, Z. Q., Cao, Y., Salvi, D., Oliver, K., Waggoner, J., and Wang, S. 2010. Free-shape subwindow search for object localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[280]

Zheng, W. S., Gong, S. G., and Xiang, T. 2009. Quantifying contextual information for object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).

[281]

Zhu, L., Chen, Y., Lin, C., and Yuille, A. 2011. Max margin learning of hierarchical configural deformable templates (hcdts) for efficient object parsing and pose estimation. Int. J. Comput. Vis. 93, 1--21.

[282]

Zhu, L., Chen, Y. H., Yuille, A., and Freeman, W. 2010. Latent hierarchical structural learning for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).

[283]

Zhu, M. 2004. Recall, precision and average precision. Working paper, University of Waterloo.

[284]

Zhu, Q., Yeh, M. C., Cheng, K. T., and Avidan, S. 2006. Fast human detection using a cascade of histograms of oriented gradients. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).

[285]

Zhu, S.-C. and Mumford, D. 2006. A stochastic grammar of images. Foundat. Trends Comput. Graph. Vis. 2, 259--362.

Cited By

Tsirtsakis, P., Zacharis, G., Maraslidis, G., and Fragulis, G. 2025. Deep learning for object recognition: A comprehensive review of models and algorithms. International Journal of Cognitive Computing in Engineering. Online publication date: Jan-2025.
https://doi.org/10.1016/j.ijcce.2025.01.004

Kawai, Y., Mochizuki, T., and Naemura, M. 2024. A Method of Masked Face Detection. The Journal of The Institute of Image Information and Television Engineers 78, 1, 131--141. Online publication date: 2024.
https://doi.org/10.3169/itej.78.131

Gyimah, N., Akinie, R., Yan, X., Nabil, M., Gupta, K., Homaifar, A., Hemmati, V., and Opoku, D. 2024. NCAT12-DET: A New Benchmark Dataset for Surface Defect Detection and a Comparative Study. IEEE Access 12, 72607--72619. Online publication date: 2024.
https://doi.org/10.1109/ACCESS.2024.3402668

Index Terms

Object class detection: A survey
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding

Published In

ACM Computing Surveys Volume 46, Issue 1

October 2013

551 pages

ISSN:0360-0300

EISSN:1557-7341

DOI:10.1145/2522968

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 July 2013

Accepted: 01 January 2013

Revised: 01 September 2012

Received: 01 May 2011

Published in CSUR Volume 46, Issue 1

Qualifiers

Research-article
Research
Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 82
Total Downloads: 3,080
Downloads (Last 12 months): 47
Downloads (Last 6 weeks): 5

Reflects downloads up to 30 Jan 2025

Citations

Azurmendi, I., Zulueta, E., Lopez-Guede, J., Azkarate, J., and González, M. 2023. Cooktop Sensing Based on a YOLO Object Detection Algorithm. Sensors 23, 5, 2780. Online publication date: 3-Mar-2023.
https://doi.org/10.3390/s23052780

Lee, K., Wang, B., and Lee, S. 2023. Analysis of YOLOv5 and DeepLabv3+ Algorithms for Detecting Illegal Cultivation on Public Land: A Case Study of a Riverside in Korea. International Journal of Environmental Research and Public Health 20, 3, 1770. Online publication date: 18-Jan-2023.
https://doi.org/10.3390/ijerph20031770

Guo, Y., Kumazawa, I., and Kaku, C. 2023. Scale Variant Vehicle Object Recognition by CNN Module of Multi-Pooling-PCA Process. Journal of Intelligent and Connected Vehicles 6, 4, 227--236. Online publication date: Dec-2023.
https://doi.org/10.26599/JICV.2023.9210017

Sampaio, I., Viterbo, J., and Guerin, J. 2023. Improving robustness of industrial object detection by automatic generation of synthetic images from CAD models. Computational Intelligence. Online publication date: 27-Mar-2023.
https://doi.org/10.1111/coin.12572

Maji, S., Kriti Narayan Shukla, A., Arya, V., and Gupta, P. 2023. Object Detection Its Progress and Principles. In 2023 12th International Conference on System Modeling & Advancement in Research Trends (SMART), 225--233. Online publication date: 22-Dec-2023.
https://doi.org/10.1109/SMART59791.2023.10428260

Chenjin, J., Zaixin, C., Yuan, W., Qing, W., Huihui, J., and Longping, C. 2023. Intelligent Tools and Whole-process Control System Based on Artificial Intelligence and Internet of Things Technology. In 2023 Panda Forum on Power and Energy (PandaFPE), 530--537. Online publication date: Apr-2023.
https://doi.org/10.1109/PandaFPE57779.2023.10141397

Sai, K., Ajay, K., and Ramesh, M. 2023. Multimodal Speaker Recognition: Combining FFT, CNN, Speech-to-Text, BERT-Based Punctuation Restoration and Sentence Correction. In 2023 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES), 1--7. Online publication date: 14-Dec-2023.
https://doi.org/10.1109/ICSES60034.2023.10465391
