Object class detection: A survey

Published: 11 July 2013

Abstract

Object class detection, also known as category-level object detection, has become one of the most active research areas in computer vision in the new century. This article attempts to provide a comprehensive survey of recent technical achievements in this area. More than 270 major publications are included in this survey, covering different aspects of the research: (i) problem description: key tasks and challenges; (ii) core techniques: appearance modeling, localization strategies, and supervised classification methods; (iii) evaluation issues: approaches, metrics, standard datasets, and state-of-the-art results; and (iv) new developments: particularly new approaches and applications motivated by the recent boom of social images. Finally, looking back at what has been achieved so far, the survey also discusses what the future may hold for object class detection research.

Supplementary Material

a10-zhang-apndx.pdf (zhang.zip)
Supplemental movie, appendix, image, and software files for "Object class detection: A survey"

References

[1]
Aggarwal, J. K. and Ryoo, M. S. 2011. Human activity analysis: A review. ACM Comput. Surv. 43, 1--43.
[2]
Alexe, B., Deselaers, T., and Ferrari, V. 2010. What is an object? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[3]
An, S. J., Peursum, P., Liu, W. Q., and Venkatesh, S. 2009. Efficient algorithms for subwindow search in object detection and localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[4]
Andriluka, M., Roth, S., and Schiele, B. 2009. Pictorial structures revisited: People detection and articulated pose estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[5]
Arbelaez, P., Maire, M., Fowlkes, C., and Malik, J. 2009. From contours to regions: An empirical evaluation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[6]
Atkins, C. B. 2008. Blocked recursive image composition. In Proceedings of the ACM International Conference on Multimedia (ACM/MM'08).
[7]
Aytar, Y. and Zisserman, A. 2011. Tabula rasa: Model transfer for object category detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'11).
[8]
Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L. 2008. Speeded-up robust features (surf). Comput Vis. Image Understand. 110, 346--359.
[9]
Bay, H., Tuytelaars, T., and Van Gool, L. 2006. SURF: Speeded up robust features. In Proceedings of the European Conference on Computer Vision (ECCV'06).
[10]
Belongie, S., Malik, J., and Puzicha, J. 2001. Matching shapes. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'01).
[11]
Belongie, S., Malik, J., and Puzicha, J. 2002. Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24, 509--522.
[12]
Bentley, J. 1984. Programming pearls: Algorithm design techniques. Comm. ACM 27, 865--873.
[13]
Biederman, I., Mezzanotte, R., and Rabinowitz, J. 1982. Scene perception: Detecting and judging objects undergoing relational violations. Cogn. Psychol. 14, 143--177.
[14]
Blei, D. M., Ng, A. Y., and Jordan, M. I. 2003. Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993--1022.
[15]
Boiman, O., Shechtman, E., and Irani, M. 2008. In defense of nearest-neighbor based image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[16]
Borenstein, E., Sharon, E., and Ullman, S. 2004. Combining top-down and bottom-up segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04).
[17]
Borenstein, E. and Ullman, S. 2008. Combined top-down/bottom-up segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 30, 2109--2125.
[18]
Bosch, A., Zisserman, A., and Munoz, X. 2007a. Representing shape with a spatial pyramid kernel. In Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR'07).
[19]
Bosch, A., Zisserman, A., and Muoz, X. 2007b. Image classification using random forests and ferns. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).
[20]
Bouchard, G. and Triggs, B. 2005. Hierarchical part-based visual object categorization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).
[21]
Boureau, Y. L., Bach, F., Lecun, Y., and Ponce, J. 2010. Learning mid-level features for recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[22]
Bray, M., Kohli, P., and Torr, P. 2006. PoseCut: Simultaneous segmentation and 3d pose estimation of humans using dynamic graph-cuts. In Proceedings of the European Conference on Computer Vision (ECCV'06).
[23]
Cai, H. P., Yan, F., and Mikolajczyk, K. 2010. Learning weights for codebook in image classification and retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[24]
Cao, Y., Wang, C. H., Li, Z. W., Zhang, L. Q., and Zhang, L. 2010. Spatial-bag-of-features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[25]
Carneiro, G. and Lowe, D. 2006. Sparse flexible models of local features. In Proceedings of the European Conference on Computer Vision (ECCV'06).
[26]
Carreira, J., Li, F., and Sminchisescu, C. 2011. Object recognition by sequential figure-ground ranking. Int. J. Comput. Vis. 98, 3, 243--262.
[27]
Carreira, J. and Sminchisescu, C. 2010. Constrained parametric min-cuts for automatic object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[28]
Chen, T., Cheng, M.-M., Tan, P., Shamir, A., and Hu, S.-M. 2009. Sketch2Photo: Internet image montage. In Proceedings of the ACM SIGGRAPH Asia Papers.
[29]
Chen, Y., Zhu, L. L., Li, C. L., Yuille, A., and Zhang, H. 2007. Rapid inference on a novel and/or graph for object detection, segmentation and parsing. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'07).
[30]
Chia, A. Y. S., Rahardja, S., Rajan, D., and Leung, M. K. H. 2009. Structural descriptors for category level object detection. IEEE Trans. Multimedia 11, 1407--1421.
[31]
Christoudias, C. M., Urtasun, R., and Darrell, T. 2008. Unsupervised feature selection via distributed coding for multi-view object recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[32]
Crandall, D., Felzenszwalb, P., and Huttenlocher, D. 2005. Spatial priors for part-based recognition using statistical models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).
[33]
Csurka, G., Dance, C., Fan, L., Willamowski, J., and Bray, C. 2004. Visual categorization with bags of keypoints. In Proceedings of the ECCV Workshop on Statistical Learning in Computer Vision (ECCVW'04).
[34]
Csurka, G., Dance, C., Perronnin, F., and Willamowski, J. 2006. Generic visual categorization using weak geometry. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 207--224.
[35]
Dalal, N. 2006. Finding people in images and videos. Tech. rep., Institut National Polytechnique de Grenoble.
[36]
Dalal, N. and Triggs, B. 2005. Histograms of oriented gradients for human detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).
[37]
Dalal, N., Triggs, B., and Schmid, C. 2006. Human detection using oriented histograms of flow and appearance. In Proceedings of the European Conference on Computer Vision (ECCV'06).
[38]
Datta, R., Joshi, D., Li, J., and Wang, J. Z. 2008. Image retrieval: Ideas, influences, and trends of the new age. ACM Comput. Surv. 40, 1--60.
[39]
Deselaers, T. and Ferrari, V. 2010. Global and efficient self-similarity for object classification and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[40]
Dickinson, S. 2009. The evolution of object categorization and the challenge of image abstraction. In Object Categorization: Computer and Human Vision Perspectives, A. L. S. Dickinson, B. Schiele, and M. Tarr, Eds., Cambridge University Press, 1--37.
[41]
Divvala, S. K., Hoiem, D., Hays, J. H., Efros, A. A., and Hebert, M. 2009. An empirical study of context in object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[42]
Dollar, P., Belongie, S., and Perona, P. 2010. The fastest pedestrian detector in the west. In Proceedings of the British Machine Vision Conference (BMVC'10). BMVA Press.
[43]
Dollar, P., Tu, Z., Perona, P., and Belongie, S. 2009. Integral channel features. In Proceedings of the British Machine Vision Conference (BMVC'09).
[44]
Dollar, P., Wojek, C., Schiele, B., and Perona, P. 2011. Pedestrian detection: An evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 34, 4, 743--761.
[45]
Endres, I. and Hoiem, D. 2010. Category independent object proposals. In Proceedings of the European Conference on Computer Vision (ECCV'10).
[46]
Enzweiler, M. and Gavrila, D. M. 2008. A mixed generative-discriminative framework for pedestrian classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[47]
Everingham, M., Van Gool, L., Williams, C., Winn, J., and Zisserman, A. 2010. The pascal visual object classes (voc) challenge. Int. J. Comput. Vis. 88, 2, 303--338.
[48]
Fan, J. P., Shen, Y., Zhou, N., and Gao, Y. L. 2010. Harvesting large-scale weakly-tagged image databases from the web. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[49]
Fei-Fei, L., Fergus, R., and Perona, P. 2004. Learning generative visual models from few training examples: An incremental bayesian approach tested on 101 object categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04).
[50]
Fei-Fei, L. and Perona, P. 2005. A Bayesian hierarchical model for learning natural scene categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04).
[51]
Fei-Fei, L., Vanrullen, R., Koch, C., and Perona, P. 2002. Rapid natural scene categorization in the near absence of attention. Proc. Nat. Acad. Sci. 2, 9596--9601.
[52]
Fei-Fei, L., Fergus, R., and Torralba, A. 2005. Recognizing and learning object categories. In International Conference on Computer Vision Short Course (ICCV'05). MIT.
[53]
Fei-Fei, L., Fergus, R., and Torralba, A. 2007. Recognizing and learning object categories. In Computer Vision and Pattern Recognition Short Course (CVPR'07).
[54]
Fei-Fei, L., Fergus, R., and Torralba, A. 2009. Recognizing and learning object categories. In International Conference on Computer Vision Short Course (ICCV'09).
[55]
Felleman, D. J. and Van Essen, D. C. 1991. Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex 1, 1--47.
[56]
Felzenszwalb, P., Mcallester, D., and Ramanan, D. 2008. A discriminatively trained, multiscale, deformable part model. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[57]
Felzenszwalb, P. F., Girshick, R. B., and Mcallester, D. 2010a. Cascade object detection with deformable part models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[58]
Felzenszwalb, P. F., Girshick, R. B., Mcallester, D., and Ramanan, D. 2010b. Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1627--1645.
[59]
Felzenszwalb, P. F. and Huttenlocher, D. P. 2005. Pictorial structures for object recognition. Int. J. Comput. Vis. 61, 55--79.
[60]
Felzenszwalb, P. F. and Veksler, O. 2010. Tiered scene labeling with dynamic programming. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[61]
Ferencz, A., Learned-Miller, E., and Malik, J. 2008. Learning to locate informative features for visual identification. Int. J. Comput. Vis. 77, 3--24.
[62]
Fergus, R., Li, F.-F., Perona, P., and Zisserman, A. 2010. Learning object categories from internet image searches. Proc. IEEE. 98, 1453--1466.
[63]
Fergus, R., Perona, P., and Zisserman, A. 2003. Object class recognition by unsupervised scale-invariant learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'03).
[64]
Fergus, R., Perona, P., and Zisserman, A. 2005. A sparse object category model for efficient learning and exhaustive recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).
[65]
Fergus, R., Perona, P., and Zisserman, A. 2007. Weakly supervised scale-invariant learning of models for visual recognition. Int. J. Comput. Vis. 71, 273--303.
[66]
Ferrari, V., Fevrier, L., Jurie, F., and Schmid, C. 2008. Groups of adjacent contour segments for object detection. IEEE Trans. Pattern Anal. Mach. Intell. 30, 36--51.
[67]
Fischler, M. A. and Elschlager, R. A. 1973. The representation and matching of pictorial structures. IEEE Trans. Comput. C-22, 67--92.
[68]
Fleuret, F. and Geman, D. 2001. Coarse-to-fine face detection. Int. J. Comput. Vis. 41, 85--107.
[69]
Fulkerson, B., Vedaldi, A., and Soatto, S. 2009. Class segmentation and object localization with superpixel neighborhoods. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).
[70]
Gall, J. and Lempitsky, V. 2009. Class-specific hough forests for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[71]
Gallagher, A., Neustaedter, C., Cao, L., Luo, J., and Chen, T. 2008. Image annotation using personal calendars as context. In Proceedings of the ACM International Conference on Multimedia (ACM/MM'08).
[72]
Gallagher, A. C. and Chen, T. 2008. Estimating age, gender, and identity using first name priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[73]
Galleguillos, C. and Belongie, S. 2010. Context based object categorization: A critical survey. Comput Vis. Image Understand. 114, 712--722.
[74]
Galleguillos, C., Mcfee, B., Belongie, S., and Lanckriet, G. 2010. Multi-class object localization by combining local contextual interactions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[75]
Gehler, P. and Nowozin, S. 2009. On feature combination for multiclass object classification. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).
[76]
Girshick, R. B., Felzenszwalb, P. F., and Mcallester, D. 2011. Object detection with grammar models. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'11).
[77]
Gonfaus, J. M., Boix, X., Van de Weijer, J., Bagdanov, A. D., Serrat, J., and Gonzalez, J. 2010. Harmony potentials for joint classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[78]
Gould, S., Fulton, R., and Koller, D. 2009a. Decomposing a scene into geometric and semantically consistent regions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[79]
Gould, S., Gao, T. S., and Koller, D. 2009b. Region-based segmentation and object detection. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'09).
[80]
Grabner, H., Roth, P. M., and Bischof, H. 2007. Eigenboosting: Combining discriminative and generative information. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).
[81]
Grauman, K. and Leibe, B. 2011. Visual object recognition. Synthesis Lectures Artif. Intell. Mach. Learn. 5, 1--181.
[82]
Griffin, G., Holub, A., and Perona, P. 2007. Caltech-256 object category dataset. Tech. rep., California Institute of Technology, 1-20. http://authors.library.caltech.edu/7694/1/CNS-TR-2007-001.pdf.
[83]
Gu, C. H., Lim, J. J., Arbelaez, P., and Malik, J. 2009. Recognition using regions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[84]
Guillaumin, M., Verbeek, J., and Schmid, C. 2010. Multimodal semi-supervised learning for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[85]
Hays, J. and Efros, A. A. 2008. IM2GPS: Estimating geographic information from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[86]
He, X. M., Zemel, R., and Ray, D. 2006. Learning and incorporating top-down cues in image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV'06).
[87]
He, X. M., Zemel, R. S., and Carreira-Perpinan, M. A. 2004. Multiscale conditional random fields for image labeling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04).
[88]
Heitz, G., Elidan, G., Packer, B., and Koller, D. 2009. Shape-based object localization for descriptive classification. Int. J. Comput. Vis. 84, 40--62.
[89]
Hochstein, S. and Ahissar, M. 2002. View from the top: Hierarchies and reverse hierarchies in the visual system. Neuron 36, 791--804.
[90]
Hofmann, T. 2001. Unsupervised learning by probabilistic latent semantic analysis. Mach. Learn. 42, 177--196.
[91]
Hoiem, D., Efros, A., and Hebert, M. 2008. Putting objects in perspective. Int. J. Comput. Vis. 80, 3--15.
[92]
Hoiem, D., Rother, C., and Winn, J. 2007a. 3D layoutcrf for multi-view object class recognition and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).
[93]
Hoiem, D., Stein, A., Efros, A., and Hebert, M. 2007b. Recovering occlusion boundaries from a single image. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).
[94]
Huang, Y. Z., Huang, K. Q., Wang, L. S., Tao, D. C., Tan, T. N., and Li, X. L. 2008. Enhanced biologically inspired model. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[95]
Hwang, S. J. and Grauman, K. 2010. Reading between the lines: Object localization using implicit cues from image tags. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[96]
Jain, A., Gupta, A., and Davis, L. 2010. Learning what and how of contextual models for scene labeling. In Proceedings of the European Conference on Computer Vision (ECCV'10).
[97]
Jhuang, H., Serre, T., Wolf, L., and Poggio, T. 2007. A biologically inspired system for action recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).
[98]
Ji, R. R., Yao, H. X., Sun, X. S., Zhong, B. N., and Gao, W. 2010. Towards semantic embedding in visual vocabulary. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[99]
Jiang, Y.-G., Yang, J., Ngo, C.-W., and Hauptmann, A. G. 2010. Representations of keypoint-based semantic concept detection: A comprehensive study. IEEE Trans. Multimedia 12, 42--53.
[100]
Joachims, T. 1997. A probabilistic analysis of the rocchio algorithm with tfidf for text categorization. In Proceedings of the International Conference on Machine Learning (ICML'97).
[101]
Joachims, T. 1998. Making large-scale support vector machine learning practical. In Advances in Kernel Methods: Support Vector Machines, B. Scholkopf, J. C. Burges, and A. J. Smola, Eds. MIT Press, Cambridge, MA, 169--184.
[102]
Jones, J. P. and Palmer, L. A. 1987. An evaluation of the two-dimensional gabor filter model of simple receptive fields in cat striate cortex. J. Neurophys. 58, 1233--1258.
[103]
Jurie, F. and Triggs, B. 2005. Creating efficient codebooks for visual recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05).
[104]
Karlinsky, L., Dinerstein, M., Harari, D., and Ullman, S. 2010. The chains model for detecting parts by their context. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[105]
Ke, Y. and Sukthankar, R. 2004. PCA-sift: A more distinctive representation for local image descriptors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04).
[106]
Knopp, J., Prasad, M., and Gool, L. V. 2011. Scene cut: Class-specific object detection and segmentation in 3d scenes. In Proceedings of the International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT'11).
[107]
Koh, K., Kim, S.-J., and Boyd, S. 2007. An interior-point method for large-scale l1-regularized logistic regression. J. Mach. Learn. Res. 8, 1519--1555.
[108]
Kohli, P., Ladicky, L., and Torr, P. 2008. Robust higher order potentials for enforcing label consistency. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[109]
Kotsiantis, S. B. 2007. Supervised machine learning: A review of classification techniques. Informatica 31, 249--268.
[110]
Krüger, V., Kragic, D., Ude, A., and Geib, C. 2007. The meaning of action: A review on action recognition and mapping. Advan. Robot. 21, 1473--1501.
[111]
Kuettel, D. and Ferrari, V. 2012. Figure-ground segmentation by transferring window masks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'12).
[112]
Kumar, M. P., Ton, P. H. S., and Zisserman, A. 2005. OBJCUT. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).
[113]
Kumar, M. P., Torr, P. H. S., and Zisserman, A. 2010. OBJCUT: Efficient segmentation using top-down and bottom-up cues. IEEE Trans. Pattern Anal. Mach. Intell. 32, 530--545.
[114]
Kumar, N., Belhumeur, P., and Nayar, S. 2008. FaceTracer: A search engine for large collections of images with faces. In Proceedings of the European Conference on Computer Vision (ECCV'08).
[115]
Ladicky, L., Sturgess, P., Alahari, K., Russell, C., and Torr, P. 2010. What, where and how many? Combining object detectors and crfs. In Proceedings of the European Conference on Computer Vision (ECCV'10).
[116]
Lalonde, J.-F., Hoiem, D., Efros, A. A., Rother, C., Winn, J., and Criminisi, A. 2007. Photo clip art. In Proceedings of the International Conference and Exhibition on Computer Graphics and Interactive Techniques (ACM/SIGGRAPH'07).
[117]
Lalonde, J.-F., Narasimhan, S. G., and Efros, A. A. 2010. What do the sun and the sky tell us about the camera? Int. J. Comput. Vis. 88, 24--51.
[118]
Lampert, C. H., Blaschko, M. B., and Hofmann, T. 2008. Beyond sliding windows: Object localization by efficient sub-window search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[119]
Laptev, I. 2006. Improvements of object detection using boosted histograms. In Proceedings of the British Machine Vision Conference (BMVC'06).
[120]
Laptev, I. 2009. Improving object detection with boosted histograms. Image Vis. Comput. 27, 535--544.
[121]
Larlus, D. and Jurie, F. 2008. Combining appearance models and markov random fields for category level object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[122]
Larlus, D., Verbeek, J., and Jurie, F. 2010. Category level object segmentation by combining bag-of-words models with dirichlet processes and random fields. Int. J. Comput. Vision 88, 238--253.
[123]
Lasserre, J. A., Bishop, C. M., and Minka, T. P. 2006. Principled hybrids of generative and discriminative models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).
[124]
Lazebnik, S., Schmid, C., and Ponce, J. 2006. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).
[125]
Lee, H., Battle, A., Raina, R., and Ng, A. Y. 2006. Efficient sparse coding algorithms. Adv. Neural Inf. Process. Syst. 19, 2007.
[126]
Lee, Y. J. and Grauman, K. 2010. Object-graphs for context-aware category discovery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[127]
Leibe, B., Leonardis, A., and Schiele, B. 2004. Combined object categorization and segmentation with an implicit shape model. In Proceedings of the ECCV Workshop on Statistical Learning in Computer Vision (ECCVW'04).
[128]
Leibe, B., Leonardis, A., and Schiele, B. 2006. An implicit shape model for combined object categorization and segmentation. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 508--524.
[129]
Leibe, B., Leonardis, A., and Schiele, B. 2008. Robust object detection with interleaved categorization and segmentation. Int. J. Comput. Vis. 77, 259--289.
[130]
Leibe, B., Seemann, E., and Schiele, B. 2005. Pedestrian detection in crowded scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).
[131]
Lempitsky, V., Kohli, P., Rother, C., and Sharp, T. 2009. Image segmentation with a bounding box prior. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).
[132]
Levin, A. and Weiss, Y. 2006. Learning to combine bottom-up and top-down segmentation. In Proceedings of the European Conference on Computer Vision (ECCV'06).
[133]
Li, L.-J. and Fei-Fei, L. 2007. What, where and who? Classifying events by scene and object recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).
[134]
Li, L.-J. and Fei-Fei, L. 2010. OPTIMOL: Automatic online picture collection via incremental model learning. Int. J. Comput. Vis. 88, 147--168.
[135]
Liang, P. and Jordan, M. I. 2008. An asymptotic analysis of generative, discriminative, and pseudolikelihood estimators. In Proceedings of the International Conference on Machine Learning (ICML'08).
[136]
Liebelt, J. and Schmid, C. 2010. Multi-view object class detection with a 3d geometric model. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[137]
Liebelt, J., Schmid, C., and Schertler, K. 2008. Viewpoint-independent object class detection using 3d feature maps. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[138]
Lin, D., Kapoor, A., Hua, G., and Baker, S. 2010. Joint people, event, and location recognition in personal photo collections using cross-domain context. In Proceedings of the European Conference on Computer Vision (ECCV'10).
[139]
Lin, Z. 2009. Modeling shape, appearance and motion for human movement analysis. Tech. rep., Department of Electrical and Computer Engineering, University of Maryland, College Park, Md. http://hdl.handle.net/1903/9279.
[140]
Lin, Z. and Davis, L. S. 2010. Shape-based human detection and segmentation via hierarchical part-template matching. IEEE Trans. Pattern Anal. Mach. Intell. 32, 604--618.
[141]
Lin, Z., Davis, L. S., Doermann, D., and Dementhon, D. 2007. Hierarchical part-template matching for human detection and segmentation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).
[142]
Liu, C., Yuen, J., and Torralba, A. 2009a. Nonparametric scene parsing: Label transfer via dense scene alignment. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[143]
Liu, C., Yuen, J., Torralba, A., Sivic, J., and Freeman, W. 2008. SIFT flow: Dense correspondence across different scenes. In Proceedings of the European Conference on Computer Vision (ECCV'08).
[144]
Liu, T., Wang, J. D., Sun, J., Zheng, N. N., Tang, X. O., and Shum, H. Y. 2009b. Picture collage. IEEE Trans. Multimedia 11, 1225--1239.
[145]
Lowe, D. G. 2004. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60, 91--110.
[146]
Lu, Z. W. and Ip, H. H. S. 2009. Image categorization with spatial mismatch kernels. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[147]
Luo, J., Boutell, M., and Brown, C. 2006. Pictures are not taken in a vacuum. IEEE Signal Process. Mag. 23, 101--114.
[148]
Maire, M., Yu, S. X., and Perona, P. 2011. Object detection and segmentation from joint embedding of parts and pixels. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'11).
[149]
Maji, S., Berg, A. C., and Malik, J. 2008. Classification using intersection kernel support vector machines is efficient. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[150]
Malisiewicz, T. and Efros, A. A. 2008. Recognition by association via learning per-exemplar distances. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[151]
Marszałek, M. and Schmid, C. 2007. Accurate object localization with shape masks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).
[152]
Mikolajczyk, K. and Schmid, C. 2005. A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27, 1615--1630.
[153]
Moosmann, F., Nowak, E., and Jurie, F. 2008. Randomized clustering forests for image classification. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1632--1646.
[154]
Mu, Y., Yan, S., Liu, Y., Huang, T., and Zhou, B. 2008. Discriminative local binary patterns for human detection in personal album. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[155]
Mutch, J. and Lowe, D. 2008. Object class recognition and localization using sparse features with limited receptive fields. Int. J. Comput. Vis. 80, 45--57.
[156]
Mutch, J. and Lowe, D. G. 2006. Multiclass object recognition with sparse, localized features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).
[157]
Nakayama, H., Harada, T., and Kuniyoshi, Y. 2010. Global gaussian approach for scene categorization using information geometry. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[158]
Narasimhan, S. and Nayar, S. 2002. Vision and the atmosphere. Int. J. Comput. Vis. 48, 233--254.
[159]
Ng, A. and Jordan, M. 2002. On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'02).
[160]
Ni, B. B., Yan, S. C., and Kassim, A. 2009. Contextualizing histogram. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[161]
Nister, D. and Stewenius, H. 2006. Scalable recognition with a vocabulary tree. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).
[162]
Nowak, E., Jurie, F., and Triggs, B. 2006. Sampling strategies for bag-of-features image classification. In Proceedings of the European Conference on Computer Vision (ECCV'06).
[163]
Ojala, T., Pietikainen, M., and Maenpaa, T. 2002. Multi-resolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24, 971--987.
[164]
Oliva, A. and Torralba, A. 2001. Modeling the shape of the scene: A holistic representation of the spatial envelope. Int. J. Comput. Vis. 42, 145--175.
[165]
Oliva, A. and Torralba, A. 2006. Building the gist of a scene: The role of global image features in recognition. Progress Brain Res. 155, 23--36.
[166]
Oliva, A. and Torralba, A. 2007. The role of context in object recognition. Trends Cogn. Sci. 11, 520--527.
[167]
Opelt, A., Pinz, A., and Zisserman, A. 2006. A boundary-fragment-model for object detection. In Proceedings of the European Conference on Computer Vision (ECCV'06).
[168]
Palmese, M. and Trucco, A. 2008. From 3-d sonar images to augmented reality models for objects buried on the seafloor. IEEE Trans. Instrument. Measure. 57, 820--828.
[169]
Parikh, D. and Zitnick, C. L. 2010. The role of features, algorithms and data in visual recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[170]
Park, D., Ramanan, D., and Fowlkes, C. 2010. Multiresolution models for object detection. In Proceedings of the European Conference on Computer Vision (ECCV'10).
[171]
Pedersoli, M., Vedaldi, A., and Gonzalez, J. 2011. A coarse-to-fine approach for fast deformable object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).
[172]
Perronnin, F. 2008. Universal and adapted vocabularies for generic visual categorization. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1243--1256.
[173]
Perrotton, X., Sturzel, M., and Roux, M. 2010. Implicit hierarchical boosting for multi-view object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[174]
Pinz, A. 2005. Object categorization. Foundat. Trends Comput. Graph. Vis. 1, 4, 255--353.
[175]
Ponce, J., Berg, T. L., Everingham, M., Forsyth, D. A., Hebert, M., Lazebnik, S., Marszałek, M., Schmid, C., Russell, B. C., Torralba, A., Williams, C. K. I., Zhang, J., and Zisserman, A. 2006a. Dataset issues in object recognition. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman Eds., Springer, 29--48.
[176]
Ponce, J., Hebert, M., Schmid, C., and Zisserman, A. 2006b. Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol. 4170, Springer.
[177]
Porikli, F. 2005. Integral histogram: A fast way to extract histograms in cartesian spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).
[178]
Rabinovich, A., Vedaldi, A., Galleguillos, C., Wiewiora, E., and Belongie, S. 2007. Objects in context. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).
[179]
Ravishankar, S., Jain, A., and Mittal, A. 2008. Multi-stage contour based detection of deformable objects. In Proceedings of the European Conference on Computer Vision (ECCV'08).
[180]
Razavi, N., Gall, J., and Van Gool, L. 2010. Backprojection revisited: Scalable multi-view object detection and similarity metrics for detections. In Proceedings of the European Conference on Computer Vision (ECCV'10).
[181]
Ren, X. and Malik, J. 2003. Learning a classification model for segmentation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'03).
[182]
Riesenhuber, M. and Poggio, T. 1999. Hierarchical models of object recognition in cortex. Nature Neurosci. 2, 1019--1025.
[183]
Rother, C., Bordeaux, L., Hamadi, Y., and Blake, A. 2006. AutoCollage. ACM Trans. Graph. 25, 3, 847--852.
[184]
Rother, C., Kolmogorov, V., and Blake, A. 2004. “GrabCut”: Interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. 23, 3, 309--314.
[185]
Rubinstein, D. and Hastie, T. 1997. Discriminative vs informative learning. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDDM'97).
[186]
Rubner, Y., Tomasi, C., and Guibas, L. J. 2000. The earth mover's distance as a metric for image retrieval. Int. J. Comput. Vis. 40, 99--121.
[187]
Rui, X., Li, M., Li, Z., Ma, W.-Y., and Yu, N. 2007. Bipartite graph reinforcement model for web image annotation. In Proceedings of the ACM International Conference on Multimedia (ACM/MM'07).
[188]
Russell, B., Torralba, A., Liu, C., Fergus, R., and Freeman, W. 2007. Object recognition by scene alignment. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'07).
[189]
Russell, B., Torralba, A., Murphy, K., and Freeman, W. 2008. LabelMe: A database and web-based tool for image annotation. Int. J. Comput. Vis. 77, 157--173.
[190]
Sabzmeydani, P. and Mori, G. 2007. Detecting pedestrians by learning shapelet features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).
[191]
Saffari, A., Godec, M., Pock, T., Leistner, C., and Bischof, H. 2010. Online multi-class lpboost. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[192]
Salakhutdinov, R., Torralba, A., and Tenenbaum, J. 2011. Learning to share visual appearance for multiclass object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).
[193]
Salzmann, M. and Urtasun, R. 2010. Combining discriminative and generative methods for 3d deformable surface and articulated pose reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[194]
Savarese, S. and Li, F.-F. 2007. 3D generic object categorization, localization and pose estimation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).
[195]
Savarese, S., Winn, J., and Criminisi, A. 2006. Discriminative object class models of appearance and shape by correlatons. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).
[196]
Schindler, K., Van Gool, L., and De Gelder, B. 2008. Recognizing emotions expressed by body pose: A biologically inspired neural model. Neural Netw. 21, 1238--1246.
[197]
Schroff, F. 2009. Semantic image segmentation and web-supervised visual learning. Tech. rep., Robotics Research Group, Department of Engineering Science, University of Oxford, Oxford, UK. http://www.robots.ox.ac.uk/~vgg/publications/papers/schroff09.pdf.
[198]
Seemann, E., Leibe, B., and Schiele, B. 2006. Multi-aspect detection of articulated objects. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).
[199]
Serre, T., Oliva, A., and Poggio, T. 2007a. A feed-forward architecture accounts for rapid categorization. Proc. National Acad. Sci. 104, 6424--6429.
[200]
Serre, T., Wolf, L., Bileschi, S., Riesenhuber, M., and Poggio, T. 2007b. Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell. 29, 411--426.
[201]
Serre, T., Wolf, L., and Poggio, T. 2005. Object recognition with features inspired by visual cortex. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).
[202]
Shechtman, E. and Irani, M. 2007. Matching local self-similarities across images and videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).
[203]
Shin, Y., Kim, Y., and Kim, E. Y. 2010. Automatic textile image annotation by predicting emotional concepts from visual features. Image Vis. Comput. 28, 526--537.
[204]
Shotton, J., Blake, A., and Cipolla, R. 2005. Contour-based learning for object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05).
[205]
Shotton, J., Blake, A., and Cipolla, R. 2008a. Multiscale categorical object recognition using contour fragments. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1270--1281.
[206]
Shotton, J., Johnson, M., and Cipolla, R. 2008b. Semantic texton forests for image categorization and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[207]
Shotton, J., Winn, J., Rother, C., and Criminisi, A. 2006. TextonBoost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation. In Proceedings of the European Conference on Computer Vision (ECCV'06).
[208]
Shotton, J., Winn, J., Rother, C., and Criminisi, A. 2009. TextonBoost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. Int. J. Comput. Vis. 81, 2--23.
[209]
Simon, I. and Seitz, S. 2008. Scene segmentation using the wisdom of crowds. In Proceedings of the European Conference on Computer Vision (ECCV'08).
[210]
Sivic, J., Russell, B. C., Efros, A. A., Zisserman, A., and Freeman, W. T. 2005. Discovering objects and their location in images. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05).
[211]
Sivic, J. and Zisserman, A. 2003. Video google: Text retrieval approach to object matching in videos. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'03).
[212]
Snavely, N., Simon, I., Goesele, M., Szeliski, R., and Seitz, S. M. 2010. Scene reconstruction and visualization from community photo collections. Proc. IEEE. 98, 1370--1390.
[213]
Song, D. J. and Tao, D. C. 2010. Biologically inspired feature manifold for scene classification. IEEE Trans. Image Process. 19, 174--184.
[214]
Song, Z., Chen, Q., Huang, Z., Hua, Y., and Yan, S. 2011. Contextualizing object detection and classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).
[215]
Sonnenburg, S., Rätsch, G., Schäfer, C., and Schölkopf, B. 2006. Large scale multiple kernel learning. J. Mach. Learn. Res. 7, 1531--1565.
[216]
Strat, T. 1993. Employing contextual information in computer vision. In Proceedings of the ARPA Image Understanding Workshop. 217--229.
[217]
Sutton, C. and McCallum, A. 2006. An introduction to conditional random fields for relational learning. In Introduction to Statistical Relational Learning, L. Getoor and B. Taskar, Eds., MIT Press. http://people.cs.umass.edu/~mccallum/papers/crf-tutorial.pdf.
[218]
Szeliski, R. 2010. Computer Vision: Algorithms and Applications. Springer.
[219]
Tao, L., Yuan, L., and Sun, J. 2009. SkyFinder: Attribute-based sky image search. In ACM SIGGRAPH Papers.
[220]
Thomas, A., Ferrari, V., Leibe, B., Tuytelaars, T., Schiele, B., and Van Gool, L. 2006. Towards multi-view object class detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).
[221]
Torralba, A. 2003. Contextual priming for object detection. Int. J. Comput. Vis. 53, 169--191.
[222]
Torralba, A., Fergus, R., and Freeman, W. T. 2008. 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1958--1970.
[223]
Torralba, A., Murphy, K., and Freeman, W. 2006. Shared features for multiclass object detection. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 345--361.
[224]
Torralba, A., Murphy, K. P., and Freeman, W. T. 2004. Sharing features: Efficient boosting procedures for multiclass object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04).
[225]
Torralba, A., Murphy, K. P., Freeman, W. T., and Rubin, M. A. 2003. Context-based vision system for place and object recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'03).
[226]
Tu, Z. W. 2007. Learning generative models via discriminative approaches. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).
[227]
Ulusoy, I. and Bishop, C. 2006. Comparison of generative and discriminative techniques for object detection and classification. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 173--195.
[228]
Ulusoy, I. and Bishop, C. M. 2005. Generative versus discriminative methods for object recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).
[229]
Van De Sande, K., Gevers, T., and Snoek, C. 2008. Evaluation of color descriptors for object and scene recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[230]
Van De Sande, K., Gevers, T., and Snoek, C. 2010. Evaluating color descriptors for object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1582--1596.
[231]
Van De Sande, K., Uijlings, J., Gevers, T., and Smeulders, A. 2011. Segmentation as selective search for object recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'11).
[232]
Van Gemert, J. C., Veenman, C. J., Smeulders, A. W. M., and Geusebroek, J. M. 2010. Visual word ambiguity. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1271--1283.
[233]
Vapnik, V. N. 1998. Statistical Learning Theory. A Wiley-Interscience Publication, New York.
[234]
Varma, M. and Ray, D. 2007. Learning the discriminative power-invariance trade-off. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).
[235]
Vedaldi, A., Gulshan, V., Varma, M., and Zisserman, A. 2009. Multiple kernels for object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).
[236]
Verbeek, J. and Triggs, B. 2007a. Region classification with markov field aspect models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).
[237]
Verbeek, J. and Triggs, B. 2007b. Scene segmentation with conditional random fields learned from partially labeled images. In Proceedings of the Conference on Advances in Neural Information Processing Systems. (NIPS'07).
[238]
Vijayanarasimhan, S. and Grauman, K. 2011. Efficient region search for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).
[239]
Viola, P. and Jones, M. 2001. Rapid object detection using a boosted cascade of simple features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'01).
[240]
Walk, S., Majer, N., Schindler, K., and Schiele, B. 2010. New features and insights for pedestrian detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[241]
Wang, G., Gallagher, A., Luo, J., and Forsyth, D. 2010a. Seeing people in social context: Recognizing people and social relationships. In Proceedings of the European Conference on Computer Vision (ECCV'10).
[242]
Wang, G., Hoiem, D., and Forsyth, D. 2009a. Building text features for object image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[243]
Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., and Gong, Y. 2010b. Locality-constrained linear coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[244]
Wang, X. and Grimson, E. 2007. Spatial latent dirichlet allocation. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'07).
[245]
Wang, X., Han, T. X., and Yan, S. 2009b. An hog-lbp human detector with partial occlusion handling. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).
[246]
Wang, Y. and Mori, G. 2009. Max-margin hidden conditional random fields for human action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[247]
Wang, Y. and Mori, G. 2010. Hidden part models for human action recognition: Probabilistic vs. max-margin. IEEE Trans. Pattern Anal. Mach. Intell. 33, 7, 1310--1323.
[248]
Wang, Z., Hu, Y., and Chia, L.-T. 2010c. Image-to-class distance metric learning for image classification. In Proceedings of the European Conference on Computer Vision (ECCV'10).
[249]
Watanabe, T., Ito, S., and Yokoi, K. 2009. Co-occurrence histograms of oriented gradients for pedestrian detection. In Proceedings of the Pacific-Rim Symposium on Image and Video Technology (PSIVT'09).
[250]
Wei, Y. C. and Tao, L. T. 2010. Efficient histogram-based sliding window. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[251]
Winn, J., Criminisi, A., and Minka, T. 2005. Object categorization by learned universal visual dictionary. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05).
[252]
Wnuk, K. and Soatto, S. 2008. Filtering internet image search results towards keyword based category recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[253]
Wojek, C. and Schiele, B. 2008. A performance evaluation of single and multi-feature people detection. In Proceedings of the German Association for Pattern Recognition (DAGM'08).
[254]
Wojek, C., Walk, S., and Schiele, B. 2009. Multi-cue onboard pedestrian detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[255]
Wright, J., Yi, M., Mairal, J., Sapiro, G., Huang, T. S., and Shuicheng, Y. 2010. Sparse representation for computer vision and pattern recognition. Proc. IEEE 98, 1031--1044.
[256]
Wu, B. and Nevatia, R. 2005. Detection of multiple, partially occluded humans in a single image by bayesian combination of edgelet part detectors. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05).
[257]
Wu, B. and Nevatia, R. 2007a. Cluster boosted tree classifier for multi-view, multi-pose object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).
[258]
Wu, B. and Nevatia, R. 2007b. Improving part based object detection by unsupervised, online boosting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).
[259]
Wu, B. and Nevatia, R. 2007c. Simultaneous object detection and segmentation by boosting local shape feature based classifier. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).
[260]
Wu, Z., Ke, Q. F., Isard, M., and Sun, J. 2009. Bundling features for large scale partial-duplicate web image search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[261]
Xiang, Y., Zhou, X. D., Liu, Z. T., Chua, T. S., and Ngo, C.-W. 2010. Semantic context modeling with maximal margin conditional random fields for automatic image annotation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[262]
Xu, H., Zhou, X., Wang, M., Xiang, Y., and Shi, B. 2009. Exploring flickr's related tags for semantic annotation of web images. In Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR'09).
[263]
Xu, Z., Chen, H., Zhu, S.-C., and Luo, J. 2008. A hierarchical compositional model for face representation and sketching. IEEE Trans. Pattern Anal. Mach. Intell. 30, 955--969.
[264]
Xue, J.-H. 2008. Aspects of generative and discriminative classifiers. Tech. rep., Information and Mathematical Sciences, Department of Statistics, University of Glasgow.
[265]
Xue, J.-H. and Titterington, D. 2008. Comment on “on discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes”. Neural Process. Lett. 28, 169--187.
[266]
Xue, J.-H. and Titterington, D. M. 2010. On the generative-discriminative tradeoff approach: Interpretation, asymptotic efficiency and classification performance. Comput. Statist. Data Anal. 54, 438--451.
[267]
Yan, P. K., Khan, S. M., and Shah, M. 2007. 3D model based object class detection in an arbitrary view. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).
[268]
Yang, B., Mei, T., Sun, L.-F., Yang, S.-Q., and Hua, X.-S. 2008a. Free-shaped video collage. In Proceedings of the 14th International Conference on Advances in Multimedia Modeling.
[269]
Yang, J. C., Yu, K., Gong, Y. H., and Huang, T. 2009. Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[270]
Yang, L., Jin, R., Sukthankar, R., and Jurie, F. 2008b. Unifying discriminative visual codebook generation with classifier training for object category recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).
[271]
Yang, Y. and Ramanan, D. 2011. Articulated pose estimation with flexible mixtures-of-parts. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).
[272]
Yao, B. Z., Yang, X., Lin, L., Lee, M. W., and Zhu, S. C. 2010. I2T: Image parsing to text description. Proc. IEEE. 98, 1485--1508.
[273]
Yeh, T., Lee, J. J., and Darrell, T. 2009. Fast concurrent object localization and recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).
[274]
Yu, C. N. J. and Joachims, T. 2009. Learning structural svms with latent variables. In Proceedings of the International Conference on Machine Learning (ICML'09).
[275]
Zhang, C., Liu, J., Tian, Q., Xu, C., Lu, H., and Ma, S. 2011a. Image classification by non-negative sparse coding, low-rank and sparse decomposition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).
[276]
Zhang, D. Q. and Chang, S. F. 2006. A generative-discriminative hybrid method for multi-view object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).
[277]
Zhang, J., Marszałek, M., Lazebnik, S., and Schmid, C. 2007. Local features and kernels for classification of texture and object categories: A comprehensive study. Int. J. Comput. Vis. 73, 213--238.
[278]
Zhang, J. G., Huang, K. Q., Yu, Y. N., and Tan, T. N. 2011b. Boosted local structured hog-lbp for object localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11).
[279]
Zhang, Z. Q., Cao, Y., Salvi, D., Oliver, K., Waggoner, J., and Wang, S. 2010. Free-shape subwindow search for object localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[280]
Zheng, W. S., Gong, S. G., and Xiang, T. 2009. Quantifying contextual information for object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).
[281]
Zhu, L., Chen, Y., Lin, C., and Yuille, A. 2011. Max margin learning of hierarchical configural deformable templates (hcdts) for efficient object parsing and pose estimation. Int. J. Comput. Vis. 93, 1--21.
[282]
Zhu, L., Chen, Y. H., Yuille, A., and Freeman, W. 2010. Latent hierarchical structural learning for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).
[283]
Zhu, M. 2004. Recall, precision and average precision. Working paper, University of Waterloo.
[284]
Zhu, Q., Yeh, M. C., Cheng, K. T., and Avidan, S. 2006. Fast human detection using a cascade of histograms of oriented gradients. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06).
[285]
Zhu, S.-C. and Mumford, D. 2006. A stochastic grammar of images. Foundations Trends Comput. Graph. Vis. 2, 259--362.

Published In

ACM Computing Surveys, Volume 46, Issue 1
October 2013
551 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/2522968
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 July 2013
Accepted: 01 January 2013
Revised: 01 September 2012
Received: 01 May 2011
Published in CSUR Volume 46, Issue 1

Author Tags

  1. Object class detection
  2. appearance model
  3. categorization
  4. evaluation
  5. intra-class appearance variation
  6. segmentation
  7. social images

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2025) Deep learning for object recognition: A comprehensive review of models and algorithms. International Journal of Cognitive Computing in Engineering. DOI: 10.1016/j.ijcce.2025.01.004. Online publication date: Jan-2025
  • (2024) A Method of Masked Face Detection. The Journal of The Institute of Image Information and Television Engineers 78:1 (131-141). DOI: 10.3169/itej.78.131. Online publication date: 2024
  • (2024) NCAT12-DET: A New Benchmark Dataset for Surface Defect Detection and a Comparative Study. IEEE Access 12 (72607-72619). DOI: 10.1109/ACCESS.2024.3402668. Online publication date: 2024
  • (2023) Cooktop Sensing Based on a YOLO Object Detection Algorithm. Sensors 23:5 (2780). DOI: 10.3390/s23052780. Online publication date: 3-Mar-2023
  • (2023) Analysis of YOLOv5 and DeepLabv3+ Algorithms for Detecting Illegal Cultivation on Public Land: A Case Study of a Riverside in Korea. International Journal of Environmental Research and Public Health 20:3 (1770). DOI: 10.3390/ijerph20031770. Online publication date: 18-Jan-2023
  • (2023) Scale Variant Vehicle Object Recognition by CNN Module of Multi-Pooling-PCA Process. Journal of Intelligent and Connected Vehicles 6:4 (227-236). DOI: 10.26599/JICV.2023.9210017. Online publication date: Dec-2023
  • (2023) Improving robustness of industrial object detection by automatic generation of synthetic images from CAD models. Computational Intelligence. DOI: 10.1111/coin.12572. Online publication date: 27-Mar-2023
  • (2023) Object Detection Its Progress and Principles. 2023 12th International Conference on System Modeling & Advancement in Research Trends (SMART) (225-233). DOI: 10.1109/SMART59791.2023.10428260. Online publication date: 22-Dec-2023
  • (2023) Intelligent Tools and Whole-process Control System Based on Artificial Intelligence and Internet of Things Technology. 2023 Panda Forum on Power and Energy (PandaFPE) (530-537). DOI: 10.1109/PandaFPE57779.2023.10141397. Online publication date: Apr-2023
  • (2023) Multimodal Speaker Recognition: Combining FFT, CNN, Speech-to-Text, BERT-Based Punctuation Restoration and Sentence Correction. 2023 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES) (1-7). DOI: 10.1109/ICSES60034.2023.10465391. Online publication date: 14-Dec-2023
