Abstract
Object class detection, also known as category-level object detection, has become one of the most actively studied areas in computer vision in the new century. This article attempts to provide a comprehensive survey of recent technical achievements in this area. More than 270 major publications are included, covering different aspects of the research: (i) problem description: key tasks and challenges; (ii) core techniques: appearance modeling, localization strategies, and supervised classification methods; (iii) evaluation issues: approaches, metrics, standard datasets, and state-of-the-art results; and (iv) new developments: particularly new approaches and applications motivated by the recent boom of social images. Finally, in retrospect of what has been achieved so far, the survey discusses what the future may hold for object class detection research.
- Aggarwal, J. K. and Ryoo, M. S. 2011. Human activity analysis: A review. ACM Comput. Surv. 43, 1--43. Google ScholarDigital Library
- Alexe, B., Deselaers, T., and Ferrari, V. 2010. What is an object? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- An, S. J., Peursum, P., Liu, W. Q., and Venkatesh, S. 2009. Efficient algorithms for subwindow search in object detection and localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Andriluka, M., Roth, S., and Schiele, B. 2009. Pictorial structures revisited: People detection and articulated pose estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Arbelaez, P., Maire, M., Fowlkes, C., and Malik, J. 2009. From contours to regions: An empirical evaluation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Atkins, C. B. 2008. Blocked recursive image composition. In Proceedings of the ACM International Conference on Multimedia (ACM/MM'08). Google ScholarDigital Library
- Aytar, Y. and Zisserman, A. 2011. Tabula rasa: Model transfer for object category detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'11). Google ScholarDigital Library
- Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L. 2008. Speeded-up robust features (surf). Comput Vis. Image Understand. 110, 346--359. Google ScholarDigital Library
- Bay, H., Tuytelaars, T., and Van Gool, L. 2006. SURF: Speeded up robust features. In Proceedings of the European Conference on Computer Vision (ECCV'06). Google ScholarDigital Library
- Belongie, S., Malik, J., and Puzicha, J. 2001. Matching shapes. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'01).Google Scholar
- Belongie, S., Malik, J., and Puzicha, J. 2002. Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24, 509--522. Google ScholarDigital Library
- Bentley, J. 1984. Programming pearls: Algorithm design techniques. Comm. ACM 27, 865--873. Google ScholarDigital Library
- Biederman, I., Mezzanotte, R., and Rabinowitz, J. 1982. Scene perception: Detecting and judging objects undergoing relational violations. Cogn. Psychol. 14, 143--177.Google ScholarCross Ref
- Blei, D. M., Ng, A. Y., and Jordan, M. I. 2003. Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993--1022. Google ScholarCross Ref
- Boiman, O., Shechtman, E., and Irani, M. 2008. In defense of nearest-neighbor based image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Borenstein, E., Sharon, E., and Ullman, S. 2004. Combining top-down and bottom-up segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04). Google ScholarDigital Library
- Borenstein, E. and Ullman, S. 2008. Combined top-down/bottom-up segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 30, 2109--2125. Google ScholarDigital Library
- Bosch, A., Zisserman, A., and Munoz, X. 2007a. Representing shape with a spatial pyramid kernel. In Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR'07). Google ScholarDigital Library
- Bosch, A., Zisserman, A., and Muoz, X. 2007b. Image classification using random forests and ferns. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).Google Scholar
- Bouchard, G. and Triggs, B. 2005. Hierarchical part-based visual object categorization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05). Google ScholarDigital Library
- Boureau, Y. L., Bach, F., Lecun, Y., and Ponce, J. 2010. Learning mid-level features for recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Bray, M., Kohli, P., and Torr, P. 2006. PoseCut: Simultaneous segmentation and 3d pose estimation of humans using dynamic graph-cuts. In Proceedings of the European Conference on Computer Vision (ECCV'06). Google ScholarDigital Library
- Cai, H. P., Yan, F., and Mikolajczyk, K. 2010. Learning weights for codebook in image classification and retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Cao, Y., Wang, C. H., Li, Z. W., Zhang, L. Q., and Zhang, L. 2010. Spatial-bag-of-features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Carneiro, G. and Lowe, D. 2006. Sparse flexible models of local features. In Proceedings of the European Conference on Computer Vision (ECCV'06). Google ScholarDigital Library
- Carreira, J., Li, F., and Sminchisescu, C. 2011. Object recognition by sequential figure-ground ranking. Int. J. Comput. Vis. 98, 3, 243--262. Google ScholarDigital Library
- Carreira, J. and Sminchisescu, C. 2010. Constrained parametric min-cuts for automatic object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Chen, T., Cheng, M.-M., Tan, P., Shamir, A., and Hu, S.-M. 2009. Sketch2Photo: Internet image montage. In Proceedings of the ACM SIGGRAPH Asia Papers. Google ScholarDigital Library
- Chen, Y., Zhu, L. L., Li, C. L., Yuille, A., and Zhang, H. 2007. Rapid inference on a novel and/or graph for object detection, segmentation and parsing. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'07).Google Scholar
- Chia, A. Y. S., Rahardja, S., Rajan, D., and Leung, M. K. H. 2009. Structural descriptors for category level object detection. IEEE Trans. Multimedia 11, 1407--1421. Google ScholarDigital Library
- Christoudias, C. M., Urtasun, R., and Darrell, T. 2008. Unsupervised feature selection via distributed coding for multi-view object recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Crandall, D., Felzenszwalb, P., and Huttenlocher, D. 2005. Spatial priors for part-based recognition using statistical models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05). Google ScholarDigital Library
- Csurka, G., Dance, C., Fan, L., Willamowski, J., and Bray, C. 2004. Visual categorization with bags of keypoints. In Proceedings of the ECCV Workshop on Statistical Learning in Computer Vision (ECCVW'04).Google Scholar
- Csurka, G., Dance, C., Perronnin, F., and Willamowski, J. 2006. Generic visual categorization using weak geometry. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 207--224.Google Scholar
- Dalal, N. 2006. Finding people in images and videos. Tech. rep., Institut National Polytechnique de Grenoble.Google Scholar
- Dalal, N. and Triggs, B. 2005. Histograms of oriented gradients for human detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05). Google ScholarDigital Library
- Dalal, N., Triggs, B., and Schmid, C. 2006. Human detection using oriented histograms of flow and appearance. In Proceedings of the European Conference on Computer Vision (ECCV'06). Google ScholarDigital Library
- Datta, R., Joshi, D., Li, J., and Wang, J. Z. 2008. Image retrieval: Ideas, influences, and trends of the new age. ACM Comput. Surv. 40, 1--60. Google ScholarDigital Library
- Deselaers, T. and Ferrari, V. 2010. Global and efficient self-similarity for object classification and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Dickinson, S. 2009. The evolution of object categorization and the challenge of image abstraction. In Object Categorization: Computer and Human Vision Perspectives, A. L. S. Dickinson, B. Schiele, and M. Tarr, Eds., Cambridge University Press, 1--37.Google Scholar
- Divvala, S. K., Hoiem, D., Hays, J. H., Efros, A. A., and Hebert, M. 2009. An empirical study of context in object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Dollar, P., Belongie, S., and Perona, P. 2010. The fastest pedestrian detector in the west. In Proceedings of the British Machine Vision Conference (BMVC'10). BMVA Press.Google Scholar
- Dollar, P., Tu, Z., Perona, P., and Belongie, S. 2009. Integral channel features. In Proceedings of the British Machine Vision Conference (BMVC'09).Google Scholar
- Dollar, P., Wojek, C., Schiele, B., and Perona, P. 2011. Pedestrian detection: An evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 34, 4, 743--761. Google ScholarDigital Library
- Endres, I. and Hoiem, D. 2010. Category independent object proposals. In Proceedings of the European Conference on Computer Vision (ECCV'10). Google ScholarDigital Library
- Enzweiler, M. and Gavrila, D. M. 2008. A mixed generative-discriminative framework for pedestrian classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Everingham, M., Van Gool, L., Williams, C., Winn, J., and Zisserman, A. 2010. The pascal visual object classes (voc) challenge. Int. J. Comput. Vis. 88, 2, 303--338. Google ScholarDigital Library
- Fan, J. P., Shen, Y., Zhou, N., and Gao, Y. L. 2010. Harvesting large-scale weakly-tagged image databases from the web. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Fei-Fei, L., Fergus, R., and Perona, P. 2004. Learning generative visual models from few training examples: An incremental bayesian approach tested on 101 object categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04). Google ScholarDigital Library
- Fei-Fei, L. and Perona, P. 2005. A Bayesian hierarchical model for learning natural scene categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04). Google ScholarDigital Library
- Fei-Fei, L., Vanrullen, R., Koch, C., and Perona, P. 2002. Rapid natural scene categorization in the near absence of attention. Proc. Nat. Acad. Sci. 2, 9596--9601.Google Scholar
- Fei-Fei, L., Fergus, R., and Torralba, A. 2005. Recognizing and learning object categories. In International Conference on Computer Vision Short Course (ICCV'05). MIT.Google Scholar
- Fei-Fei, L., Fergus, R., and Torralba, A. 2007. Recognizing and learning object categories. In Computer Vision and Pattern Recognition Short Course (CVPR'07).Google Scholar
- Fei-Fei, L., Fergus, R., and Torralba, A. 2009. Recognizing and learning object categories. In International Conference on Computer Vision Short Course (ICCV'09).Google Scholar
- Felleman, D. J. and Van Essen, D. C. 1991. Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex 1, 1--47.Google ScholarCross Ref
- Felzenszwalb, P., Mcallester, D., and Ramanan, D. 2008. A discriminatively trained, multiscale, deformable part model. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Felzenszwalb, P. F., Girshick, R. B., and Mcallester, D. 2010a. Cascade object detection with deformable part models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Felzenszwalb, P. F., Girshick, R. B., Mcallester, D., and Ramanan, D. 2010b. Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1627--1645. Google ScholarDigital Library
- Felzenszwalb, P. F. and Huttenlocher, D. P. 2005. Pictorial structures for object recognition. Int. J. Comput. Vis. 61, 55--79. Google ScholarDigital Library
- Felzenszwalb, P. F. and Veksler, O. 2010. Tiered scene labeling with dynamic programming. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Ferencz, A., Learned-Miller, E., and Malik, J. 2008. Learning to locate informative features for visual identification. Int. J. Comput. Vis. 77, 3--24. Google ScholarDigital Library
- Fergus, R., Li, F.-F., Perona, P., and Zisserman, A. 2010. Learning object categories from internet image searches. Proc. IEEE. 98, 1453--1466.Google ScholarCross Ref
- Fergus, R., Perona, P., and Zisserman, A. 2003. Object class recognition by unsupervised scale-invariant learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'03).Google Scholar
- Fergus, R., Perona, P., and Zisserman, A. 2005. A sparse object category model for efficient learning and exhaustive recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05). Google ScholarDigital Library
- Fergus, R., Perona, P., and Zisserman, A. 2007. Weakly supervised scale-invariant learning of models for visual recognition. Int. J. Comput. Vis. 71, 273--303. Google ScholarDigital Library
- Ferrari, V., Fevrier, L., Jurie, F., and Schmid, C. 2008. Groups of adjacent contour segments for object detection. IEEE Trans. Pattern Anal. Mach. Intell. 30, 36--51. Google ScholarDigital Library
- Fischler, M. A. and Elschlager, R. A. 1973. The representation and matching of pictorial structures. IEEE Trans. Comput. C-22, 67--92. Google ScholarDigital Library
- Fleuret, F. and Geman, D. 2001. Coarse-to-fine face detection. Int. J. Comput. Vis. 41, 85--107. Google ScholarDigital Library
- Fulkerson, B., Vedaldi, A., and Soatto, S. 2009. Class segmentation and object localization with superpixel neighborhoods. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).Google Scholar
- Gall, J. and Lempitsky, V. 2009. Class-specific hough forests for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Gallagher, A., Neustaedter, C., Cao, L., Luo, J., and Chen, T. 2008. Image annotation using personal calendars as context. In Proceedings of the ACM International Conference on Multimedia (ACM/MM'08). Google ScholarDigital Library
- Gallagher, A. C. and Chen, T. 2008. Estimating age, gender, and identity using first name priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Galleguillos, C. and Belongie, S. 2010. Context based object categorization: A critical survey. Comput Vis. Image Understand. 114, 712--722. Google ScholarDigital Library
- Galleguillos, C., Mcfee, B., Belongie, S., and Lanckriet, G. 2010. Multi-class object localization by combining local contextual interactions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Gehler, P. and Nowozin, S. 2009. On feature combination for multiclass object classification. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).Google Scholar
- Girshick, R. B., Felzenszwalb, P. F., and Mcallester, D. 2011. Object detection with grammar models. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'11).Google Scholar
- Gonfaus, J. M., Boix, X., Van de Weijer, J., Bagdanov, A. D., Serrat, J., and Gonzalez, J. 2010. Harmony potentials for joint classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Gould, S., Fulton, R., and Koller, D. 2009a. Decomposing a scene into geometric and semantically consistent regions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Gould, S., Gao, T. S., and Koller, D. 2009b. Region-based segmentation and object detection. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'09).Google Scholar
- Grabner, H., Roth, P. M., and Bischof, H. 2007. Eigenboosting: Combining discriminative and generative information. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).Google Scholar
- Grauman, K. and Leibe, B. 2011. Visual object recognition. Synthesis Lectures Artif. Intell. Mach. Learn. 5, 1--181.Google ScholarCross Ref
- Griffin, G., Holub, A., and Perona, P. 2007. Caltech-256 object category dataset. Tech. rep., California Institute of Technology, 1-20. http://authors.library.caltech.edu/7694/1/CNS-TR-2007-001.pdf.Google Scholar
- Gu, C. H., Lim, J. J., Arbelaez, P., and Malik, J. 2009. Recognition using regions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Guillaumin, M., Verbeek, J., and Schmid, C. 2010. Multimodal semi-supervised learning for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Hays, J. and Efros, A. A. 2008. IM2GPS: Estimating geographic information from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- He, X. M., Zemel, R., and Ray, D. 2006. Learning and incorporating top-down cues in image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV'06). Google ScholarDigital Library
- He, X. M., Zemel, R. S., and Carreira-Perpinan, M. A. 2004. Multiscale conditional random fields for image labeling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04). Google ScholarDigital Library
- Heitz, G., Elidan, G., Packer, B., and Koller, D. 2009. Shape-based object localization for descriptive classification. Int. J. Comput. Vis. 84, 40--62. Google ScholarDigital Library
- Hochstein, S. and Ahissar, M. 2002. View from the top: Hierarchies and reverse hierarchies in the visual system. Neuron 36, 791--804.Google ScholarCross Ref
- Hofmann, T. 2001. Unsupervised learning by probabilistic latent semantic analysis. Mach. Learn. 42, 177--196. Google ScholarDigital Library
- Hoiem, D., Efros, A., and Hebert, M. 2008. Putting objects in perspective. Int. J. Comput. Vis. 80, 3--15. Google ScholarDigital Library
- Hoiem, D., Rother, C., and Winn, J. 2007a. 3D layoutcrf for multi-view object class recognition and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).Google Scholar
- Hoiem, D., Stein, A., Efros, A., and Hebert, M. 2007b. Recovering occlusion boundaries from a single image. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).Google Scholar
- Huang, Y. Z., Huang, K. Q., Wang, L. S., Tao, D. C., Tan, T. N., and Li, X. L. 2008. Enhanced biologically inspired model. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Hwang, S. J. and Grauman, K. 2010. Reading between the lines: Object localization using implicit cues from image tags. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Jain, A., Gupta, A., and Davis, L. 2010. Learning what and how of contextual models for scene labeling. In Proceedings of the European Conference on Computer Vision (ECCV'10). Google ScholarDigital Library
- Jhuang, H., Serre, T., Wolf, L., and Poggio, T. 2007. A biologically inspired system for action recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).Google Scholar
- Ji, R. R., Yao, H. X., Sun, X. S., Zhong, B. N., and Gao, W. 2010. Towards semantic embedding in visual vocabulary. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Jiang, Y.-G., Yang, J., Ngo, C.-W., and Hauptmann, A. G. 2010. Representations of keypoint-based semantic concept detection: A comprehensive study. IEEE Trans. Multimedia 12, 42--53. Google ScholarDigital Library
- Joachims, T. 1997. A probabilistic analysis of the rocchio algorithm with tfidf for text categorization. In Proceedings of the International Conference on Machine Learning (ICML'97). Google ScholarDigital Library
- Joachims, T. 1998. Making large-scale support vector machine learning practical. In Advances in Kernel Methods: Support Vector Machines, B. Scholkopf, J. C. Burges, and A. J. Smola, Eds. MIT Press, Cambridge, MA, 169--184. Google ScholarDigital Library
- Jones, J. P. and Palmer, L. A. 1987. An evaluation of the two-dimensional gabor filter model of simple receptive fields in cat striate cortex. J. Neurophys. 58, 1233--1258.Google ScholarCross Ref
- Jurie, F. and Triggs, B. 2005. Creating efficient codebooks for visual recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05). Google ScholarDigital Library
- Karlinsky, L., Dinerstein, M., Harari, D., and Ullman, S. 2010. The chains model for detecting parts by their context. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Ke, Y. and Sukthankar, R. 2004. PCA-sift: A more distinctive representation for local image descriptors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04). Google ScholarDigital Library
- Knopp, J., Prasad, M., and Gool, L. V. 2011. Scene cut: Class-specific object detection and segmentation in 3d scenes. In Proceedings of the International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT'11). Google ScholarDigital Library
- Koh, K., Kim, S.-J., and Boyd, S. 2007. An interior-point method for large-scale l1-regularized logistic regression. J. Mach. Learn. Res. 8, 1519--1555. Google ScholarDigital Library
- Kohli, P., Ladicky, L., and Torr, P. 2008. Robust higher order potentials for enforcing label consistency. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Kotsiantis, S. B. 2007. Supervised machine learning: A review of classification techniques. Informatica 31, 249--268.Google Scholar
- Krüger, V., Kragic, D., Ude, A., and Geib, C. 2007. The meaning of action: A review on action recognition and mapping. Advan. Robot. 21, 1473--1501.Google ScholarCross Ref
- Kuettel, D. and Ferrari, V. 2012. Figure-ground segmentation by transferring window masks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'12). Google ScholarDigital Library
- Kumar, M. P., Ton, P. H. S., and Zisserman, A. 2005. OBJCUT. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05).Google Scholar
- Kumar, M. P., Torr, P. H. S., and Zisserman, A. 2010. OBJCUT: Efficient segmentation using top-down and bottom-up cues. IEEE Trans. Pattern Anal. Mach. Intell. 32, 530--545. Google ScholarDigital Library
- Kumar, N., Belhumeur, P., and Nayar, S. 2008. FaceTracer: A search engine for large collections of images with faces. In Proceedings of the European Conference on Computer Vision (ECCV'08). Google ScholarDigital Library
- Ladicky, L., Sturgess, P., Alahari, K., Russell, C., and Torr, P. 2010. What, where and how many? Combining object detectors and crfs. In Proceedings of the European Conference on Computer Vision (ECCV'10). Google ScholarDigital Library
- Lalonde, J.-F., Hoiem, D., Efros, A. A., Rother, C., Winn, J., and Criminisi, A. 2007. Photo clip art. In Proceedings of the International Conference and Exhibition on Computer Graphics and Interactive Techniques (ACM/SIGGRAPH'07). Google ScholarDigital Library
- Lalonde, J.-F., Narasimhan, S. G., and Efros, A. A. 2010. What do the sun and the sky tell us about the camera? Int. J. Comput. Vis. 88, 24--51. Google ScholarDigital Library
- Lampert, C. H., Blaschko, M. B., and Hofmann, T. 2008. Beyond sliding windows: Object localization by efficient sub-window search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Laptev, I. 2006. Improvements of object detection using boosted histograms. In Proceedings of the British Machine Vision Conference (BMVC'06).Google ScholarCross Ref
- Laptev, I. 2009. Improving object detection with boosted histograms. Image Vis. Comput. 27, 535--544. Google ScholarDigital Library
- Larlus, D. and Jurie, F. 2008. Combining appearance models and markov random fields for category level object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Larlus, D., Verbeek, J., and Jurie, F. 2010. Category level object segmentation by combining bag-of-words models with dirichlet processes and random fields. Int. J. Comput. Vision 88, 238--253. Google ScholarDigital Library
- Lasserre, J. A., Bishop, C. M., and Minka, T. P. 2006. Principled hybrids of generative and discriminative models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06). Google ScholarDigital Library
- Lazebnik, S., Schmid, C., and Ponce, J. 2006. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06). Google ScholarDigital Library
- Lee, H., Battle, A., Raina, R., and Ng, A. Y. 2006. Efficient sparse coding algorithms. Adv. Neural Inf. Process. Syst. 19, 2007.Google Scholar
- Lee, Y. J. and Grauman, K. 2010. Object-graphs for context-aware category discovery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Leibe, B., Leonardis, A., and Schiele, B. 2004. Combined object categorization and segmentation with an implicit shape model. In Proceedings of the ECCV Workshop on Statistical Learning in Computer Vision (ECCVW'04).Google Scholar
- Leibe, B., Leonardis, A., and Schiele, B. 2006. An implicit shape model for combined object categorization and segmentation. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 508--524.Google Scholar
- Leibe, B., Leonardis, A., and Schiele, B. 2008. Robust object detection with interleaved categorization and segmentation. Int. J. Comput. Vis. 77, 259--289. Google ScholarDigital Library
- Leibe, B., Seemann, E., and Schiele, B. 2005. Pedestrian detection in crowded scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05). Google ScholarDigital Library
- Lempitsky, V., Kohli, P., Rother, C., and Sharp, T. 2009. Image segmentation with a bounding box prior. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).Google Scholar
- Levin, A. and Weiss, Y. 2006. Learning to combine bottom-up and top-down segmentation. In Proceedings of the European Conference on Computer Vision (ECCV'06). Google ScholarDigital Library
- Li, L.-J. and Fei-Fei, L. 2007. What, where and who? Classifying events by scene and object recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).Google Scholar
- Li, L.-J. and Fei-Fei, L. 2010. OPTIMOL: Automatic online picture collection via incremental model learning. Int. J. Comput. Vis. 88, 147--168. Google ScholarDigital Library
- Liang, P. and Jordan, M. I. 2008. An asymptotic analysis of generative, discriminative, and pseudolikelihood estimators. In Proceedings of the International Conference on Machine Learning (ICML'08). Google ScholarDigital Library
- Liebelt, J. and Schmid, C. 2010. Multi-view object class detection with a 3d geometric model. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Liebelt, J., Schmid, C., and Schertler, K. 2008. Viewpoint-independent object class detection using 3d feature maps. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Lin, D., Kapoor, A., Hua, G., and Baker, S. 2010. Joint people, event, and location recognition in personal photo collections using cross-domain context. In Proceedings of the European Conference on Computer Vision (ECCV'10). Google ScholarDigital Library
- Lin, Z. 2009. Modeling shape, appearance and motion for human movement analysis. Tech. rep., Department of Electrical and Computer Engineering, University of Maryland, College Park, Md. http://hdl.handle.net/1903/9279.Google Scholar
- Lin, Z. and Davis, L. S. 2010. Shape-based human detection and segmentation via hierarchical part-template matching. IEEE Trans. Pattern Anal. Mach. Intell. 32, 604--618. Google ScholarDigital Library
- Lin, Z., Davis, L. S., Doermann, D., and Dementhon, D. 2007. Hierarchical part-template matching for human detection and segmentation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).Google Scholar
- Liu, C., Yuen, J., and Torralba, A. 2009a. Nonparametric scene parsing: Label transfer via dense scene alignment. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Liu, C., Yuen, J., Torralba, A., Sivic, J., and Freeman, W. 2008. SIFT flow: Dense correspondence across different scenes. In Proceedings of the European Conference on Computer Vision (ECCV'08). Google ScholarDigital Library
- Liu, T., Wang, J. D., Sun, J., Zheng, N. N., Tang, X. O., and Shum, H. Y. 2009b. Picture collage. IEEE Trans. Multimedia 11, 1225--1239. Google ScholarDigital Library
- Lowe, D. G. 2004. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60, 91--110. Google ScholarDigital Library
- Lu, Z. W. and Ip, H. H. S. 2009. Image categorization with spatial mismatch kernels. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Luo, J., Boutell, M., and Brown, C. 2006. Pictures are not taken in a vacuum. IEEE Signal Process. Mag. 23, 101--114.Google ScholarCross Ref
- Maire, M., Yu, S. X., and Perona, P. 2011. Object detection and segmentation from joint embedding of parts and pixels. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'11). Google ScholarDigital Library
- Maji, S., Berg, A. C., and Malik, J. 2008. Classification using intersection kernel support vector machines is efficient. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Malisiewicz, T. and Efros, A. A. 2008. Recognition by association via learning per-exemplar distances. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Marszałek, M. and Schmid, C. 2007. Accurate object localization with shape masks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).Google Scholar
- Mikolajczyk, K. and Schmid, C. 2005. A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27, 1615--1630. Google ScholarDigital Library
- Moosmann, F., Nowak, E., and Jurie, F. 2008. Randomized clustering forests for image classification. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1632--1646. Google ScholarDigital Library
- Mu, Y., Yan, S., Liu, Y., Huang, T., and Zhou, B. 2008. Discriminative local binary patterns for human detection in personal album. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Mutch, J. and Lowe, D. 2008. Object class recognition and localization using sparse features with limited receptive fields. Int. J. Comput. Vis. 80, 45--57. Google ScholarDigital Library
- Mutch, J. and Lowe, D. G. 2006. Multiclass object recognition with sparse, localized features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06). Google ScholarDigital Library
- Nakayama, H., Harada, T., and Kuniyoshi, Y. 2010. Global gaussian approach for scene categorization using information geometry. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Narasimhan, S. and Nayar, S. 2002. Vision and the atmosphere. Int. J. Comput. Vis. 48, 233--254. Google ScholarDigital Library
- Ng, A. and Jordan, M. 2002. On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'02).Google Scholar
- Ni, B. B., Yan, S. C., and Kassim, A. 2009. Contextualizing histogram. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Nister, D. and Stewenius, H. 2006. Scalable recognition with a vocabulary tree. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06). Google ScholarDigital Library
- Nowak, E., Jurie, F., and Triggs, B. 2006. Sampling strategies for bag-of-features image classification. In Proceedings of the European Conference on Computer Vision (ECCV'06). Google ScholarDigital Library
- Ojala, T., Pietikainen, M., and Maenpaa, T. 2002. Multi-resolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24, 971--987. Google ScholarDigital Library
- Oliva, A. and Torralba, A. 2001. Modeling the shape of the scene: A holistic representation of the spatial envelope. Int. J. Comput. Vis. 42, 145--175. Google ScholarDigital Library
- Oliva, A. and Torralba, A. 2006. Building the gist of a scene: The role of global image features in recognition. Progress Brain Res. 155, 23--36.Google ScholarCross Ref
- Oliva, A. and Torralba, A. 2007. The role of context in object recognition. Trends Cogn. Sci. 11, 520--527.Google ScholarCross Ref
- Opelt, A., Pinz, A., and Zisserman, A. 2006. A boundary-fragment-model for object detection. In Proceedings of the European Conference on Computer Vision (ECCV'06). Google ScholarDigital Library
- Palmese, M. and Trucco, A. 2008. From 3-d sonar images to augmented reality models for objects buried on the seafloor. IEEE Trans. Instrument. Measure. 57, 820--828.Google ScholarCross Ref
- Parikh, D. and Zitnick, C. L. 2010. The role of features, algorithms and data in visual recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Park, D., Ramanan, D., and Fowlkes, C. 2010. Multiresolution models for object detection. In Proceedings of the European Conference on Computer Vision (ECCV'10). Google ScholarDigital Library
- Pedersoli, M., Vedaldi, A., and Gonzalez, J. 2011. A coarse-to-fine approach for fast deformable object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11). Google ScholarDigital Library
- Perronnin, F. 2008. Universal and adapted vocabularies for generic visual categorization. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1243--1256. Google ScholarDigital Library
- Perrotton, X., Sturzel, M., and Roux, M. 2010. Implicit hierarchical boosting for multi-view object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Pinz, A. 2005. Object categorization. Foundat. Trends Comput. Graph. Vis. 1, 4, 255--353. Google ScholarDigital Library
- Ponce, J., Berg, T. L., Everingham, M., Forsyth, D. A., Hebert, M., Lazebnik, S., Marszałek, M., Schmid, C., Russell, B. C., Torralba, A., Williams, C. K. I., Zhang, J., and Zisserman, A. 2006a. Dataset issues in object recognition. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman Eds., Springer, 29--48.Google Scholar
- Ponce, J., Hebert, M., Schmid, C., and Zisserman, A. 2006b. Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol. 4170, Springer. Google ScholarDigital Library
- Porikli, F. 2005. Integral histogram: A fast way to extract histograms in cartesian spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05). Google ScholarDigital Library
- Rabinovich, A., Vedaldi, A., Galleguillos, C., Wiewiora, E., and Belongie, S. 2007. Objects in context. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).Google Scholar
- Ravishankar, S., Jain, A., and Mittal, A. 2008. Multi-stage contour based detection of deformable objects. In Proceedings of the European Conference on Computer Vision (ECCV'08). Google ScholarDigital Library
- Razavi, N., Gall, J., and Van Gool, L. 2010. Backprojection revisited: Scalable multi-view object detection and similarity metrics for detections. In Proceedings of the European Conference on Computer Vision (ECCV'10). Google ScholarDigital Library
- Ren, X. and Malik, J. 2003. Learning a classification model for segmentation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'03). Google ScholarDigital Library
- Riesenhuber, M. and Poggio, T. 1999. Hierarchical models of object recognition in cortex. Nature Neurosci. 2, 1019--1025.Google ScholarCross Ref
- Rother, C., Bordeaux, L., Hamadi, Y., and Blake, A. 2006. AutoCollage. ACM Trans. Graph. 25, 3, 847--852. Google ScholarDigital Library
- Rother, C., Kolmogorov, V., and Blake, A. 2004. “GrabCut”: Interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. 23, 3, 309--314. Google ScholarDigital Library
- Rubinstein, D. and Hastie, T. 1997. Discriminative vs informative learning. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDDM'97).Google Scholar
- Rubner, Y., Tomasi, C., and Guibas, L. J. 2000. The earth mover's distance as a metric for image retrieval. Int. J. Comput. Vis. 40, 99--121. Google ScholarDigital Library
- Rui, X., Li, M., Li, Z., Ma, W.-Y., and Yu, N. 2007. Bipartite graph reinforcement model for web image annotation. In Proceedings of the ACM International Conference on Multimedia (ACM/MM'07). Google ScholarDigital Library
- Russell, B., Torralba, A., Liu, C., Fergus, R., and Freeman, W. 2007. Object recognition by scene alignment. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'07).Google Scholar
- Russell, B., Torralba, A., Murphy, K., and Freeman, W. 2008. LabelMe: A database and web-based tool for image annotation. Int. J. Comput. Vis. 77, 157--173. Google ScholarDigital Library
- Sabzmeydani, P. and Mori, G. 2007. Detecting pedestrians by learning shapelet features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).Google Scholar
- Saffari, A., Godec, M., Pock, T., Leistner, C., and Bischof, H. 2010. Online multi-class lpboost. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Salakhutdinov, R., Torralba, A., and Tenenbaum, J. 2011. Learning to share visual appearance for multiclass object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11). Google ScholarDigital Library
- Salzmann, M. and Urtasun, R. 2010. Combining discriminative and generative methods for 3d deformable surface and articulated pose reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Savarese, S. and Li, F.-F. 2007. 3D generic object categorization, localization and pose estimation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).Google Scholar
- Savarese, S., Winn, J., and Criminisi, A. 2006. Discriminative object class models of appearance and shape by correlatons. In Preceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06). Google ScholarDigital Library
- Schindler, K., Van Gool, L., and De Gelder, B. 2008. Recognizing emotions expressed by body pose: A biologically inspired neural model. Neural Netw. 21, 1238--1246. Google ScholarDigital Library
- Schroff, F. 2009. Semantic image segmentation and web-supervised visual learning. Tech. rep., Robotics Research Group, Department of Engineering Science. University of Oxford, Oxford, UK. http://www.robots.ox.ac.uk/∼vgg/publications/papers/schroff09.pdf.Google Scholar
- Seemann, E., Leibe, B., and Schiele, B. 2006. Multi-aspect detection of articulated objects. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06). Google ScholarDigital Library
- Serre, T., Oliva, A., and Poggio, T. 2007a. A feed-forward architecture accounts for rapid categorization. Proc. National Acad. Sci. 104, 6424--6429.Google ScholarCross Ref
- Serre, T., Wolf, L., Bileschi, S., Riesenhuber, M., and Poggio, T. 2007b. Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell. 29, 411--426. Google ScholarDigital Library
- Serre, T., Wolf, L., and Poggio, T. 2005. Object recognition with features inspired by visual cortex. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05). Google ScholarDigital Library
- Shechtman, E. and Irani, M. 2007. Matching local self-similarities across images and videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).Google Scholar
- Shin, Y., Kim, Y., and Kim, E. Y. 2010. Automatic textile image annotation by predicting emotional concepts from visual features. Image Vis. Comput. 28, 526--537. Google ScholarDigital Library
- Shotton, J., Blake, A., and Cipolla, R. 2005. Contour-based learning for object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05). Google ScholarDigital Library
- Shotton, J., Blake, A., and Cipolla, R. 2008a. Multiscale categorical object recognition using contour fragments. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1270--1281. Google ScholarDigital Library
- Shotton, J., Johnson, M., and Cipolla, R. 2008b. Semantic texton forests for image categorization and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Shotton, J., Winn, J., Rother, C., and Criminisi, A. 2006. TextonBoost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation. In Proceedings of the European Conference on Computer Vision (ECCV'06). Google ScholarDigital Library
- Shotton, J., Winn, J., Rother, C., and Criminisi, A. 2009. TextonBoost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. Int. J. Comput. Vis. 81, 2--23. Google ScholarDigital Library
- Simon, I. and Seitz, S. 2008. Scene segmentation using the wisdom of crowds. In Proceedings of the European Conference on Computer Vision (ECCV'08). Google ScholarDigital Library
- Sivic, J., Russell, B. C., Efros, A. A., Zisserman, A., and Freeman, W. T. 2005. Discovering objects and their location in images. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05). Google ScholarDigital Library
- Sivic, J. and Zisserman, A. 2003. Video google: Text retrieval approach to object matching in videos. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'03). Google ScholarDigital Library
- Snavely, N., Simon, I., Goesele, M., Szeliski, R., and Seitz, S. M. 2010. Scene reconstruction and visualization from community photo collections. Proc. IEEE. 98, 1370--1390.Google ScholarCross Ref
- Song, D. J. and Tao, D. C. 2010. Biologically inspired feature manifold for scene classification. IEEE Trans. Image Process. 19, 174--184. Google ScholarDigital Library
- Song, Z., Chen, Q., Huang, Z., Hua, Y., and Yan, S. 2011. Contextualizing object detection and classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11). Google ScholarDigital Library
- Sonnenburg, S., Rutsch, G., Schafer, C., and Scholkopf, B. 2006. Large scale multiple kernel learning. J. Mach. Learn. Res. 7, 1531--1565. Google ScholarDigital Library
- Strat, T. 1993. Employing contextual information in computer vision. In Proceedings of the ARPA Image Understanding Workshop. 217--229.Google Scholar
- Sutton, C. and McCallum, A. 2006. An introduction to conditional random fields for relational learning. In Introduction to Statistical Relational Learning, L. Getoor and B. Taskar, Eds., MIT Press. http://people.cs.umass.edu/∼mccallum/papers/crf-tutorial.pdf.Google Scholar
- Szeliski, R. 2010. Computer Vision: Algorithms and Applications. Springer. Google Scholar
- Tao, L., Yuan, L., and Sun, J. 2009. SkyFinder: Attribute-based sky image search. In ACM SIGGRAPH Papers. Google ScholarDigital Library
- Thomas, A., Ferrari, V., Leibe, B., Tuytelaars, T., Schiele, B., and Van Gool, L. 2006. Towards multi-view object class detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06). Google ScholarDigital Library
- Torralba, A. 2003. Contextual priming for object detection. Int. J. Comput. Vis. 53, 169--191. Google ScholarDigital Library
- Torralba, A., Fergus, R., and Freeman, W. T. 2008. 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1958--1970. Google ScholarDigital Library
- Torralba, A., Murphy, K., and Freeman, W. 2006. Shared features for multiclass object detection. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 345--361.Google Scholar
- Torralba, A., Murphy, K. P., and Freeman, W. T. 2004. Sharing features: Efficient boosting procedures for multiclass object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'04). Google ScholarDigital Library
- Torralba, A., Murphy, K. P., Freeman, W. T., and Rubin, M. A. 2003. Context-based vision system for place and object recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'03). Google ScholarDigital Library
- Tu, Z. W. 2007. Learning generative models via discriminative approaches. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).Google ScholarCross Ref
- Ulusoy, I. and Bishop, C. 2006. Comparison of generative and discriminative techniques for object detection and classification. In Toward Category-Level Object Recognition, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, Eds., Springer, 173--195.Google Scholar
- Ulusoy, I. and Bishop, C. M. 2005. Generative versus discriminative methods for object recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05). Google ScholarDigital Library
- Van De Sande, K., Gevers, T., and Snoek, C. 2008. Evaluation of color descriptors for object and scene recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Van De Sande, K., Gevers, T., and Snoek, C. 2010. Evaluating color descriptors for object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1582--1596. Google ScholarDigital Library
- Van De Sande, K., Uijlings, J., Gevers, T., and Smeulders, A. 2011. Segmentation as selective search for object recognition. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'11). Google ScholarDigital Library
- Van Gemert, J. C., Veenman, C. J., Smeulders, A. W. M., and Geusebroek, J. M. 2010. Visual word ambiguity. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1271--1283. Google ScholarDigital Library
- Vapnik, V. N. 1998. Statistical Learning Theory. A Wiley-Interscience Publication, New York.Google Scholar
- Varma, M. and Ray, D. 2007. Learning the discriminative power-invariance trade-off. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).Google Scholar
- Vedaldi, A., Gulshan, V., Varma, M., and Zisserman, A. 2009. Multiple kernels for object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).Google Scholar
- Verbeek, J. and Triggs, B. 2007a. Region classification with markov field aspect models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).Google Scholar
- Verbeek, J. and Triggs, B. 2007b. Scene segmentation with conditional random fields learned from partially labeled images. In Proceedings of the Conference on Advances in Neural Information Processing Systems. (NIPS'07).Google Scholar
- Vijayanarasimhan, S. and Grauman, K. 2011. Efficient region search for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11). Google ScholarDigital Library
- Viola, P. and Jones, M. 2001. Rapid object detection using a boosted cascade of simple features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'01).Google Scholar
- Walk, S., Majer, N., Schindler, K., and Schiele, B. 2010. New features and insights for pedestrian detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Wang, G., Gallagher, A., Luo, J., and Forsyth, D. 2010a. Seeing people in social context: Recognizing people and social relationships. In Proceedings of the European Conference on Computer Vision (ECCV'10). Google ScholarDigital Library
- Wang, G., Hoiem, D., and Forsyth, D. 2009a. Building text features for object image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., and Gong, Y. 2010b. Locality-constrained linear coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Wang, X. and Grimson, E. 2007. Spatial latent dirichlet allocation. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS'07).Google Scholar
- Wang, X., Han, T. X., and Yan, S. 2009b. An hog-lbp human detector with partial occlusion handling. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).Google Scholar
- Wang, Y. and Mori, G. 2009. Max-margin hidden conditional random fields for human action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Wang, Y. and Mori, G. 2010. Hidden part models for human action recognition: Probabilistic vs. max-margin. IEEE Trans. Pattern Anal. Mach. Intell. 33, 7, 1310--1323. Google ScholarDigital Library
- Wang, Z., Hu, Y., and Chia, L.-T. 2010c. Image-to-class distance metric learning for image classification. In Proceedings of the European Conference on Computer Vision (ECCV'10). Google ScholarDigital Library
- Watanabe, T., Ito, S., and Yokoi, K. 2009. Co-occurrence histograms of oriented gradients for pedestrian detection. In Proceedings of the Pacific-Rim Symposium on Image and Video Technology (PSIVT'09). Google ScholarDigital Library
- Wei, Y. C. and Tao, L.T. 2010. Efficient histogram-based sliding window. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Winn, J., Criminisi, A., and Minka, T. 2005. Object categorization by learned universal visual dictionary. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05). Google ScholarDigital Library
- Wnuk, K. and Soatto, S. 2008. Filtering internet image search results towards keyword based category recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Wojek, C. and Schiele, B. 2008. A performance evaluation of single and multi-feature people detection. In Proceedings of the German Association for Pattern Recognition (DAGM'08). Google ScholarDigital Library
- Wojek, C., Walk, S., and Schiele, B. 2009. Multi-cue onboard pedestrian detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Wright, J., Yi, M., Mairal, J., Sapiro, G., Huang, T. S., and Shuicheng, Y. 2010. Sparse representation for computer vision and pattern recognition. Proc. IEEE 98, 1031--1044.Google ScholarCross Ref
- Wu, B. and Nevatia, R. 2005. Detection of multiple, partially occluded humans in a single image by bayesian combination of edgelet part detectors. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'05). Google ScholarDigital Library
- Wu, B. and Nevatia, R. 2007a. Cluster boosted tree classifier for multi-view, multi-pose object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).Google Scholar
- Wu, B. and Nevatia, R. 2007b. Improving part based object detection by unsupervised, online boosting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07).Google Scholar
- Wu, B. and Nevatia, R. 2007c. Simultaneous object detection and segmentation by boosting local shape feature based classifier. In Proceedings of the IEEE Conference on Computer Vision and Pattern Reconition (CVPR'07).Google Scholar
- Wu, Z., Ke, Q. F., Isard, M., and Sun, J. 2009. Bundling features for large scale partial-duplicate web image search. In Proceedings of the IEEE Conferenc on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Xiang, Y., Zhou, X. D., Liu, Z. T., Chua, T. S., and Ngo, C.-W. 2010. Semantic context modeling with maximal margin conditional random fields for automatic image annotation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Xu, H., Zhou, X., Wang, M., Xiang, Y., and Shi, B. 2009. Exploring flickr's related tags for semantic annotation of web images. In Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR'09). Google ScholarDigital Library
- Xu, Z., Chen, H., Zhu, S.-C., and Luo, J. 2008. A hierarchical compositional model for face representation and sketching. IEEE Trans. Pattern Anal. Mach. Intell. 30, 955--969. Google ScholarDigital Library
- Xue, J.-H. 2008. Aspects of generative and discriminative classifiers. Tech. rep., Information and Mathematical Sciences, Department of Statistics, University of Glasgow.Google Scholar
- Xue, J.-H. and Titterington, D. 2008. Comment on “on discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes”. Neural Process. Lett. 28, 169--187. Google ScholarDigital Library
- Xue, J.-H. and Titterington, D. M. 2010. On the generative-discriminative tradeoff approach: Interpretation, asymptotic efficiency and classification performance. Comput. Statist. Data Anal. 54, 438--451. Google ScholarDigital Library
- Yan, P. K., Khan, S. M., and Shah, M. 2007. 3D model based object class detection in an arbitrary view. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'07).Google Scholar
- Yang, B., Mei, T., Sun, L.-F., Yang, S.-Q., and Hua, X.-S. 2008a. Free-shaped video collage. In Proceedings of the 14th International Conference on Advances in Multimedia Modeling. Google ScholarDigital Library
- Yang, J. C., Yu, K., Gong, Y. H., and Huang, T. 2009. Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Yang, L., Jin, R., Sukthankar, R., and Jurie, F. 2008b. Unifying discriminative visual codebook generation with classifier training for object category recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08).Google Scholar
- Yang, Y. and Ramanan, D. 2011. Articulated pose estimation with flexible mixtures-of-parts. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11). Google ScholarDigital Library
- Yao, B. Z., Yang, X., Lin, L., Lee, M. W., and Zhu, S. C. 2010. I2T: Image parsing to text description. Proc. IEEE. 98, 1485--1508.Google ScholarCross Ref
- Yeh, T., Lee, J. J., and Darrell, T. 2009. Fast concurrent object localization and recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09).Google Scholar
- Yu, C. N. J. and Joachims, T. 2009. Learning structural svms with latent variables. In Proceedings of the International Conference on Machine Learning (ICML'09). Google ScholarDigital Library
- Zhang, C., Liu, J., Tian, Q., Xu, C., Lu, H., and Ma, S. 2011a. Image classification by non-negative sparse coding, low-rank and sparse decomposition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11). Google ScholarDigital Library
- Zhang, D. Q. and Chang, S. F. 2006. A generative-discriminative hybrid method for multi-view object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06). Google ScholarDigital Library
- Zhang, J., Marszałek, M., Lazebnik, S., and Schmid, C. 2007. Local features and kernels for classification of texture and object categories: A comprehensive study. Int. J. Comput. Vis. 73, 213--238. Google ScholarDigital Library
- Zhang, J. G., Huang, K. Q., Yu, Y. N., and Tan, T. N. 2011b. Boosted local structured hog-lbp for object localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11). Google ScholarDigital Library
- Zhang, Z. Q., Cao, Y., Salvi, D., Oliver, K., Waggoner, J., and Wang, S. 2010. Free-shape subwindow search for object localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Zheng, W. S., Gong, S. G., and Xiang, T. 2009. Quantifying contextual information for object detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV'09).Google Scholar
- Zhu, L., Chen, Y., Lin, C., and Yuille, A. 2011. Max margin learning of hierarchical configural deformable templates (hcdts) for efficient object parsing and pose estimation. Int. J. Comput. Vis. 93, 1--21. Google ScholarDigital Library
- Zhu, L., Chen, Y. H., Yuille, A., and Freeman, W. 2010. Latent hierarchical structural learning for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10).Google Scholar
- Zhu, M. 2004. Recall, precision and average precision. Working paper, University of Waterloo.Google Scholar
- Zhu, Q., Yeh, M. C., Cheng, K. T., and Avidan, S. 2006. Fast human detection using a cascade of histograms of oriented gradients. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'06). Google ScholarDigital Library
- Zhu, S.-C. and Mumford, D. 2006. A stochastic grammar of images. Foundations Trends Comput. Graph. Vis. 2, 259--362. Google ScholarDigital Library