Abstract
Object detection and segmentation are two challenging tasks in computer vision that are usually treated as independent steps. In this paper, we propose a framework that jointly optimizes both tasks and implicitly provides detection hypotheses together with their corresponding segmentations. Our approach can be attached to any of the available generalized Hough voting methods. We introduce Hough Regions by formulating Hough space analysis as Bayesian labeling of a random field. This formulation exploits the provided classifier responses, object center votes and low-level cues such as color consistency, which are combined into a global energy term. We further propose a greedy approach to solve this energy minimization problem, which yields a pixel-wise assignment to background or to a specific category instance. In this way we bypass the parameter-sensitive non-maximum suppression that is required in related methods. The experimental evaluation demonstrates that state-of-the-art detection and segmentation results are achieved and that our method is inherently able to handle overlapping instances and an increased range of articulations, aspect ratios and scales.
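The abstract does not state the energy term or the optimization details, so the following is only a toy sketch of the general idea of greedily turning Hough votes into a per-element labeling. It is not the method of the paper: the function name `greedy_hough_labeling`, the parameters `radius` and `min_score`, and the rule of assigning every nearby vote to the current peak are all assumptions made for illustration, and the color-consistency cue and the random-field formulation are left out entirely.

```python
from collections import defaultdict
from math import hypot


def greedy_hough_labeling(positions, offsets, weights, radius=2.0, min_score=1.0):
    """Toy greedy Hough-space labeling (hypothetical, not the paper's energy).

    Element i at positions[i] casts one vote of strength weights[i] for an
    object center at positions[i] + offsets[i]. Returns (labels, centers),
    where labels[i] is 0 for background or k >= 1 for the k-th instance.
    """
    votes = [(p[0] + o[0], p[1] + o[1]) for p, o in zip(positions, offsets)]
    labels = [0] * len(votes)
    centers = []
    active = set(range(len(votes)))
    while active:
        # Accumulate the Hough space from votes that are still unassigned.
        hough = defaultdict(float)
        for i in active:
            hough[(round(votes[i][0]), round(votes[i][1]))] += weights[i]
        peak = max(hough, key=hough.get)
        # Supporters: unassigned elements whose vote lies close to the peak.
        support = [i for i in active
                   if hypot(votes[i][0] - peak[0], votes[i][1] - peak[1]) <= radius]
        if not support or sum(weights[i] for i in support) < min_score:
            break  # remaining evidence is too weak; the rest stays background
        centers.append(peak)
        for i in support:
            labels[i] = len(centers)
        # Assigned elements no longer vote, so no separate suppression step is needed.
        active.difference_update(support)
    return labels, centers


# Two synthetic instances plus one weak background vote.
positions = [(4, 5), (5, 4), (6, 5), (5, 6), (4, 15), (5, 14), (6, 15), (0, 0)]
offsets = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 0), (0, 1), (-1, 0), (9, 9)]
weights = [1, 1, 1, 1, 1, 1, 1, 0.2]
labels, centers = greedy_hough_labeling(positions, offsets, weights)
```

In this sketch each element ends up with exactly one label, which is what removes the need for a separate non-maximum suppression pass: once an instance has claimed its supporters, their votes are removed from the Hough space before the next peak is sought.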
This work was supported by the Austrian Research Promotion Agency (FFG) under the projects CityFit (815971/14472-GLE/ROD) and MobiTrick (8258408) in the FIT-IT program and SHARE (831717) in the IV2Splus program and the Austrian Science Fund (FWF) under the projects MASA (P22299) and Advanced Learning for Tracking and Detection in Medical Workflow Analysis (I535-N23).
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Riemenschneider, H., Sternig, S., Donoser, M., Roth, P.M., Bischof, H. (2012). Hough Regions for Joining Instance Localization and Segmentation. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33712-3_19
DOI: https://doi.org/10.1007/978-3-642-33712-3_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33711-6
Online ISBN: 978-3-642-33712-3
eBook Packages: Computer Science (R0)