Abstract
Deformable Part Models (DPMs) play a prominent role in current object recognition research, as they rigorously model the shape variability of an object category by breaking an object into parts and modelling the relative locations of the parts. Still, inference with such models requires solving a combinatorial optimization task. In this chapter, we will see how Branch-and-Bound can be used to efficiently perform inference with such models. Instead of evaluating the classifier score exhaustively for all part locations and scales, such techniques allow us to quickly focus on promising image locations. The core problem that we will address is how to compute bounds that accommodate part deformations; this allows us to apply Branch-and-Bound to our problem. Compared with a baseline DPM implementation, we obtain exactly the same results but perform the part combination substantially faster, yielding up to tenfold speedups for single object detection, and even higher speedups for multiple objects.
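To make the idea of bounding scores under part deformations concrete, the following is a minimal one-dimensional sketch and not the chapter's implementation. It assumes a quadratic deformation cost and a single scale, and all names (`part_score`, `part_bound`, `detect`, `detect_exhaustive`) are hypothetical. The bound for an interval of root positions lets each part choose its best root independently, so it can only overestimate the true maximum. It is computed here by brute force for clarity, whereas the chapter is concerned with obtaining such bounds cheaply.

```python
import heapq


def part_score(unary, x, offset, cost):
    """Best score of one part for root position x (exhaustive over part positions).

    Assumed model: unary[xp] minus a quadratic penalty on the displacement of xp
    from the part's ideal position x + offset.
    """
    return max(u - cost * (xp - (x + offset)) ** 2 for xp, u in enumerate(unary))


def part_bound(unary, lo, hi, offset, cost):
    """Upper bound on part_score over all root positions x in [lo, hi].

    The penalty is replaced by the squared distance from xp to the interval of
    ideal positions, which is never larger than the true penalty for any x.
    """
    a, b = lo + offset, hi + offset
    best = float("-inf")
    for xp, u in enumerate(unary):
        d = max(a - xp, 0, xp - b)  # distance from xp to [a, b]
        best = max(best, u - cost * d * d)
    return best


def detect(unaries, offsets, cost, n_root):
    """Best-first Branch-and-Bound over intervals of root positions."""

    def bound(lo, hi):
        # Sum of per-part bounds: each part may pick a different root, so this
        # upper-bounds the best total score inside the interval.
        return sum(part_bound(u, lo, hi, o, cost) for u, o in zip(unaries, offsets))

    heap = [(-bound(0, n_root - 1), 0, n_root - 1)]
    while heap:
        neg_bound, lo, hi = heapq.heappop(heap)
        if lo == hi:
            # For a single position the bound equals the exact score, and it is
            # at least as large as every bound still in the queue.
            return lo, -neg_bound
        mid = (lo + hi) // 2
        for a, b in ((lo, mid), (mid + 1, hi)):
            heapq.heappush(heap, (-bound(a, b), a, b))


def detect_exhaustive(unaries, offsets, cost, n_root):
    """Baseline: score every root position and return the best one."""
    scores = [
        sum(part_score(u, x, o, cost) for u, o in zip(unaries, offsets))
        for x in range(n_root)
    ]
    best = max(range(n_root), key=scores.__getitem__)
    return best, scores[best]


# Toy data (made up for illustration): two parts on an 8-position grid.
unaries = [[0, 1, 5, 2, 0, 0, 3, 0], [1, 0, 0, 4, 0, 2, 0, 0]]
offsets = [0, 1]
print(detect(unaries, offsets, 0.5, 8))
print(detect_exhaustive(unaries, offsets, 0.5, 8))
```

Because the search always expands the interval with the highest bound and the bound is exact for single positions, the first single position removed from the queue attains the same maximum score as the exhaustive baseline. This mirrors the claim above that the accelerated method yields exactly the same results.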
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Kokkinos, I. (2016). Accelerating Deformable Part Models with Branch-and-Bound. In: Breuß, M., Bruckstein, A., Maragos, P., Wuhrer, S. (eds) Perspectives in Shape Analysis. Mathematics and Visualization. Springer, Cham. https://doi.org/10.1007/978-3-319-24726-7_12
DOI: https://doi.org/10.1007/978-3-319-24726-7_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24724-3
Online ISBN: 978-3-319-24726-7
eBook Packages: Mathematics and Statistics (R0)