Abstract
Skeletonization algorithms typically decompose an object’s silhouette into a set of symmetric parts, offering a powerful representation for shape categorization. However, having access to an object’s silhouette assumes correct figure-ground segmentation, leading to a disconnect with the mainstream categorization community, which attempts to recognize objects from cluttered images. In this paper, we present a novel approach to recovering and grouping the symmetric parts of an object from a cluttered scene. We begin by using a multiresolution superpixel segmentation to generate medial point hypotheses, and use a learned affinity function to perceptually group nearby medial points likely to belong to the same medial branch. In the next stage, we learn higher granularity affinity functions to group the resulting medial branches likely to belong to the same object. The resulting framework yields a skeletal approximation that is free of many of the instabilities that occur with traditional skeletons. More importantly, it does not require a closed contour, enabling the application of skeleton-based categorization systems to more realistic imagery.
Notes
Both the shape and the appearance affinities, as well as the final affinity \(A_s\), were trained with a regularization parameter of \(0.5\) on the L1-norm of the logistic coefficients.
All the logistic regressors for part affinities were trained with a regularization parameter of 0.1 on the L1-norm of the logistic coefficients.
The dataset can be downloaded from http://www.cs.toronto.edu/~babalex/horse_parts_dataset.tgz.
Supplementary material (http://www.cs.toronto.edu/~babalex/symmetry_supplementary.tgz) contains additional examples.
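The L1-regularized logistic affinities mentioned in the notes above can be illustrated with a minimal sketch. It is not the training code used in the paper: the solver (proximal gradient descent), the function names, the toy features, and the reading of the "regularization parameter" as a weight `lam` on the L1 penalty of a mean log-loss are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of t * |w|: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_l1_logistic(X, y, lam, step=0.5, iters=2000):
    """Minimize mean log-loss + lam * ||w||_1 by proximal gradient descent.

    X is a list of feature vectors (one per candidate pair), y holds 0/1
    labels (1 = same group). The bias b is left unpenalized (an assumption).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        # Gradient step on the smooth loss, then shrinkage for the L1 term.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

def affinity(w, b, x):
    # Learned affinity in (0, 1) for one pair's feature vector x.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

if __name__ == "__main__":
    # Hypothetical toy data: feature 0 is informative, feature 1 is constant.
    X = [[0.1, 0.5], [0.2, 0.5], [0.8, 0.5], [0.9, 0.5]]
    y = [0, 0, 1, 1]
    w, b = fit_l1_logistic(X, y, lam=0.01)
    print(w, b, affinity(w, b, [0.9, 0.5]))
```

The L1 penalty drives coefficients of uninformative features to exactly zero, which is the usual reason for choosing it over an L2 penalty. The value of `lam` in the demo is illustrative only, since the scale of the paper's parameter relative to this objective is not stated.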
Acknowledgments
We thank David Fleet, Allan Jepson, and James Elder for providing valuable advice as members of the thesis committee. We also thank Yuri Boykov and Vladimir Kolmogorov for providing their parametric maxflow implementation. This research was sponsored in part by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-10-2-0060. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either express or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes, notwithstanding any copyright notation herein. This work was also supported by the European Commission under a Marie Curie Excellence Grant MCEXT-025481 (Cristian Sminchisescu), CNCSIS-UEFISCU under project number PN II-RU-RC-2/2009 (Cristian Sminchisescu), NSERC (Alex Levinshtein, Sven Dickinson), and MITACS (Alex Levinshtein).
Cite this article
Levinshtein, A., Sminchisescu, C. & Dickinson, S. Multiscale Symmetric Part Detection and Grouping. Int J Comput Vis 104, 117–134 (2013). https://doi.org/10.1007/s11263-013-0614-3
DOI: https://doi.org/10.1007/s11263-013-0614-3