Skeleton Search: Category-Specific Object Recognition and Segmentation Using a Skeletal Shape Model

International Journal of Computer Vision

Abstract

We describe a top-down object detection and segmentation approach that uses a skeleton-based shape model and works directly on real images. The approach has three components. First, we propose a fragment-based generative model for shape that is based on the shock graph and has minimal dependency among its shape fragments. The model can generate a wide variety of shapes as instances of a given object category. Second, we develop a progressive selection mechanism to search among the generated shapes for the category instances that are present in the image. The search begins with a large pool of candidates identified by a dynamic programming (DP) algorithm and progressively reduces the pool by applying a series of criteria, namely a local minimum criterion, the extent of shape overlap, and thresholding of the objective function, to select the final object candidates. Third, we propose the Partitioned Chamfer Matching (PCM) measure to capture the support of image edges for a hypothesized shape. This measure overcomes the shortcomings of Oriented Chamfer Matching and is robust against spurious edges, missing edges, and accidental alignment between the image edges and the shape boundary contour. We have evaluated our approach on the ETHZ dataset and found it to perform well in both object detection and object segmentation tasks.
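
For readers unfamiliar with chamfer matching, the following minimal sketch may help make concrete what "support of image edges for a hypothesized shape" means. It implements only the plain, unoriented chamfer distance, that is, the mean distance from each shape boundary point to its nearest image edge point. It is not the PCM measure proposed in the article, whose partitioning scheme is not specified in this abstract. The function name `chamfer_distance` and the toy point sets are assumptions made purely for illustration.

```python
import math


def chamfer_distance(shape_points, edge_points):
    """Plain (unoriented) chamfer distance.

    Returns the mean Euclidean distance from each point on the
    hypothesized shape boundary to its nearest image edge point.
    Lower values mean stronger edge support for the shape.
    This is an illustrative baseline, NOT the article's PCM measure.
    """
    if not shape_points or not edge_points:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for sx, sy in shape_points:
        # Brute-force nearest-edge search, kept simple for clarity.
        total += min(math.hypot(sx - ex, sy - ey) for ex, ey in edge_points)
    return total / len(shape_points)


# Hypothetical toy data: a horizontal run of edge pixels.
edges = [(0, 0), (1, 0), (2, 0), (3, 0)]
aligned = [(0, 0), (1, 0), (2, 0)]   # shape lying on the edges
shifted = [(0, 2), (1, 2), (2, 2)]   # same shape moved two pixels away

score_aligned = chamfer_distance(aligned, edges)
score_shifted = chamfer_distance(shifted, edges)
```

The aligned shape scores lower than the shifted one. Because the score is an average over boundary points, a few spurious or missing edges can change it noticeably, which is the kind of weakness the abstract says PCM is designed to address. Practical implementations usually precompute a distance transform of the edge map instead of searching for the nearest edge point by brute force.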



Author information

Corresponding author

Correspondence to Nhon H. Trinh.

Additional information

The authors gratefully acknowledge the support of US National Science Foundation Grant NSF 0957045.

About this article

Cite this article

Trinh, N.H., Kimia, B.B. Skeleton Search: Category-Specific Object Recognition and Segmentation Using a Skeletal Shape Model. Int J Comput Vis 94, 215–240 (2011). https://doi.org/10.1007/s11263-010-0412-0

  • DOI: https://doi.org/10.1007/s11263-010-0412-0
