Skeleton Search: Category-Specific Object Recognition and Segmentation Using a Skeletal Shape Model

Trinh, Nhon H.; Kimia, Benjamin B.

doi:10.1007/s11263-010-0412-0

Published: 22 January 2011

Volume 94, pages 215–240 (2011)
International Journal of Computer Vision

Abstract

We describe a top-down object detection and segmentation approach that uses a skeleton-based shape model and works directly on real images. The approach is based on three components. First, we propose a fragment-based generative model for shape that is based on the shock graph and has minimal dependency among its shape fragments. The model is capable of generating a wide variation of shapes as instances of a given object category. Second, we develop a progressive selection mechanism to search among the generated shapes for the category instances that are present in the image. The search begins with a large pool of candidates identified by a dynamic programming (DP) algorithm and progressively reduces it in size by applying a series of criteria, namely a local minimum criterion, the extent of shape overlap, and thresholding of the objective function, to select the final object candidates. Third, we propose the Partitioned Chamfer Matching (PCM) measure to capture the support of image edges for a hypothesized shape. This measure overcomes the shortcomings of Oriented Chamfer Matching and is robust against spurious edges, missing edges, and accidental alignment between the image edges and the shape boundary contour. We have evaluated our approach on the ETHZ dataset and found it to perform well in both object detection and object segmentation tasks.
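
As background for the edge-support measure mentioned in the abstract, the following is a minimal sketch of plain chamfer matching, the classical baseline that oriented and partitioned variants build on. It is not the PCM measure of the paper, whose definition is not given in the abstract. The function name `chamfer_distance` and the toy point sets are hypothetical and chosen only for illustration.

```python
import numpy as np


def chamfer_distance(template_pts, edge_pts):
    """Mean distance from each template point to its nearest image edge point.

    Plain (unoriented) chamfer matching; an illustrative baseline only,
    not the Partitioned Chamfer Matching measure from the paper.
    """
    t = np.asarray(template_pts, dtype=float)  # shape (N, 2)
    e = np.asarray(edge_pts, dtype=float)      # shape (M, 2)
    # Pairwise Euclidean distances between template and edge points, shape (N, M).
    d = np.linalg.norm(t[:, None, :] - e[None, :, :], axis=2)
    # For every template point keep the nearest edge, then average.
    return float(d.min(axis=1).mean())


# Hypothetical toy data: the corners of a square used as the shape boundary.
square = [(0, 0), (0, 4), (4, 4), (4, 0)]
# Edge map containing the square plus one spurious edge point.
edges_good = square + [(10, 10)]
# Edge map in which the square is shifted 3 pixels along the first axis.
edges_shifted = [(x + 3, y) for x, y in square]

score_good = chamfer_distance(square, edges_good)
score_shifted = chamfer_distance(square, edges_shifted)
print(score_good, score_shifted)
```

A lower score means stronger edge support for the hypothesized shape. In this sketch the spurious edge in `edges_good` does not affect the score because distances are measured from the template to the edges. An edge that happens to lie near the boundary by accident would still lower the score, which is the kind of weakness the abstract says PCM is designed to address.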

Author information

Authors and Affiliations

SRI International Sarnoff, 201 Washington Rd, Princeton, NJ, 08540, USA
Nhon H. Trinh
Brown University, Box D, Providence, RI, 02912, USA
Benjamin B. Kimia

Corresponding author

Correspondence to Nhon H. Trinh.

Additional information

The authors gratefully acknowledge the support of US National Science Foundation Grant NSF 0957045.

About this article

Cite this article

Trinh, N.H., Kimia, B.B. Skeleton Search: Category-Specific Object Recognition and Segmentation Using a Skeletal Shape Model. Int J Comput Vis 94, 215–240 (2011). https://doi.org/10.1007/s11263-010-0412-0

Received: 04 October 2009
Accepted: 02 December 2010
Published: 22 January 2011
Issue Date: September 2011
DOI: https://doi.org/10.1007/s11263-010-0412-0
