Abstract
We propose a method that detects and segments multiple, partially occluded objects in images. A part hierarchy is defined for the object class. Both the segmentation and detection tasks are formulated as binary classification problems. A whole-object segmentor and several part detectors are learned by boosting weak classifiers based on local shape features. Given a new image, the part detectors are applied to obtain a number of part responses, and all edge pixels in the image that contribute positively to the part responses are extracted. A joint likelihood of multiple objects is defined based on the part detection responses and the object edges. Computing the joint likelihood includes inter-object occlusion reasoning based on the object silhouettes extracted with the whole-object segmentor. By maximizing the joint likelihood, part detection responses are grouped, merged, and assigned to multiple object hypotheses. The approach is demonstrated on the class of pedestrians. Experimental results show that our method outperforms previous methods.
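As a rough illustration of the inter-object occlusion reasoning mentioned above, the following Python sketch decides which parts of each object hypothesis should be treated as visible, given the silhouettes of the other hypotheses. It is not the authors' implementation. The depth ordering by bottom image coordinate, the rectangular part boxes, the 0.5 visibility threshold, and all function names are assumptions made for the example.

```python
def rect_mask(height, width, box):
    """Build a height x width boolean mask that is True inside box = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return [[x0 <= x < x1 and y0 <= y < y1 for x in range(width)]
            for y in range(height)]


def visible_fraction(part_box, occluder_masks):
    """Fraction of a part's box that no occluder silhouette covers."""
    x0, y0, x1, y1 = part_box
    total = (x1 - x0) * (y1 - y0)
    covered = sum(
        1
        for y in range(y0, y1)
        for x in range(x0, x1)
        if any(mask[y][x] for mask in occluder_masks)
    )
    return 1.0 - covered / total


def part_visibility(silhouettes, bottoms, part_boxes, threshold=0.5):
    """Label each part of each object hypothesis as visible or occluded.

    silhouettes: one boolean mask per object hypothesis
    bottoms:     bottom y coordinate per object; a larger value is assumed
                 to mean the object is closer to the camera (hypothetical rule)
    part_boxes:  one dict per object mapping part name -> (x0, y0, x1, y1)
    threshold:   hypothetical minimum visible fraction for a part to count
    """
    result = []
    for i, boxes in enumerate(part_boxes):
        # Only objects assumed to be closer to the camera can occlude object i.
        occluders = [silhouettes[j] for j in range(len(silhouettes))
                     if bottoms[j] > bottoms[i]]
        result.append({name: visible_fraction(box, occluders) >= threshold
                       for name, box in boxes.items()})
    return result


# Two overlapping hypotheses on a 10 x 10 image; B stands in front of A.
sil_a = rect_mask(10, 10, (2, 0, 6, 8))
sil_b = rect_mask(10, 10, (3, 4, 8, 10))
parts = [
    {"torso": (2, 0, 6, 4), "legs": (2, 4, 6, 8)},
    {"torso": (3, 4, 8, 7), "legs": (3, 7, 8, 10)},
]
visibility = part_visibility([sil_a, sil_b], [8, 10], parts)
print(visibility)
```

In a likelihood of the kind the abstract describes, a part flagged as occluded would plausibly not be penalized for lacking a detection response, whereas a visible part without a response would count against the hypothesis. How the paper actually weights these cases is not stated in the abstract.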
Cite this article
Wu, B., Nevatia, R. Detection and Segmentation of Multiple, Partially Occluded Objects by Grouping, Merging, Assigning Part Detection Responses. Int J Comput Vis 82, 185–204 (2009). https://doi.org/10.1007/s11263-008-0194-9