Abstract
We present an efficient method for learning part-based object class models from unsegmented images represented as sets of salient features. A model includes parts’ appearance, as well as location and scale relations between parts. The object class is generatively modeled using a simple Bayesian network with a central hidden node containing location and scale information, and nodes describing object parts. The model’s parameters, however, are optimized to reduce a loss function of the training error, as in discriminative methods. We show how boosting techniques can be extended to optimize the proposed relational model, with complexity linear in the number of parts and the number of features per image. This efficiency allows our method to learn relational models with many parts and features. The method has an advantage over purely generative and purely discriminative approaches for learning from sets of salient features, since generative methods often use a small number of parts and features, while discriminative methods tend to ignore geometrical relations between parts. Experimental results are described, using several benchmark data sets and three sets of newly collected data, showing the relative merits of our method in recognition and localization tasks.
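To make the linear-complexity claim concrete, the following is a minimal sketch of how an image might be scored under a star-shaped part model once a value for the central hidden node is fixed. It is not the implementation from the paper: the names `Feature`, `Part`, `part_score`, and `star_model_score` are hypothetical, the Gaussian log-likelihoods with isotropic variances are an assumption, scale is omitted for brevity, and the boosting-based parameter learning is not shown. Given the centre, each part picks its best-matching feature independently, so one evaluation costs on the order of (number of parts) × (number of features).

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Feature:
    """A salient image feature: location plus an appearance descriptor."""
    x: float
    y: float
    appearance: Tuple[float, ...]


@dataclass
class Part:
    """A model part (hypothetical parameterisation, assumed for illustration)."""
    dx: float                       # expected x offset from the object centre
    dy: float                       # expected y offset from the object centre
    appearance: Tuple[float, ...]   # mean appearance descriptor
    loc_var: float = 1.0            # isotropic location variance (assumption)
    app_var: float = 1.0            # isotropic appearance variance (assumption)


def log_gauss(sq_dist: float, var: float, dim: int) -> float:
    """Log density of an isotropic Gaussian at squared distance sq_dist."""
    return -0.5 * (sq_dist / var + dim * math.log(2.0 * math.pi * var))


def part_score(part: Part, feat: Feature, cx: float, cy: float) -> float:
    """Log-likelihood that feat implements part, given the centre (cx, cy)."""
    loc_sq = (feat.x - cx - part.dx) ** 2 + (feat.y - cy - part.dy) ** 2
    app_sq = sum((a - m) ** 2 for a, m in zip(feat.appearance, part.appearance))
    return (log_gauss(loc_sq, part.loc_var, 2)
            + log_gauss(app_sq, part.app_var, len(part.appearance)))


def star_model_score(parts: Sequence[Part], feats: Sequence[Feature],
                     cx: float, cy: float) -> Tuple[float, List[int]]:
    """Sum over parts of the best feature score; O(len(parts) * len(feats))."""
    total = 0.0
    assignment: List[int] = []
    for part in parts:
        scores = [part_score(part, f, cx, cy) for f in feats]
        best = max(range(len(feats)), key=scores.__getitem__)
        assignment.append(best)
        total += scores[best]
    return total, assignment
```

In this sketch several parts may select the same feature, since no exclusivity constraint is enforced; that simplification is what keeps each part's maximisation independent of the others.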
Cite this article
Bar-Hillel, A., Weinshall, D. Efficient Learning of Relational Object Class Models. Int J Comput Vis 77, 175–198 (2008). https://doi.org/10.1007/s11263-007-0091-7
DOI: https://doi.org/10.1007/s11263-007-0091-7