Visual Object Detection Using Cascades of Binary and One-Class Classifiers

Cevikalp, Hakan; Triggs, Bill

doi:10.1007/s11263-016-0986-2

Published: 11 January 2017

Volume 123, pages 334–349 (2017)

International Journal of Computer Vision

Hakan Cevikalp¹ & Bill Triggs²

Abstract

We describe an efficient approach to visual object detection that uses short cascades of asymmetric ‘one class’ classifiers to quickly reject negatives (windows not centered on an object of the desired class) within a sliding window framework. Current detectors typically use binary discriminants such as Support Vector Machines or Boosting to implement each stage of the cascade. These treat the positive and negative classes symmetrically. We argue that this is suboptimal because object detectors typically see a great many negative windows with extremely diverse contents and only a few positive ones with comparatively coherent contents. We show that asymmetric representations that focus on tightly modeling the extent of the rare, coherent positive class can lead to simpler classifiers and faster rejection. Our cascades use asymmetric classifiers based on simple convex models to progressively tighten the bound on the positive class. They typically start with a conventional linear SVM for initial pruning, followed by a cascade of linear distance-to-hyperplane and interior-of-hypersphere classifiers and finishing with a kernelized hypersphere classifier. We show that the resulting detectors have competitive performance on the Labeled Faces in the Wild dataset and state-of-the-art performance on the FDDB face detection, ESOGU face detection and INRIA Person datasets. The results on the PASCAL VOC 2007 dataset are also respectable given that they use neither object parts nor context. The one-class formulations provide significant reductions in classifier complexity relative to the corresponding two-class ones, making them suitable for real-world applications.
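
The cascade structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the stage functions, their names, and every parameter value below (`w`, `b`, `delta`, `center`, `radius`, `gamma`, `alpha`, the toy windows) are assumptions chosen by hand for illustration, whereas in the paper the models are learned from training data. The sketch only shows the control flow: each stage tests the windows that survived the previous one, so cheap linear tests discard most negatives before the more expensive kernelized hypersphere test runs.

```python
import numpy as np

def linear_stage(w, b):
    """Accept windows on the positive side of a hyperplane (linear SVM-style pruning)."""
    return lambda X: X @ w + b >= 0.0

def hyperplane_distance_stage(w, b, delta):
    """Accept windows lying within distance delta of the hyperplane w.x + b = 0."""
    norm = np.linalg.norm(w)
    return lambda X: np.abs(X @ w + b) / norm <= delta

def hypersphere_stage(center, radius):
    """Accept windows inside a hypersphere in the input feature space."""
    return lambda X: np.sum((X - center) ** 2, axis=1) <= radius ** 2

def rbf(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          - 2.0 * A @ B.T
          + np.sum(B ** 2, axis=1)[None, :])
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_hypersphere_stage(S, alpha, radius, gamma):
    """Accept windows inside a hypersphere in RBF feature space.

    The center is sum_i alpha_i * phi(S_i); the squared distance to it is
    k(x, x) - 2 * sum_i alpha_i k(x, S_i) + alpha^T K(S, S) alpha, with k(x, x) = 1.
    """
    const = alpha @ rbf(S, S, gamma) @ alpha
    def accept(X):
        d2 = 1.0 - 2.0 * rbf(X, S, gamma) @ alpha + const
        return d2 <= radius ** 2
    return accept

def run_cascade(X, stages):
    """Return indices of the windows (rows of X) that pass every stage."""
    alive = np.arange(X.shape[0])
    for stage in stages:
        if alive.size == 0:
            break  # everything already rejected; skip the remaining stages
        alive = alive[stage(X[alive])]
    return alive

# Toy 2-D "window features" (hypothetical values).
X = np.array([[0.0, 0.0], [1.0, 1.0], [1.2, 0.9], [5.0, 5.0], [-3.0, 2.0]])

stages = [
    linear_stage(w=np.array([1.0, 1.0]), b=-1.0),
    hyperplane_distance_stage(w=np.array([1.0, -1.0]), b=0.0, delta=0.5),
    hypersphere_stage(center=np.array([1.0, 1.0]), radius=0.5),
    kernel_hypersphere_stage(S=np.array([[1.0, 1.0]]), alpha=np.array([1.0]),
                             radius=0.5, gamma=1.0),
]

survivors = run_cascade(X, stages)
print(survivors)  # [1 2]
```

In this toy run the linear stage rejects windows 0 and 4, the input-space hypersphere rejects window 3, and only windows 1 and 2 reach and pass the kernelized stage. Ordering the stages from cheapest to most expensive is what lets a cascade of this kind spend most of its computation on the few windows that resemble the positive class.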

Notes

The name “one class” is conventional. It emphasizes the origin of these methods in density modeling and the predominant role of the positive class but it is something of a misnomer in that negative examples usually can be, often are, and in some formulations must be included during training.
http://cmp.felk.cvut.cz/~xfrancv/ocas/html/index.html.
http://www.csie.ntu.edu.tw/~cjlin/libsvm.
It would be possible to learn \(\varDelta \) by including a \((\text {weight})\cdot \,\varDelta \) term in the cost function but we have not done this here owing to a limitation of the QP solver that we used. Instead we set \(\varDelta \) directly using cross validation. (Cross validation might be needed in any case, to set the weight).
http://cmp.felk.cvut.cz.
The code is available from http://mlcv.ogu.edu.tr/softwares.html.
http://mlcvdb.ogu.edu.tr/facedetection.html.
http://picasa.google.com.
Typically only a few hundred to about a thousand per class, partitioned among 3 pairs of roots.
http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/results/index.shtml.
http://www.cs.berkeley.edu/~rbg/latent/index.html.

Acknowledgements

This work was supported in part by the Scientific and Technological Research Council of Turkey (TÜBİTAK) under Grant Number EEEAG-109E279.

Author information

Authors and Affiliations

Electrical and Electronics Engineering Department, Eskisehir Osmangazi University, Eskisehir, Turkey
Hakan Cevikalp
Laboratoire Jean Kuntzmann, Grenoble, France
Bill Triggs

Corresponding author

Correspondence to Hakan Cevikalp.

Additional information

Communicated by Takayuki Okatani.

About this article

Cite this article

Cevikalp, H., Triggs, B. Visual Object Detection Using Cascades of Binary and One-Class Classifiers. Int J Comput Vis 123, 334–349 (2017). https://doi.org/10.1007/s11263-016-0986-2

Received: 08 November 2014
Accepted: 23 December 2016
Published: 11 January 2017
Issue Date: July 2017
DOI: https://doi.org/10.1007/s11263-016-0986-2
