Kernelized pyramid nearest-neighbor search for object categorization

  • Original Paper
  • Machine Vision and Applications

Abstract

Nearest-neighbor-based image classification has drawn considerable attention in the past several years thanks to its simplicity and efficiency. Recently, a kernelized version of the Naive-Bayes Nearest-Neighbor approach (KNBNN) has been proposed to combine nearest-neighbor-based approaches with other bag-of-features (BoF) based kernels. However, like an orderless BoF image representation, KNBNN ignores global geometric correspondence. In this paper, our contributions are threefold. First, we present a technique to exploit global geometric correspondence in a kernelized NBNN classifier framework: we divide an image into increasingly fine sub-regions, as in the spatial pyramid matching (SPM) approach. Second, we introduce a pyramid nearest-neighbor kernel by measuring the local similarity in each pyramid window. Third, to better calibrate the outputs of each window, we fit a sigmoid function that maps its SVM outputs to posterior probabilities, and then weight the outputs of all windows. The sigmoid parameters and weight values are learned in a class-dependent and window-dependent manner. By doing so, we learn a class-specific geometric correspondence. Finally, the proposed approach is evaluated on two public datasets, Scene-15 and Caltech-101. We reach an 85.2% recognition rate on Scene-15 and 73.3% on Caltech-101 using only a single descriptor. The experimental results show that our approach significantly outperforms existing techniques.
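
The three steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of squared Euclidean distance, the fallback for empty windows, and the example parameter values are all assumptions made for illustration. The per-window SVM training step is omitted, so `scores` below merely stands in for per-window SVM decision values. The sigmoid uses the standard Platt form P = 1 / (1 + exp(A f + B)).

```python
import numpy as np

def pyramid_windows(levels):
    """Yield (level, row, col) for a pyramid with 2**level cells per side."""
    for level in range(levels):
        cells = 2 ** level
        for r in range(cells):
            for c in range(cells):
                yield level, r, c

def window_mask(xy, level, r, c):
    """Mask of points whose normalized (x, y) in [0, 1] fall in window (r, c)."""
    cells = 2 ** level
    col = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
    row = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
    return (row == r) & (col == c)

def window_nn_distances(q_desc, q_xy, c_desc, c_xy, levels=3):
    """Per-window sum of squared nearest-neighbor distances (query to class).

    Assumptions (not from the paper): a window with no query descriptors
    contributes 0, and a window with no class descriptors falls back to
    searching all class descriptors.
    """
    out = []
    for level, r, c in pyramid_windows(levels):
        q = q_desc[window_mask(q_xy, level, r, c)]
        ref = c_desc[window_mask(c_xy, level, r, c)]
        if len(q) == 0:
            out.append(0.0)
            continue
        if len(ref) == 0:
            ref = c_desc
        # Brute-force pairwise squared distances, then the minimum per query.
        d2 = ((q[:, None, :] - ref[None, :, :]) ** 2).sum(axis=2)
        out.append(float(d2.min(axis=1).sum()))
    return np.array(out)

def platt_sigmoid(score, a, b):
    """Platt-style mapping from a decision value to a probability."""
    return 1.0 / (1.0 + np.exp(a * score + b))

def combine_windows(scores, a, b, weights):
    """Weighted sum of calibrated per-window outputs for one class.

    `a`, `b`, and `weights` hold one value per window; in the paper they are
    learned per class and per window, here they are simply given.
    """
    return float(np.sum(weights * platt_sigmoid(scores, a, b)))

# Toy data: two 2-D descriptors with normalized image positions.
q_xy = np.array([[0.1, 0.1], [0.9, 0.9]])
q_desc = np.array([[0.0, 0.0], [1.0, 1.0]])
c_desc = q_desc + np.array([1.0, 0.0])  # class descriptors, shifted by 1

dists = window_nn_distances(q_desc, q_xy, c_desc, q_xy, levels=3)
n_windows = len(dists)  # 1 + 4 + 16 windows for three levels

# Placeholder calibration parameters and uniform window weights.
a = -np.ones(n_windows)
b = np.zeros(n_windows)
weights = np.full(n_windows, 1.0 / n_windows)
combined = combine_windows(np.zeros(n_windows), a, b, weights)
```

With three levels the pyramid has 1 + 4 + 16 = 21 windows, so each class needs 21 sigmoid parameter pairs and 21 weights in this sketch. The brute-force distance computation is chosen for clarity only; a practical system would likely use an approximate nearest-neighbor index.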

Acknowledgments

This work is supported by grants from the National Natural Science Foundation of China (NSFC) (No. 61075045, No. 61273256 and No. 61305033) and the Program for New Century Excellent Talents in University (NECT-10-0292).

Author information

Corresponding author

Correspondence to Hong Cheng.

Additional information

A preliminary version of this paper appeared in ICPR 2012.

About this article

Cite this article

Cheng, H., Yu, R., Liu, Z. et al. Kernelized pyramid nearest-neighbor search for object categorization. Machine Vision and Applications 25, 931–941 (2014). https://doi.org/10.1007/s00138-014-0608-3
