SN-SVM: a sparse nonparametric support vector machine classifier

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

This paper introduces a novel sparse nonparametric support vector machine classifier (SN-SVM), which combines data distribution information from two state-of-the-art kernel-based classifiers, namely, the kernel support vector machine (KSVM) and the kernel nonparametric discriminant (KND). The proposed model incorporates some near-global variations of the data provided by the KND and, hence, may be viewed as an extension of the KSVM. Similarly, since the support vectors improve the choice of \(\kappa\)-nearest neighbors (\(\kappa\)-NNs), it can also serve as an extension of the KND. The proposed model can handle both heteroscedastic and non-normal data while avoiding the small sample size problem. The model is a convex quadratic optimization problem with a single global optimum, so it can be solved easily and efficiently by standard numerical methods. It can also be reduced to the classical KSVM model, so existing SVM programs can be used for easy implementation. Through a Bayesian interpretation with a Gaussian prior, we show that our method yields a sparse solution, assigning non-zero weights to only a fraction of the training samples. This sparsity can be exploited by existing sparse classification algorithms for better computational efficiency. Experimental results on real-world datasets and face recognition applications show that the proposed SN-SVM improves classification accuracy over contemporary classifiers and provides a sparser solution than the KSVM.
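The reduction to the classical KSVM is what makes the model practical to deploy: the classical soft-margin dual, \(\max_{\alpha}\sum_{i}\alpha_i - \frac{1}{2}\sum_{i,j}\alpha_i\alpha_j y_i y_j K(x_i, x_j)\) subject to \(0 \le \alpha_i \le C\) and \(\sum_i \alpha_i y_i = 0\), is a convex quadratic program that any stock SVM solver handles, so only the Gram matrix needs to carry the extra distribution information. The minimal sketch below (assuming scikit-learn) illustrates that general recipe only; the \(\kappa\)-NN-based kernel term is a hypothetical stand-in, not the paper's actual KND formulation.

    # Minimal sketch, assuming scikit-learn. NOT the authors' formulation:
    # it only illustrates reusing an off-the-shelf SVM solver on a modified
    # Gram matrix, and reading off the sparsity from the support vectors.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.metrics.pairwise import rbf_kernel
    from sklearn.neighbors import NearestNeighbors
    from sklearn.svm import SVC

    def knd_like_gram(X, y, gamma=0.5, k=5, mu=0.1):
        """RBF Gram matrix plus an illustrative kNN class-locality term."""
        K = rbf_kernel(X, gamma=gamma)
        # Fraction of each point's k nearest neighbours that share its label:
        # a crude, hypothetical proxy for nonparametric discriminant structure.
        nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
        _, idx = nbrs.kneighbors(X)            # idx[:, 0] is the point itself
        agree = (y[idx[:, 1:]] == y[:, None]).mean(axis=1)
        a = agree.reshape(-1, 1)
        return K + mu * (a @ a.T)              # adding a PSD term keeps K a valid kernel

    X, y = make_classification(n_samples=200, n_features=10, random_state=0)
    clf = SVC(C=1.0, kernel="precomputed").fit(knd_like_gram(X, y), y)
    print(f"support vectors: {len(clf.support_)} / {len(y)}")   # solution sparsity

Because the rank-one perturbation is positive semidefinite, the modified matrix remains a valid kernel, and the support-vector count printed at the end is the kind of sparsity the abstract refers to: only the samples with non-zero dual weights enter the decision function.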



Author information

Corresponding author

Correspondence to Naimul Mefraz Khan.


About this article

Cite this article

Khan, N.M., Ksantini, R., Ahmad, I.S. et al. SN-SVM: a sparse nonparametric support vector machine classifier. SIViP 8, 1625–1637 (2014). https://doi.org/10.1007/s11760-012-0404-3

