Abstract
This paper introduces a novel sparse nonparametric support vector machine classifier (SN-SVM) that combines data distribution information from two state-of-the-art kernel-based classifiers: the kernel support vector machine (KSVM) and the kernel nonparametric discriminant (KND). The proposed model incorporates near-global variations of the data provided by the KND and can therefore be viewed as an extension of the KSVM. Conversely, because the support vectors improve the choice of \(\kappa\)-nearest neighbors (\(\kappa\)-NNs), it can also serve as an extension of the KND. The model handles both heteroscedastic and non-normal data while avoiding the small sample size problem. It is formulated as a convex quadratic optimization problem with a single global optimum, so it can be solved efficiently with numerical methods. It can also be reduced to the classical KSVM model, so existing SVM programs can be used for implementation. Through a Bayesian interpretation with a Gaussian prior, we show that our method provides a sparse solution by assigning non-zero weights to only a fraction of the training samples. Existing sparse classification algorithms can exploit this sparsity for better computational efficiency. Experimental results on real-world datasets and face recognition applications show that the proposed SN-SVM model improves classification accuracy over contemporary classifiers and yields a sparser solution than the KSVM.
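
For orientation, the classical soft-margin KSVM to which the abstract says the model can be reduced is commonly trained through the dual problem below. This is the standard textbook form, not the SN-SVM objective, which the abstract does not state. The symbols \(n\) (number of training samples), \(\alpha_i\) (dual weights), \(y_i \in \{-1, +1\}\) (labels), \(k\) (kernel function), \(C\) (regularization constant) and \(b\) (bias) are notation assumed here for illustration.

\[
\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, k(x_i, x_j)
\quad \text{subject to} \quad 0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0 .
\]

\[
f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{n} \alpha_i y_i \, k(x_i, x) + b \right)
\]

The objective is concave in \(\alpha\) when \(k\) is a positive semidefinite kernel, so the problem is a convex quadratic program. Only samples with \(\alpha_i > 0\) (the support vectors) contribute to \(f\), which is the sense in which such a solution is sparse.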


Cite this article
Khan, N.M., Ksantini, R., Ahmad, I.S. et al. SN-SVM: a sparse nonparametric support vector machine classifier. SIViP 8, 1625–1637 (2014). https://doi.org/10.1007/s11760-012-0404-3