Abstract
The Support Vector Machine (SVM) has achieved strong classification performance. However, because it relies only on local information (the support vectors), it is sensitive to directions with large data spread. Nonparametric Discriminant Analysis (NDA), on the other hand, improves on the more general Linear Discriminant Analysis (LDA) by relaxing LDA's normality assumption. Furthermore, NDA incorporates partially global information to detect the dominant normal directions to the decision surface, which represent the true data spread. However, NDA depends on the choice of the κ-nearest neighbors (κ-NNs) on the decision boundary. This paper introduces a novel Combined SVM and NDA (CSVMNDA) model that controls the spread of the data while maximizing a relative margin separating the data classes. The model can be viewed as an improvement of SVM that incorporates the data-spread information carried by the dominant normal directions to the decision boundary. It can equally be viewed as an extension of NDA in which the support vectors improve the choice of the κ-NNs on the decision boundary by incorporating local information. Because it extends both SVM and NDA, the model can handle heteroscedastic and non-normal data, and it avoids the small sample size problem. Notably, the proposed improvements require only a rigorous yet simple combination of the NDA and SVM objective functions, and they preserve the computational efficiency of SVM. Optimizing the CSVMNDA objective function yields significant performance gains on real-world problems. In particular, experiments on face recognition clearly show the superiority of CSVMNDA over other state-of-the-art classification methods, especially SVM and NDA.
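To give a concrete picture of the kind of combination described above, the sketch below shows one plausible form such a joint objective could take. It is an illustration only, not the paper's exact formulation: the nonparametric scatter matrix \(\mathbf{S}\) (assumed here to encode the data spread detected by NDA from the κ-NN differences) and the trade-off parameters \(\lambda\) and \(C\) are assumptions introduced for the example.

\[
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}}\;
\frac{1}{2}\,\mathbf{w}^{\top}\!\left(\mathbf{I} + \lambda\,\mathbf{S}\right)\mathbf{w}
\;+\; C\sum_{i=1}^{n}\xi_{i}
\qquad\text{subject to}\qquad
y_{i}\!\left(\mathbf{w}^{\top}\mathbf{x}_{i} + b\right) \ge 1 - \xi_{i},
\quad \xi_{i} \ge 0,\; i = 1,\dots,n.
\]

With \(\lambda = 0\) this reduces to the standard soft-margin SVM; the additional \(\mathbf{w}^{\top}\mathbf{S}\mathbf{w}\) term penalizes weight components along directions of large spread, which is one simple way to trade the SVM margin against the NDA spread information while keeping a quadratic program of the same form, consistent with the claim that the combination preserves the computational efficiency of SVM.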




Notes
The data sets can be obtained from http://www.first.gmd.de/raetsch/