
Membership score machine for highly nonlinear classification for small data

Published in: Applied Intelligence

Abstract

We introduce a novel classification algorithm, the membership score machine (MSM), for highly nonlinear classification of small data; it is particularly well suited to highly irregular, non-globular datasets with arbitrarily many classes. Given a training dataset, the method applies within-class clustering and dimensionality reduction to extract useful geometric features for each class. For the data points in a class, it first performs clustering and then applies principal component analysis (PCA) to each cluster, forming a reliable low-dimensional geometric representation. The goal of the training stage is to extract reliable geometric features and an associated anisotropic measure for each cluster. At the prediction stage, the method computes membership scores from the anisotropic measures with respect to each cluster; a test point is assigned to the class containing the cluster that yields the maximum membership score. The MSM proves scalable and more accurate than existing algorithms, especially on small and highly irregular datasets. The main idea behind the MSM is to represent the dataset geometrically as a combination of multiple easy-to-classify clusters, each transformed into low-dimensional principal components. Numerical experiments comparing the MSM with popular existing classifiers demonstrate its superior performance.
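The training-and-prediction pipeline described above (per-class clustering, per-cluster PCA, anisotropic membership scores, argmax classification) can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the class name `MSMSketch` and the variance-weighted (Mahalanobis-style) score used here are assumptions, and the paper's exact anisotropic measure may differ.

```python
# A minimal sketch of the MSM idea: cluster within each class, fit PCA
# per cluster, and classify by the maximum anisotropic membership score.
# The variance-weighted score below is an assumed stand-in for the
# paper's anisotropic measure.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


class MSMSketch:
    def __init__(self, n_clusters=2, n_components=2):
        self.n_clusters = n_clusters
        self.n_components = n_components
        self.clusters = []  # (class label, cluster mean, PCA axes, variances)

    def fit(self, X, y):
        for label in np.unique(y):
            Xc = X[y == label]
            # Within-class clustering.
            km = KMeans(n_clusters=self.n_clusters, n_init=10,
                        random_state=0).fit(Xc)
            for k in range(self.n_clusters):
                Xk = Xc[km.labels_ == k]
                # PCA per cluster gives a low-dimensional geometric
                # representation and per-axis variances (the anisotropy).
                ncomp = min(self.n_components, len(Xk), X.shape[1])
                pca = PCA(n_components=ncomp).fit(Xk)
                self.clusters.append((label, pca.mean_, pca.components_,
                                      pca.explained_variance_ + 1e-8))
        return self

    def predict(self, X):
        preds = []
        for x in X:
            best_label, best_score = None, -np.inf
            for label, mean, comps, var in self.clusters:
                z = comps @ (x - mean)       # project onto principal axes
                score = -np.sum(z**2 / var)  # variance-weighted (anisotropic)
                if score > best_score:
                    best_label, best_score = label, score
            preds.append(best_label)
        return np.array(preds)
```

A test point is scored against every cluster of every class, and the class owning the highest-scoring cluster wins, which is how a single class made of several disjoint, non-globular blobs can still be classified correctly.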



Acknowledgements

The work of Byungjoon Lee is partially supported by the Catholic University of Korea Research Fund, 2020, and by the National Research Foundation (NRF) of Korea under Grant NRF-2020R1A2C4002378.

Author information

Corresponding author

Correspondence to Byungjoon Lee.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Lee, B., Kim, H. & Kim, S. Membership score machine for highly nonlinear classification for small data. Appl Intell 53, 6511–6524 (2023). https://doi.org/10.1007/s10489-022-03652-8
