Abstract
We introduce a novel classification algorithm, the membership score machine (MSM), for highly nonlinear classification of small data; it is particularly well suited to highly irregular, non-globular datasets with arbitrarily many classes. Given a training dataset, the method applies within-class clustering and dimensionality reduction to extract useful geometric features for each class. For the data points in a class, the method first performs clustering and then applies principal component analysis (PCA) to each cluster to form a reliable low-dimensional geometric representation. The goal of the training stage is to extract reliable geometric features and an associated anisotropic measure, one for each cluster. At the prediction stage, the algorithm computes membership scores from these anisotropic measures with respect to each of the clusters; a test point is assigned to the class containing the cluster that yields the maximum membership score. The proposed algorithm, the MSM, turns out to be scalable and more accurate than existing algorithms, especially on small and highly irregular datasets. The main idea behind the MSM is to represent the dataset geometrically as a combination of multiple easy-to-classify clusters, each transformed into low-dimensional principal components. Numerical experiments are presented and compared with popular existing classifiers to demonstrate its superior performance.
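The pipeline sketched in the abstract (within-class clustering, per-cluster PCA, then classification by maximum membership score) can be illustrated with a minimal numpy-only prototype. This is a hypothetical sketch, not the authors' implementation: the clustering method, the number of clusters per class, and the exact form of the anisotropic membership score are not specified in the abstract, so plain Lloyd's k-means and a Mahalanobis-like score in each cluster's principal axes are assumed here.

```python
import numpy as np

def _kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means: an illustrative stand-in for the
    within-class clustering step."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

class MembershipScoreMachine:
    """Hypothetical MSM sketch: cluster each class, fit a PCA per
    cluster, and classify a test point by the cluster giving the
    largest anisotropic membership score."""

    def __init__(self, n_clusters=2, n_components=2):
        self.n_clusters, self.n_components = n_clusters, n_components

    def fit(self, X, y):
        # One entry per cluster: (class label, mean, principal axes, variances)
        self.clusters_ = []
        for label in np.unique(y):
            Xc = X[y == label]
            k = min(self.n_clusters, len(Xc))
            labels = _kmeans(Xc, k)
            for j in range(k):
                Xj = Xc[labels == j]
                if len(Xj) < 2:
                    continue
                mean = Xj.mean(axis=0)
                # PCA via eigendecomposition of the cluster covariance
                evals, evecs = np.linalg.eigh(np.cov(Xj.T))
                d = min(self.n_components, Xj.shape[1])
                axes = evecs[:, ::-1][:, :d].T            # top-d principal axes
                var = np.maximum(evals[::-1][:d], 1e-8)   # their variances (floored)
                self.clusters_.append((label, mean, axes, var))
        return self

    def _score(self, x, mean, axes, var):
        # Mahalanobis-like score in PCA coordinates; larger = stronger membership
        z = axes @ (x - mean)
        return -np.sum(z ** 2 / var)

    def predict(self, X):
        return np.array([max(self.clusters_,
                             key=lambda c: self._score(x, c[1], c[2], c[3]))[0]
                         for x in X])

# Toy usage: two well-separated 2-D classes
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (40, 2)), rng.normal(5.0, 0.1, (40, 2))])
y = np.repeat([0, 1], 40)
msm = MembershipScoreMachine().fit(X, y)
pred = msm.predict(np.array([[0.0, 0.0], [5.0, 5.0]]))
```

Because each cluster carries its own principal axes and variances, the score naturally adapts to elongated, non-globular cluster shapes, which is the geometric intuition the abstract describes.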
Acknowledgements
The work of Byungjoon Lee is partially supported by The Catholic University of Korea, Research Fund, 2020 and the National Research Foundation (NRF) of Korea under Grant NRF-2020R1A2C4002378.
Cite this article
Lee, B., Kim, H. & Kim, S. Membership score machine for highly nonlinear classification for small data. Appl Intell 53, 6511–6524 (2023). https://doi.org/10.1007/s10489-022-03652-8