KNN-based maximum margin and minimum volume hyper-sphere machine for imbalanced data classification

Xu, Yitian; Zhang, Yuqun; Zhao, Jiang; Yang, Zhiji; Pan, Xianli

doi:10.1007/s13042-017-0720-6

Original Article
Published: 29 August 2017

International Journal of Machine Learning and Cybernetics
Volume 10, pages 357–368 (2019)

Yitian Xu¹, Yuqun Zhang¹, Jiang Zhao¹, Zhiji Yang¹ & Xianli Pan¹

657 Accesses
13 Citations

Abstract

Imbalanced data classification arises frequently in real-world applications. In this paper, a novel k-nearest neighbor (KNN)-based maximum margin and minimum volume hyper-sphere machine (KNN-M³VHM) is presented for imbalanced data classification. The basic idea is to construct two hyper-spheres with different centres and radii: the first contains the majority examples and the second covers the minority examples. When constructing the first hyper-sphere, we remove some redundant majority samples using a KNN-based strategy to balance the two classes. Meanwhile, we maximize the margin between the two hyper-spheres and minimize their volumes, which yields a tight boundary around each class. Similar to the twin hyper-sphere support vector machine (THSVM), KNN-M³VHM solves two related SVM-type problems and avoids matrix inversion when solving the convex optimization problems. Because KNN-M³VHM considers both the within-class information and the between-class margin, it achieves better performance than other state-of-the-art algorithms. Experimental results on twenty-five datasets validate the significant advantages of the proposed algorithm.
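
The two ingredients named in the abstract, KNN-based removal of redundant majority samples and classification with two hyper-spheres, can be illustrated with a rough Python sketch. It is not the authors' implementation. The abstract does not state the pruning criterion, so the sketch assumes that a majority sample is kept only if it is among the k nearest majority neighbours of some minority sample. It also assumes a relative-distance decision rule (distance to each centre divided by that sphere's radius). The function names `knn_undersample_majority` and `predict_two_spheres` are hypothetical, and the centres and radii, which in the paper would come from the two SVM-type problems, are simply passed in.

```python
import numpy as np


def knn_undersample_majority(X, y, k=5, majority_label=0):
    """Keep all minority samples plus the majority samples near them.

    Assumption (not from the paper): a majority sample is retained only if
    it is one of the k nearest majority neighbours of some minority sample.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    maj_idx = np.flatnonzero(y == majority_label)
    min_idx = np.flatnonzero(y != majority_label)
    k = min(k, maj_idx.size)

    # Squared Euclidean distances, minority rows against majority columns.
    diff = X[min_idx][:, None, :] - X[maj_idx][None, :, :]
    d2 = (diff ** 2).sum(axis=2)

    # Column positions of each minority sample's k nearest majority samples.
    nearest = np.argsort(d2, axis=1)[:, :k]
    kept_maj = maj_idx[np.unique(nearest)]

    keep = np.sort(np.concatenate([kept_maj, min_idx]))
    return X[keep], y[keep]


def predict_two_spheres(x, center_maj, radius_maj, center_min, radius_min,
                        majority_label=0, minority_label=1):
    """Assign x to the sphere it is relatively closer to.

    Assumption: the score is distance to the centre divided by the radius;
    the centres and radii are supplied by the caller, not learned here.
    """
    x = np.asarray(x, dtype=float)
    rel_maj = np.linalg.norm(x - center_maj) / radius_maj
    rel_min = np.linalg.norm(x - center_min) / radius_min
    return majority_label if rel_maj <= rel_min else minority_label
```

With this criterion the number of retained majority samples is at most k times the number of minority samples, which is one simple way to bring the two class sizes closer together before fitting the majority hyper-sphere.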

Acknowledgements

The authors gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation. This work was supported in part by the Beijing Natural Science Foundation (No. 4172035) and the National Natural Science Foundation of China (No. 11671010).

Author information

Authors and Affiliations

College of Science, China Agricultural University, Beijing, 100083, China
Yitian Xu, Yuqun Zhang, Jiang Zhao, Zhiji Yang & Xianli Pan

Corresponding author

Correspondence to Yitian Xu.

About this article

Cite this article

Xu, Y., Zhang, Y., Zhao, J. et al. KNN-based maximum margin and minimum volume hyper-sphere machine for imbalanced data classification. Int. J. Mach. Learn. & Cyber. 10, 357–368 (2019). https://doi.org/10.1007/s13042-017-0720-6

Received: 21 February 2016
Accepted: 24 August 2017
Published: 29 August 2017
Issue Date: 04 February 2019
DOI: https://doi.org/10.1007/s13042-017-0720-6
