Abstract
In this paper, we propose a novel clustering method with synchronized feature selection, called iterative tighter nonparallel hyperplane support vector clustering with simultaneous feature selection (IT-NHSVC-SFS). An iterative (alternating) optimization strategy for clustering is applied to a learning model with twin hyperplanes, into which two types of regularizers, the Euclidean norm and the infinity norm, are introduced to improve clustering generalization and to coordinate feature selection. The infinity-norm regularizer performs an implicit feature elimination process that reduces the clustering noise caused by irrelevant features, thus safeguarding clustering accuracy. Meanwhile, because the formulation of the proposed model embodies the large-margin principle, good generalization can also be ensured. Unlike the twin support vector machine and its variants, the nonparallel hyperplane SVM (NHSVM) is chosen as the baseline model, so only a single quadratic programming problem needs to be solved for the optimal twin hyperplanes, which makes it convenient to design a synchronized feature selection process over the two hyperplanes. Additionally, two extra groups of equality constraints are enforced on the original constraint set of NHSVM, so the inversion of two large matrices is avoided and the computational complexity is reduced. Furthermore, the hinge loss function of NHSVM is replaced by the Laplacian loss measure to prevent premature convergence. Numerical experiments are performed on benchmark datasets to investigate the validity of the proposed algorithm. The experimental results indicate that IT-NHSVC-SFS outperforms other existing clustering methods mainly in terms of clustering accuracy.
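To make the iterative (alternating) optimization idea concrete, the sketch below shows a heavily simplified two-plane clustering loop: fit one hyperplane per cluster, reassign each point to its nearest hyperplane, and repeat until the partition stabilizes. This is an illustrative stand-in, not the paper's algorithm: it uses plain total-least-squares plane fits in place of the regularized quadratic program, and omits the infinity-norm feature selection and the Laplacian loss. All function names are hypothetical.

```python
import numpy as np

def fit_plane(X):
    """Total-least-squares plane fit: minimize sum |x.w + b|^2 s.t. ||w|| = 1.
    (Simplified stand-in for the paper's regularized QP.)
    w is the right singular vector of the centered data with the smallest
    singular value; b places the plane through the centroid."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    w = Vt[-1]
    return w, float(-mu @ w)

def alternating_plane_clustering(X, n_iter=50, seed=0):
    """Alternate between fitting one plane per cluster and reassigning
    each point to its nearest plane, until the partition stabilizes."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=len(X))  # random initial partition
    planes = []
    for _ in range(n_iter):
        if (labels == 0).sum() < 2 or (labels == 1).sum() < 2:
            break  # degenerate partition: stop rather than fit an empty cluster
        planes = [fit_plane(X[labels == k]) for k in (0, 1)]
        # Distance of every point to each plane; reassign to the nearest one.
        dist = np.stack([np.abs(X @ w + b) for w, b in planes], axis=1)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable: converged
        labels = new_labels
    return labels, planes
```

In the full IT-NHSVC-SFS model, the per-iteration plane update would instead solve a single QP for both hyperplanes jointly, with the infinity-norm regularizer zeroing out irrelevant feature weights in both planes at once.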
Acknowledgements
This work was supported by the National High Technology Research and Development Program of China (863 Program) (2011AA010706), the National Natural Science Foundation of China (61133016, 61272527), the Ministry of Education-China Mobile Communications Corporation Research Funds (MCM20121041), and the Natural Science Research Program of Colleges and Universities of Anhui Province, China (KJ2015A290, KJ2017A579).
Cite this article
Fang, J., Liu, Q. & Qin, Z. Iterative tighter nonparallel hyperplane support vector clustering with simultaneous feature selection. Cluster Comput 22 (Suppl 4), 8035–8049 (2019). https://doi.org/10.1007/s10586-017-1587-8