Abstract
Support vector data description (SVDD) is a well-known technique for one-class classification problems. However, it incurs high time complexity when handling large-scale datasets. In this paper, we propose a novel approach, named K-Farthest-Neighbor-based Concept Boundary Detection (KFN-CBD), to improve the training efficiency of SVDD. KFN-CBD identifies the examples lying close to the boundary of the target class, and these examples, rather than the entire dataset, are then used to train the classifier. Extensive experiments have shown that KFN-CBD obtains substantial speedup compared to standard SVDD, while maintaining accuracy comparable to that of training on the entire dataset.
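The selection idea behind KFN-CBD, as described in the abstract, is that examples far from the bulk of the target class tend to lie near its boundary. The following is a minimal illustrative sketch of a k-farthest-neighbor boundary score, not the authors' exact algorithm; the scoring rule, the choice of `k`, and the `keep` fraction are all assumptions for demonstration:

```python
import numpy as np

def kfn_boundary_scores(X, k):
    """Score each example by the mean distance to its k farthest
    neighbors; in a compact target class, boundary points tend to
    score higher than interior points (illustrative heuristic)."""
    # pairwise squared Euclidean distances via the expansion
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values
    # mean distance to the k farthest neighbors of each point
    farthest = np.sort(np.sqrt(d2), axis=1)[:, -k:]
    return farthest.mean(axis=1)

def select_boundary(X, k=5, keep=0.3):
    """Keep the fraction of examples with the highest KFN scores,
    to be used as the (reduced) SVDD training set."""
    scores = kfn_boundary_scores(X, k)
    n_keep = max(1, int(keep * len(X)))
    idx = np.argsort(scores)[-n_keep:]
    return X[idx], idx
```

On a cluster of points with one distant outlier, the outlier receives the highest score and is retained, while deep interior points are discarded first; a real implementation would also need an efficient neighbor search (e.g. a metric tree) to avoid the quadratic distance matrix.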
Notes
A quadratic programming (QP) problem optimizes a quadratic function of several variables subject to linear constraints on those variables. The optimization problems of SVM and SVDD are QP problems.
Available at http://yann.lecun.com/exdb/mnist/.
Available at http://www.csie.ntu.edu.tw/~cjlin/libsvm/.
Available at http://archive.ics.uci.edu.
Available at http://www.csie.ntu.edu.tw/~cjlin/libsvm/.
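To make the QP note above concrete, here is a hedged sketch of the SVDD dual for a tiny dataset, solved with a general-purpose constrained optimizer rather than a dedicated QP package. The RBF kernel and the values of `gamma` and `C` are illustrative assumptions, not part of the paper:

```python
import numpy as np
from scipy.optimize import minimize

def svdd_dual(X, C=1.0, gamma=1.0):
    """Solve the SVDD dual QP for a small dataset with an RBF kernel:
        max   sum_i a_i K_ii - sum_ij a_i a_j K_ij
        s.t.  sum_i a_i = 1,  0 <= a_i <= C
    Returns the dual coefficients a (nonzero entries mark the
    support vectors that describe the ball around the target class)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * d2)
    n = len(X)
    # minimize the negated dual objective
    obj = lambda a: a @ K @ a - a @ np.diag(K)
    cons = {"type": "eq", "fun": lambda a: a.sum() - 1.0}
    res = minimize(obj, np.full(n, 1.0 / n),
                   bounds=[(0.0, C)] * n,
                   constraints=cons, method="SLSQP")
    return res.x
```

This quadratic objective with linear constraints is exactly the QP structure the note refers to; solving it for thousands of examples is what makes SVDD training expensive, which motivates shrinking the training set beforehand.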
Acknowledgements
This work was supported by the Natural Science Foundation of China (61070033, 61203280, 61202270), Guangdong Natural Science Funds for Distinguished Young Scholar (S2013050014133), Natural Science Foundation of Guangdong Province (9251009001000005, S2011040004187, S2012040007078), Specialized Research Fund for the Doctoral Program of Higher Education (20124420120004), Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry, GDUT Overseas Outstanding Doctoral Fund (405120095), Science and Technology Plan Project of Guangzhou City (12C42111607, 201200000031, 2012J5100054), Science and Technology Plan Project of Panyu District, Guangzhou (2012-Z-03-67), Australian Research Council Discovery Grant (DP1096218, DP130102691), and ARC Linkage Grant (LP100200774, LP120100566).
Cite this article
Xiao, Y., Liu, B., Hao, Z. et al. A K-Farthest-Neighbor-based approach for support vector data description. Appl Intell 41, 196–211 (2014). https://doi.org/10.1007/s10489-013-0502-0