Feature Extraction Based on Support Vector Data Description

Neural Processing Letters

Abstract

Feature extraction is a form of dimensionality reduction, motivated by the goals of improving performance and reducing complexity. This paper presents a new feature extraction method based on support vector data description (FE-SVDD). First, the proposed method uses support vector data description to build a hyper-sphere model for each category of the given data. Second, FE-SVDD calculates the distance from each data point to the center of each hyper-sphere. Finally, the ratios of these distances to the radii of the corresponding hyper-spheres are taken as the new extracted features. Experimental results on different data sets indicate that FE-SVDD can speed up feature extraction and extract the distinctive information in the original data.
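
The three steps in the abstract can be illustrated with a rough sketch. The abstract does not say which kernel or solver the paper uses, so the code below makes a loud simplifying assumption: each per-class hyper-sphere is an approximate minimum enclosing ball in input space, found with the Badoiu-Clarkson iteration, which stands in for a hard-margin, linear-kernel SVDD. The function names `fit_ball` and `fe_svdd_features` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_ball(X, n_iter=200):
    """Fit one hyper-sphere (center, radius) to the rows of X.

    ASSUMPTION: this is a Badoiu-Clarkson approximate minimum enclosing
    ball, used as a stand-in for an SVDD solver. It is not the paper's
    implementation and ignores kernels and slack variables.
    """
    c = X[0].astype(float)
    for t in range(1, n_iter + 1):
        # Move the center a shrinking step toward the farthest point.
        far = X[np.argmax(np.linalg.norm(X - c, axis=1))]
        c = c + (far - c) / (t + 1)
    # Radius: distance from the final center to the farthest training point.
    r = np.linalg.norm(X - c, axis=1).max()
    return c, r


def fe_svdd_features(X, models):
    """Map each row of X to [d_1 / R_1, ..., d_K / R_K].

    d_k is the distance to the k-th class center and R_k is that
    class's radius, so the output has one feature per class.
    """
    cols = [np.linalg.norm(X - c, axis=1) / r for c, r in models]
    return np.column_stack(cols)


# Toy data (hypothetical): two classes in 5 dimensions.
rng = np.random.default_rng(0)
X_a = rng.normal(loc=0.0, scale=1.0, size=(50, 5))
X_b = rng.normal(loc=4.0, scale=1.0, size=(50, 5))

# Step 1: one hyper-sphere per class.
models = [fit_ball(X_a), fit_ball(X_b)]

# Steps 2 and 3: distances to each center, divided by each radius.
F = fe_svdd_features(np.vstack([X_a, X_b]), models)
print(F.shape)
```

In this sketch the extracted feature dimension equals the number of classes, so five input features become two. A ratio of at most 1 means the point lies inside or on the corresponding sphere, and a ratio above 1 means it lies outside.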

Acknowledgements

We would like to thank the anonymous reviewers and the Editor for their valuable comments and suggestions, which have significantly improved this paper. This work was supported in part by the National Natural Science Foundation of China under Grant No. 61373093, by the Soochow Scholar Project, by the Six Talent Peak Project of Jiangsu Province of China, and by the Collaborative Innovation Center of Novel Software Technology and Industrialization.

Author information

Corresponding author

Correspondence to Li Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported in part by the National Natural Science Foundation of China under Grant No. 61373093, by the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20140008, and by the Soochow Scholar Project.

About this article

Cite this article

Zhang, L., Lu, X. Feature Extraction Based on Support Vector Data Description. Neural Process Lett 49, 643–659 (2019). https://doi.org/10.1007/s11063-018-9838-0

  • DOI: https://doi.org/10.1007/s11063-018-9838-0
