Abstract
Support Vector Machines can achieve accuracy comparable to that of Artificial Neural Networks, but they are also slower to train. This paper presents a new algorithm, called Purity Filtering, designed to filter the training data of binary classification SVMs so as to select an approximation of the data subset that is most relevant to the training process.
The proposed algorithm is parametrized to allow regulation of both space and time complexity, adapting to the needs and capabilities of each execution environment. A user-specified parameter, the purity, indirectly regulates the number of filtered data; the algorithm has also been adapted to let the user specify that number directly. On real datasets, reductions of up to 75% of the training data (that is, training on only 25% of the samples) were achieved with no major loss in classification quality.
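The abstract does not say how purity is computed, so the sketch below is only an illustration of the general idea of instance selection with two control modes (a purity threshold, or a direct sample count). It assumes a simple stand-in definition: the purity of a sample is the fraction of its k nearest neighbors that share its label, and samples in mixed (low-purity) neighborhoods are kept because they tend to lie near the class boundary. The function names, the parameter `k`, and this k-NN definition are all hypothetical and are not taken from the paper.

```python
import math


def neighborhood_purity(points, labels, k):
    """Return, for each sample, the fraction of its k nearest neighbors
    that share its label (assumed stand-in for 'purity', brute force O(n^2))."""
    purities = []
    for i, p in enumerate(points):
        # Distance from sample i to every other sample, nearest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors = [j for _, j in dists[:k]]
        same = sum(1 for j in neighbors if labels[j] == labels[i])
        purities.append(same / k)
    return purities


def filter_by_purity(points, labels, k, max_purity):
    """Indirect control: keep indices whose purity is at most max_purity."""
    pur = neighborhood_purity(points, labels, k)
    return [i for i, v in enumerate(pur) if v <= max_purity]


def filter_by_count(points, labels, k, n_keep):
    """Direct control: keep the n_keep samples with the lowest purity."""
    pur = neighborhood_purity(points, labels, k)
    order = sorted(range(len(points)), key=lambda i: pur[i])
    return sorted(order[:n_keep])


# Toy 1-D data: class 0 occupies 0..3, class 1 occupies 4..7.
points = [(float(x),) for x in range(8)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

print(filter_by_purity(points, labels, k=2, max_purity=0.5))  # -> [3, 4]
print(filter_by_count(points, labels, k=2, n_keep=2))         # -> [3, 4]
```

On this toy set, only the two samples adjacent to the class boundary survive, which corresponds to keeping 25% of the data. The retained indices would then be used to select the rows passed to an SVM trainer.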
Notes
1. All experiments were run on an Intel Core i7 2.8 GHz machine with 16 GB of memory.
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Morán-Pomés, D., Belanche-Muñoz, L.A. (2019). Purity Filtering: An Instance Selection Method for Support Vector Machines. In: Bramer, M., Petridis, M. (eds) Artificial Intelligence XXXVI. SGAI 2019. Lecture Notes in Computer Science, vol 11927. Springer, Cham. https://doi.org/10.1007/978-3-030-34885-4_2
DOI: https://doi.org/10.1007/978-3-030-34885-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34884-7
Online ISBN: 978-3-030-34885-4
eBook Packages: Computer Science (R0)