Purity Filtering: An Instance Selection Method for Support Vector Machines

Morán-Pomés, David; Belanche-Muñoz, Lluís A.

doi:10.1007/978-3-030-34885-4_2

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11927)

Included in the following conference series:

International Conference on Innovative Techniques and Applications of Artificial Intelligence

Abstract

Support Vector Machines (SVMs) can achieve accuracy comparable to that of Artificial Neural Networks, but they are also slower to train. This paper presents a new algorithm, called Purity Filtering, designed to filter the training data for binary classification SVMs in order to select an approximation of the data subset that is most relevant to the training process.

The proposed algorithm is parametrized so as to allow regulation of both its spatial and temporal complexity, adapting to the needs and capabilities of each execution environment. A user-specified parameter, the purity, indirectly regulates the number of filtered data; the algorithm has also been adapted to let the user specify that number directly. On real datasets, reductions of up to 75% of the training data (that is, training on only 25% of the samples) were achieved with no major loss in classification quality.
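The two control modes mentioned above (an indirect threshold parameter versus a directly specified subset size) can be illustrated with a minimal sketch. The abstract does not say how purity is computed, so the score below is a stand-in and not the authors' method: each point is scored by the fraction of its k nearest neighbours that share its label. Keeping the low-scoring points near the class boundary is likewise an assumption. All function names, the toy data and the parameter values are hypothetical.

```python
import math
import random


def neighbor_agreement(points, labels, k=3):
    """Stand-in score (NOT the paper's purity): fraction of each point's
    k nearest neighbours that share its label."""
    scores = []
    for i, p in enumerate(points):
        # Distances to every other point, nearest first.
        others = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        nearest = [j for _, j in others[:k]]
        same = sum(1 for j in nearest if labels[j] == labels[i])
        scores.append(same / k)
    return scores


def select_by_threshold(scores, threshold):
    """Indirect control: keep indices whose score is at most the threshold.
    The size of the result depends on the data."""
    return [i for i, s in enumerate(scores) if s <= threshold]


def select_by_count(scores, n_keep):
    """Direct control: keep exactly n_keep indices with the lowest scores."""
    order = sorted(range(len(scores)), key=lambda i: (scores[i], i))
    return sorted(order[:n_keep])


def make_toy_data(n_per_class=40, seed=0):
    """Two overlapping 2-D Gaussian classes (hypothetical data)."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, center in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            points.append((rng.gauss(center, 1.0), rng.gauss(0.0, 1.0)))
            labels.append(label)
    return points, labels


points, labels = make_toy_data()
scores = neighbor_agreement(points, labels, k=5)

# Mode 1: a threshold decides, indirectly, how many samples survive.
kept_by_threshold = select_by_threshold(scores, threshold=0.8)

# Mode 2: ask directly for 25% of the samples.
kept_by_count = select_by_count(scores, n_keep=len(points) // 4)

print(len(points), len(kept_by_threshold), len(kept_by_count))
```

The selected indices would then be used to subset the training set before fitting an SVM. In threshold mode the size of the subset depends on the data, while in count mode it is fixed in advance.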

Notes

1. All the experiments were run on an Intel Core i7 2.8 GHz machine with 16 GB of memory.

Author information

Authors and Affiliations

School of Informatics, Technical University of Catalonia, Barcelona, Spain
David Morán-Pomés
Computer Science Department, Technical University of Catalonia, Barcelona, Spain
Lluís A. Belanche-Muñoz

Corresponding author

Correspondence to Lluís A. Belanche-Muñoz.

Editor information

Editors and Affiliations

University of Portsmouth, Portsmouth, UK
Max Bramer
Middlesex University, London, UK
Miltos Petridis

About this paper

Cite this paper

Morán-Pomés, D., Belanche-Muñoz, L.A. (2019). Purity Filtering: An Instance Selection Method for Support Vector Machines. In: Bramer, M., Petridis, M. (eds) Artificial Intelligence XXXVI. SGAI 2019. Lecture Notes in Computer Science, vol 11927. Springer, Cham. https://doi.org/10.1007/978-3-030-34885-4_2

DOI: https://doi.org/10.1007/978-3-030-34885-4_2
Published: 19 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34884-7
Online ISBN: 978-3-030-34885-4
eBook Packages: Computer Science, Computer Science (R0)
