Abstract
Kernel-based methods, such as nonlinear support vector machines, achieve high classification accuracy in many applications. However, classification with these methods can be slow if the kernel function is complex and must be evaluated many times. Existing solutions to this problem seek a representation of the decision surface in terms of only a few basis vectors, so that only a small number of kernel evaluations is needed. In all of these methods, however, the set of basis vectors used is independent of the example to be classified. In this paper we propose to adaptively select a small number of basis vectors for each unseen example. The set of basis vectors is thus not fixed but depends on the input to the classifier. Our approach is to first learn a non-sparse kernel machine using an existing technique, and then use training data to find a function that maps unseen examples to subsets of the basis vectors used by this kernel machine. We propose to represent this function as a binary tree, called a support vector tree, and devise a greedy algorithm for finding good trees. In our experiments the proposed approach outperforms existing techniques in a number of cases.
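The core idea in the abstract — routing an input through a binary tree to an input-dependent subset of basis vectors, then evaluating the kernel expansion over only that subset — can be illustrated with a minimal sketch. This is not the authors' algorithm: the split rule (a single-feature threshold), the kernel, and all names here are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(x, z, gamma=0.5):
    # Gaussian RBF kernel; the kernel used in the paper may differ.
    return np.exp(-gamma * np.sum((x - z) ** 2))

class TreeNode:
    """Node of a (hypothetical) support vector tree: internal nodes route
    on a single feature threshold, leaves store the indices of the basis
    vectors to use for inputs that reach them."""
    def __init__(self, feature=None, threshold=None,
                 left=None, right=None, basis_idx=None):
        self.feature, self.threshold = feature, threshold
        self.left, self.right = left, right
        self.basis_idx = basis_idx  # set only at leaves

def predict(tree, x, basis_vectors, coeffs, bias=0.0):
    # Walk the tree to the leaf whose basis-vector subset applies to x,
    # then evaluate the kernel expansion over that subset only.
    node = tree
    while node.basis_idx is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    score = bias + sum(coeffs[i] * rbf_kernel(x, basis_vectors[i])
                       for i in node.basis_idx)
    return 1 if score >= 0 else -1

# Tiny hand-built example: two basis vectors, one split on feature 0.
leaf_l = TreeNode(basis_idx=[0])
leaf_r = TreeNode(basis_idx=[1])
root = TreeNode(feature=0, threshold=1.5, left=leaf_l, right=leaf_r)
bv = np.array([[0.0, 0.0], [3.0, 3.0]])
co = [1.0, -1.0]
```

The payoff is that each prediction evaluates the kernel only against the basis vectors stored at one leaf, rather than against the full set, which is the source of the speed-up the abstract describes.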
© 2010 Springer-Verlag Berlin Heidelberg
Cite this chapter
Ukkonen, A. (2010). The Support Vector Tree. In: Elomaa, T., Mannila, H., Orponen, P. (eds) Algorithms and Applications. Lecture Notes in Computer Science, vol 6060. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12476-1_18
Print ISBN: 978-3-642-12475-4
Online ISBN: 978-3-642-12476-1