Abstract
This paper presents a novel application of automata algorithms to machine learning. It introduces the first optimization solution for support vector machines used with sequence kernels that is purely based on weighted automata and transducer algorithms, without requiring any specific solver. The algorithms presented apply to a family of kernels covering all those commonly used in text and speech processing or computational biology. We show that these algorithms have significantly better computational complexity than previous ones and report the results of large-scale experiments demonstrating a dramatic reduction of the training time, typically by several orders of magnitude.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Allauzen, C., Cortes, C., Mohri, M. (2011). Large-Scale Training of SVMs with Automata Kernels. In: Domaratzki, M., Salomaa, K. (eds) Implementation and Application of Automata. CIAA 2010. Lecture Notes in Computer Science, vol 6482. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18098-9_3
DOI: https://doi.org/10.1007/978-3-642-18098-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-18097-2
Online ISBN: 978-3-642-18098-9
eBook Packages: Computer Science, Computer Science (R0)