ABSTRACT
Tagging algorithms have become increasingly important for identifying lexical and semantic features of unstructured text. We describe an approach to lattice-based tagging that estimates joint transition and emission probabilities using support vector machines. The technique offers several advantages over alternative methods, including the ability to accommodate non-local features, support for hundreds of thousands of features, and language-neutrality. We demonstrate the technique on two tagging applications: named entity recognition and part-of-speech tagging.
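As a rough illustration of lattice-based tagging, the sketch below runs a Viterbi search over a tag lattice in which every arc carries a joint transition-and-emission probability. The abstract does not give the decoding procedure or how SVM outputs become probabilities, so everything here is an assumption: `platt_probability` uses a sigmoid with placeholder parameters, and `toy_margin` is a hypothetical hand-written stand-in for a trained SVM.

```python
import math


def platt_probability(margin, a=-1.0, b=0.0):
    """Map a raw classifier margin to a probability with a sigmoid.

    a and b are placeholder values (assumption); in practice they would be
    fit on held-out data.
    """
    return 1.0 / (1.0 + math.exp(a * margin + b))


def viterbi(tokens, tags, margin_fn):
    """Return the highest-probability tag path through the lattice.

    margin_fn(prev_tag, tag, tokens, i) is a hypothetical scorer standing in
    for a trained SVM. It sees the previous tag and the whole token sequence,
    so one score covers both transition and emission evidence.
    """
    if not tokens:
        return []
    # best[i][tag] = (log-probability of best path ending in tag, backpointer)
    best = [{} for _ in tokens]
    for tag in tags:
        p = platt_probability(margin_fn(None, tag, tokens, 0))
        best[0][tag] = (math.log(p), None)
    for i in range(1, len(tokens)):
        for tag in tags:
            candidates = []
            for prev in tags:
                p = platt_probability(margin_fn(prev, tag, tokens, i))
                candidates.append((best[i - 1][prev][0] + math.log(p), prev))
            best[i][tag] = max(candidates)
    # Follow backpointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(tokens) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    path.reverse()
    return path


def toy_margin(prev_tag, tag, tokens, i):
    """Toy scorer (assumption): capitalized words look like names."""
    looks_like_name = tokens[i][0].isupper()
    score = 2.0 if looks_like_name == (tag == "NAME") else -2.0
    if prev_tag == "NAME" and tag == "NAME":
        score += 0.5  # mild preference for multi-word names
    return score


print(viterbi(["Alice", "met", "Bob"], ["NAME", "O"], toy_margin))
```

Because the scorer receives the full token sequence rather than only the current word, non-local features can be added inside `margin_fn` without changing the search itself.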
Index Terms
- Lattice-based tagging using support vector machines