Article

Lattice-based tagging using support vector machines

Authors:
James Mayfield, The Johns Hopkins University, Laurel, MD
Paul McNamee, The Johns Hopkins University, Laurel, MD
Christine Piatko, The Johns Hopkins University, Laurel, MD
Claudia Pearce, Department of Defense, Ft. Meade, MD

CIKM '03: Proceedings of the twelfth international conference on Information and knowledge management, November 2003, Pages 303–308. https://doi.org/10.1145/956863.956921

Published: 03 November 2003


ABSTRACT

Tagging algorithms have become increasingly important for identifying lexical and semantic features of unstructured text. We describe an approach to lattice-based tagging that estimates joint transition and emission probabilities using support vector machines. The technique offers several advantages over alternative methods, including the ability to accommodate non-local features, support for hundreds of thousands of features, and language-neutrality. We demonstrate the technique on two tagging applications: named entity recognition and part-of-speech tagging.


Index Terms

Lattice-based tagging using support vector machines
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
2. Information systems
  1. Information retrieval
    1. Document representation
      1. Content analysis and feature selection


Published in
CIKM '03: Proceedings of the twelfth international conference on Information and knowledge management
November 2003
592 pages
ISBN: 1581137230
DOI: 10.1145/956863
General Chair: Donald Kraft, Louisiana State University
Program Chairs:
Ophir Frieder, Illinois Institute of Technology
Joachim Hammer, University of Florida
Sajda Qureshi, University of Nebraska, Omaha
Len Seligman, The MITRE Corporation
Copyright © 2003 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
SVM-Lattice
named entity recognition
part of speech tagging
support vector machines
tagging