short-paper

A Comparative Study of Named Entity Recognition for Telugu

Authors:
SaiKiranmai Gorla, Birla Institute of Technology and Science Pilani Hyderabad Campus, Hyderabad, Telangana, India
N. L. Bhanu Murthy, Birla Institute of Technology and Science Pilani Hyderabad Campus, Hyderabad, Telangana, India
Aruna Malapati, Birla Institute of Technology and Science Pilani Hyderabad Campus, Hyderabad, Telangana, India

FIRE '17: Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation
December 2017, Pages 21–24
https://doi.org/10.1145/3158354.3158358

Published: 08 December 2017

ABSTRACT

In this paper, we apply three classification learning algorithms to the Telugu Named Entity Recognition (NER) task and present a comparative study of these algorithms on the Telugu dataset from the NER for South and South-East Asian Languages (NERSSEAL) competition. The empirical results show that the Support Vector Machine achieves the best F-measure, 54.78%, on this dataset.
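
For readers unfamiliar with the metric reported above, the following is a minimal sketch of the balanced F-measure (F1), the harmonic mean of precision and recall. The abstract does not state which F-measure variant or matching granularity (token-level or entity-level) the authors used, so the function name `f_measure` and the count-based interface are assumptions made for illustration only.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F-measure (F1) computed from raw counts.

    Hypothetical helper for illustration; not taken from the paper.
    """
    predicted = true_positives + false_positives  # entities the system proposed
    actual = true_positives + false_negatives     # entities in the gold standard
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```

Under this definition, a system that proposes 4 entities, 3 of them correct, against 5 gold entities has precision 0.75 and recall 0.6, giving an F-measure of about 0.667.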

Index Terms

A Comparative Study of Named Entity Recognition for Telugu
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction
  2. Machine learning

Published in
FIRE '17: Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation
December 2017
38 pages
ISBN:9781450363822
DOI:10.1145/3158354
Editors:
Prasenjit Majumder,
Mandar Mitra,
Jainisha Sankhavara,
Parth Mehta
Copyright © 2017 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Conditional random Field
Memory based Learning
Named Entity
Named Entity Recognition
Support Vector Machine
Telugu
Qualifiers
- short-paper
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 19 of 64 submissions, 30%