ABSTRACT
In this paper, we apply three classification learning algorithms to the Telugu Named Entity Recognition (NER) task and present a comparative study of them on the Telugu dataset from the NER for South and South-East Asian Languages (NERSSEAL) competition. The empirical results show that the Support Vector Machine achieves the best F-measure on this dataset, 54.78%.
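For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F-measure can be computed for NER output. It assumes exact-match scoring over (start, end, type) entity spans and the balanced F1 variant; the abstract does not state which matching scheme the paper uses, and the function name `f_measure` and the toy spans are hypothetical, not taken from the NERSSEAL data.

```python
def f_measure(gold, predicted):
    """Return (precision, recall, F1) for two collections of entity spans.

    Each span is a (start, end, entity_type) tuple. A prediction counts as
    correct only if it matches a gold span exactly (an assumption; the
    paper's scoring scheme may differ).
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example with made-up spans (not from the NERSSEAL dataset).
gold_spans = {(0, 1, "PERSON"), (4, 5, "LOCATION"), (7, 9, "ORGANIZATION")}
predicted_spans = {(0, 1, "PERSON"), (4, 5, "ORGANIZATION")}

precision, recall, f1 = f_measure(gold_spans, predicted_spans)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In the toy example, one of the two predicted spans is correct and one of the three gold spans is found, so precision is 1/2, recall is 1/3, and F1 is their harmonic mean.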