DOI: 10.1145/3158354.3158358
short-paper

A Comparative Study of Named Entity Recognition for Telugu

Published: 08 December 2017

ABSTRACT

In this paper, we apply three classification learning algorithms to the Telugu Named Entity Recognition (NER) task and present a comparative study of these algorithms on the Telugu dataset from the NER for South and South-East Asian Languages (NERSSEAL) competition. The empirical results show that the Support Vector Machine achieves the best F-measure, 54.78%, on this dataset.
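
For readers unfamiliar with the reported metric, the F-measure is conventionally the harmonic mean of precision and recall. The sketch below illustrates that computation under exact-match scoring of entity spans. The abstract does not state whether the paper scores at the token or entity level, so the span representation, the label names, and the `f_measure` helper are all illustrative assumptions, not the authors' evaluation script.

```python
from typing import Set, Tuple

# Hypothetical representation: (start_token, end_token, label).
Entity = Tuple[int, int, str]


def f_measure(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match scoring."""
    # A prediction counts only if both its span and its label match a gold entity.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data with made-up labels; one of the three predictions has the wrong label.
gold = {(0, 1, "PERSON"), (4, 4, "LOCATION"), (7, 8, "ORGANIZATION")}
predicted = {(0, 1, "PERSON"), (4, 4, "ORGANIZATION"), (7, 8, "ORGANIZATION")}

precision, recall, f1 = f_measure(gold, predicted)
print(f"P={precision:.2%} R={recall:.2%} F1={f1:.2%}")
```

In this toy case two of three predictions match, so precision, recall, and F1 all equal two thirds.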


Published in

FIRE '17: Proceedings of the 9th Annual Meeting of the Forum for Information Retrieval Evaluation
December 2017
38 pages
ISBN: 9781450363822
DOI: 10.1145/3158354

Copyright © 2017 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• short-paper
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 19 of 64 submissions, 30%
