Domain Information for Fine-Grained Person Name Categorization

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4919)

Abstract

Named Entity Recognition has become the basis of many Natural Language Processing applications. However, existing coarse-grained named entity recognizers are insufficient for complex applications such as Question Answering, Internet search engines, or ontology population. In this paper, we propose a domain distribution approach, according to which names that occur in the same domains belong to the same fine-grained category. For our study, we generate a relevant domain resource by mapping the words from the WordNet glosses to their WordNet Domains and ranking them. This approach allows us to capture the semantic information of the context around the named entity and thus to discover the corresponding fine-grained name category. The approach is evaluated with six different person names and reaches an F-score of 73%. The results are encouraging: the approach performs significantly better than a majority baseline.
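The domain distribution idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the glosses, domain labels, and category names below are hypothetical toy data standing in for WordNet and WordNet Domains, the relative-frequency score is an assumed ranking (the abstract does not give the actual ranking formula), and cosine similarity is an assumed way of comparing domain distributions.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (domain, gloss) pairs standing in for WordNet glosses
# labelled with WordNet Domains. These are not real WordNet entries.
GLOSSES = [
    ("sport", "an athlete who plays a game in a team competition"),
    ("sport", "a player who scores a goal in a match"),
    ("politics", "a person elected to govern a state or party"),
    ("politics", "a leader who heads the government after an election"),
    ("music", "a person who sings or plays an instrument in a band"),
    ("music", "a performer who records a song for an album"),
]

STOPWORDS = {"a", "an", "the", "who", "in", "to", "or", "for", "of", "after"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def build_relevant_domains(glosses):
    """Map each gloss word to its domains, scored by relative frequency.

    The relative-frequency score is an assumption made for this sketch.
    """
    counts = defaultdict(Counter)
    for domain, gloss in glosses:
        for word in tokenize(gloss):
            counts[word][domain] += 1
    relevance = {}
    for word, domain_counts in counts.items():
        total = sum(domain_counts.values())
        relevance[word] = {d: c / total for d, c in domain_counts.items()}
    return relevance


def domain_vector(context, relevance):
    """Sum the domain scores of every known word in the context."""
    vec = Counter()
    for word in tokenize(context):
        for domain, score in relevance.get(word, {}).items():
            vec[domain] += score
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse domain vectors."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def categorize(context, profiles, relevance):
    """Return the category whose domain profile is closest to the context."""
    vec = domain_vector(context, relevance)
    return max(profiles, key=lambda cat: cosine(vec, profiles[cat]))


RELEVANCE = build_relevant_domains(GLOSSES)

# Hypothetical category profiles, each built from one example context.
PROFILES = {
    "athlete": domain_vector("the player scores in a match for the team", RELEVANCE),
    "politician": domain_vector("the leader was elected to head the government", RELEVANCE),
    "singer": domain_vector("the performer sings a song with the band", RELEVANCE),
}

if __name__ == "__main__":
    print(categorize("Smith plays in the team and scores a goal", PROFILES, RELEVANCE))
    print(categorize("Jones was elected to lead the party", PROFILES, RELEVANCE))
```

In this toy setting, a word such as "plays" is shared between the sport and music domains and contributes to both, while the remaining context words tip the distribution toward one category. A realistic version would replace the toy glosses with the full WordNet gloss inventory and its domain labels.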

Editor information

Alexander Gelbukh

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kozareva, Z., Vazquez, S., Montoyo, A. (2008). Domain Information for Fine-Grained Person Name Categorization. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2008. Lecture Notes in Computer Science, vol 4919. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78135-6_26

  • DOI: https://doi.org/10.1007/978-3-540-78135-6_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78134-9

  • Online ISBN: 978-3-540-78135-6

  • eBook Packages: Computer Science, Computer Science (R0)
