Extracting Fine-Grained Entities Based on Coordinate Graph

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7934)

Abstract

Most previous entity extraction studies focus on a small set of coarse-grained classes, such as person. However, the distribution of entities in search engine query logs indicates that users are more interested in a wider range of fine-grained entities, such as GRAMMY winner and Ivy League member. In this paper, we present a semi-supervised method to extract fine-grained entities from an open-domain corpus. We build a graph over the entities that appear in coordinate lists, which are HTML nodes that share the same tag path in the DOM tree. Class labels are then propagated over the graph from known entities to unknown ones. Experiments on a large corpus from the ClueWeb09a dataset show that our proposed approach achieves promising results.
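The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the edge weights, the propagation rule, or the stopping criterion, so the co-occurrence weighting, the fixed iteration count, and all names below (TagPathCollector, coordinate_lists, build_graph, propagate) are assumptions made for illustration. The sketch also assumes well-formed HTML without unclosed void elements such as a bare br tag.

```python
from collections import defaultdict
from html.parser import HTMLParser
from itertools import combinations


class TagPathCollector(HTMLParser):
    """Group text nodes by the tag path that leads to them."""

    def __init__(self):
        super().__init__()
        self.stack = []                  # tags that are currently open
        self.groups = defaultdict(list)  # tag path -> texts found under it

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.groups["/".join(self.stack)].append(text)


def coordinate_lists(html):
    """Return groups of two or more texts that share one tag path."""
    parser = TagPathCollector()
    parser.feed(html)
    parser.close()
    return [texts for texts in parser.groups.values() if len(texts) > 1]


def build_graph(lists):
    """Connect entities that co-occur in a list (assumed weighting: count)."""
    graph = defaultdict(lambda: defaultdict(float))
    for items in lists:
        for a, b in combinations(sorted(set(items)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def propagate(graph, seeds, iterations=10):
    """Spread seed labels to neighbours; seed nodes stay clamped."""
    scores = {entity: {label: 1.0} for entity, label in seeds.items()}
    for _ in range(iterations):
        updated = {entity: {label: 1.0} for entity, label in seeds.items()}
        for node, neighbours in graph.items():
            if node in seeds:
                continue
            total = sum(neighbours.values())
            acc = defaultdict(float)
            for other, weight in neighbours.items():
                for label, score in scores.get(other, {}).items():
                    acc[label] += weight * score / total
            if acc:
                updated[node] = dict(acc)
        scores = updated
    # Each labelled entity takes its highest-scoring class.
    return {entity: max(s, key=s.get) for entity, s in scores.items()}


# Toy pages (made-up data, one coordinate list per page).
pages = [
    "<ul><li>Harvard</li><li>Yale</li><li>Brown</li></ul>",
    "<ul><li>Yale</li><li>Princeton</li></ul>",
    "<ol><li>Adele</li><li>Beyonce</li></ol>",
]
lists = [items for page in pages for items in coordinate_lists(page)]
graph = build_graph(lists)
seeds = {"Harvard": "Ivy League member", "Adele": "GRAMMY winner"}
labels = propagate(graph, seeds)
print(labels["Princeton"], "|", labels["Beyonce"])
```

In this toy run, Princeton never shares a list with the seed Harvard; it can only receive a label through Yale, which is the kind of indirect evidence a graph over coordinate lists is meant to capture.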

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, Q., Jiang, P., Zhang, C., Niu, Z. (2013). Extracting Fine-Grained Entities Based on Coordinate Graph. In: Métais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (eds) Natural Language Processing and Information Systems. NLDB 2013. Lecture Notes in Computer Science, vol 7934. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38824-8_40

  • DOI: https://doi.org/10.1007/978-3-642-38824-8_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38823-1

  • Online ISBN: 978-3-642-38824-8

  • eBook Packages: Computer Science, Computer Science (R0)
