DOI: 10.1145/2509558.2509570
poster

A joint model for discovering and linking entities

Published: 27 October 2013

ABSTRACT

Entity resolution, the task of automatically determining which mentions refer to the same real-world entity, is a crucial aspect of knowledge base construction and management. However, performing entity resolution at large scales is challenging because (1) the inference algorithms must cope with unavoidable system scalability issues and (2) the search space grows exponentially in the number of mentions. Current conventional wisdom has been that performing coreference at these scales requires decomposing the problem by first solving the simpler task of entity-linking (matching a set of mentions to a known set of KB entities), and then performing entity discovery as a post-processing step (to identify new entities not present in the KB). However, we argue that this traditional approach is harmful to both entity-linking and overall coreference accuracy. Therefore, we embrace the challenge of jointly modeling entity-linking and entity-discovery as a single entity resolution problem. In order to make progress towards scalability we (1) present a model that reasons over compact hierarchical entity representations, and (2) propose a novel distributed inference architecture that does not suffer from the synchronicity bottleneck which is inherent in map-reduce architectures. We demonstrate that more test-time data actually improves the accuracy of coreference, and show that joint coreference is substantially more accurate than traditional entity-linking, reducing error by 75%.
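The contrast between the two-stage pipeline and the joint formulation can be illustrated with a toy sketch. This is not the authors' model: the abstract describes hierarchical entity representations and a distributed inference architecture, whereas the code below swaps in a greedy single-pass clustering with Jaccard similarity over token sets. The function names (`pipeline`, `joint`), the threshold, the `NEW` label prefix, and the example data are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Set

TokenSets = Dict[str, Set[str]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def joint(mentions: TokenSets, kb: TokenSets, threshold: float = 0.35) -> Dict[str, str]:
    """Toy joint resolution: KB entities and discovered entities compete as
    clusters, and each cluster's evidence grows as mentions are added to it."""
    clusters = {entity: set(tokens) for entity, tokens in kb.items()}
    labels: Dict[str, str] = {}
    for name, tokens in mentions.items():
        best = max(clusters, key=lambda c: jaccard(tokens, clusters[c]), default=None)
        if best is not None and jaccard(tokens, clusters[best]) >= threshold:
            clusters[best] |= tokens  # linked mention contributes its context
        else:
            # No cluster is close enough: discover a new entity.
            # Assumes no KB identifier starts with "NEW".
            best = f"NEW{sum(c.startswith('NEW') for c in clusters)}"
            clusters[best] = set(tokens)
        labels[name] = best
    return labels


def pipeline(mentions: TokenSets, kb: TokenSets, threshold: float = 0.35) -> Dict[str, str]:
    """Toy two-stage baseline: link against fixed KB entries first, then
    cluster the leftover mentions among themselves as post-processing."""
    labels: Dict[str, str] = {}
    unlinked: TokenSets = {}
    for name, tokens in mentions.items():
        best = max(kb, key=lambda e: jaccard(tokens, kb[e]), default=None)
        if best is not None and jaccard(tokens, kb[best]) >= threshold:
            labels[name] = best
        else:
            unlinked[name] = tokens
    labels.update(joint(unlinked, {}, threshold))  # discovery without the KB
    return labels


# Hypothetical example data: one KB entity and four mentions of "Jordan".
KB = {"Q1": {"michael", "jordan", "basketball"}}
MENTIONS = {
    "m1": {"jordan", "basketball", "bulls"},
    "m2": {"jordan", "bulls", "chicago"},
    "m3": {"jordan", "berkeley", "statistics"},
    "m4": {"jordan", "statistics", "learning"},
}

print("pipeline:", pipeline(MENTIONS, KB))
print("joint:   ", joint(MENTIONS, KB))
```

In this toy setting the pipeline leaves `m2` unlinked because it shares only one token with the fixed KB entry, while the joint variant links `m2` to `Q1` using the context that `m1` added to that cluster. Both variants place `m3` and `m4` together in a newly discovered entity.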


Published in: AKBC '13: Proceedings of the 2013 workshop on Automated knowledge base construction
October 2013, 124 pages
ISBN: 9781450324113
DOI: 10.1145/2509558

Copyright © 2013 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance rates: AKBC '13 paper acceptance rate 9 of 19 submissions (47%); overall acceptance rate 9 of 19 submissions (47%).
