ABSTRACT
Entity search over news, social media and the Web allows users to precisely retrieve concise information about specific people, organizations, movies and their characters, and other kinds of entities. This expressive search mode builds on two major assets: 1) a knowledge base (KB) that contains the entities of interest and 2) entity markup in the documents of interest derived by automatic disambiguation of entity names (NED) and linking names to the KB. These prerequisites are not easily available, though, in the important case when a user is interested in a newly emerging entity (EE) such as new movies, new songs, etc. Automatic methods for detecting and canonicalizing EEs are not nearly at the same level as the NED methods for prominent entities that have rich descriptions in the KB.
To overcome this major limitation, we have developed an approach and prototype system that allows searching for EEs in a user-friendly manner. The approach leverages the human in the loop by prompting for user feedback on candidate entities and on characteristic keyphrases for EEs. To keep the burden on users low, this process is supported by the automatic harvesting of tentative keyphrases. Our demo system shows this interactive process and its high usability.
Index Terms
- The Knowledge Awakens: Keeping Knowledge Bases Fresh with Emerging Entities