Resolving Entity Morphs based on Character-Word Embedding

https://doi.org/10.1016/j.procs.2017.05.106
Under a Creative Commons license
open access

Abstract

A morph is a special type of fake alternative name. Internet users use morphs to achieve goals such as expressing a particular sentiment or avoiding censorship. For example, Chinese internet users often replace “马景涛” (Ma Jingtao) with “咆哮教主” (Roar Bishop). Here “咆哮教主” (Roar Bishop) is the morph and “马景涛” (Ma Jingtao) is its target entity. This paper focuses on morph resolution: given a morph, identify the entity that it actually refers to. After analyzing the common characteristics of morphs and target entities in cross-source corpora, we exploit temporal and semantic constraints to collect target candidates. We propose a framework based on character-word embeddings and radical-character-word embeddings to rank the target candidates. Our method does not need any human-annotated data. Experimental results demonstrate that our approach outperforms the state-of-the-art method. The results also show that performance is better when a morph shares at least one character with its target entity.
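
The ranking step can be illustrated with a minimal sketch. It is not the authors' implementation: the composition rule (averaging a word's vector with the mean of its character vectors), the function names, the placeholder candidate “其他人”, and all two-dimensional vectors below are assumptions made up for illustration. In practice the embeddings would be learned from the corpora, and the candidate list would come from the temporal and semantic constraints.

```python
import math

def compose(word, word_vecs, char_vecs):
    """Average a word's own vector with the mean of its character vectors.

    This composition rule is an assumption for illustration only.
    """
    dim = len(next(iter(char_vecs.values())))
    chars = [char_vecs[c] for c in word if c in char_vecs]
    if chars:
        char_mean = [sum(v[i] for v in chars) / len(chars) for i in range(dim)]
    else:
        char_mean = [0.0] * dim
    w = word_vecs.get(word, [0.0] * dim)
    return [(w[i] + char_mean[i]) / 2 for i in range(dim)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rank_candidates(morph, candidates, word_vecs, char_vecs):
    """Sort candidates by similarity to the morph, most similar first."""
    m = compose(morph, word_vecs, char_vecs)
    scored = [(c, cosine(m, compose(c, word_vecs, char_vecs))) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy, hand-made vectors (hypothetical values, not trained embeddings).
char_vecs = {
    "咆": [1.0, 0.0], "哮": [1.0, 0.0], "教": [0.5, 0.5], "主": [0.5, 0.5],
    "马": [1.0, 0.2], "景": [0.8, 0.1], "涛": [0.9, 0.3],
    "其": [0.0, 1.0], "他": [0.1, 1.0], "人": [0.0, 0.9],
}
word_vecs = {
    "咆哮教主": [0.9, 0.2],
    "马景涛": [0.9, 0.1],
    "其他人": [0.1, 0.9],  # placeholder candidate, not from the paper
}

ranked = rank_candidates("咆哮教主", ["其他人", "马景涛"], word_vecs, char_vecs)
for name, score in ranked:
    print(name, round(score, 3))
```

With these toy vectors the target “马景涛” ranks above the placeholder candidate. Because character vectors enter the composition, a morph and a target that share a character also share part of their representation, which is one plausible reading of why such pairs are easier to resolve.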

Keywords

morph
morph resolution
social network
word embedding
character-word embeddings
