TwiCS: Lightweight Entity Mention Detection in Targeted Twitter Streams

Publisher: IEEE

Abstract:

Microblogging sites, like Twitter, continuously generate a large volume of streaming data. This streaming environment creates new challenges for two concomitant Information Extraction tasks: Entity Mention Detection (EMD) and Entity Detection (ED). The new challenges include (1) continuously evolving topics, which may deprecate model-based approaches quickly; (2) non-literary nature of posts, which makes traditional NLP techniques less effective; and (3) huge volume of streaming data, which makes computationally expensive approaches less suitable. In this paper, we propose an approach for EMD/ED whose creation is guided by the constraints specific to streaming environments from the ground up. Our system TwiCS implements this approach. TwiCS employs a computationally light two-phase process. In the first phase, it exploits simple (low computation) syntactic cues to suggest Entity Mention (EM) candidates. In the second phase, it uses occurrence mining to classify candidates according to their likelihood of being true EMs. Our experiments show that TwiCS achieves an average effectiveness improvement of 14.6 percent, while maintaining at least 2.64 times higher throughput, when compared to several state-of-the-art systems.
Published in: IEEE Transactions on Knowledge and Data Engineering (Volume: 35, Issue: 1, 01 January 2023)
Page(s): 1043 - 1057
Date of Publication: 11 June 2021
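
The two-phase process described in the abstract can be illustrated with a minimal sketch. This is not the TwiCS implementation: the abstract does not say which syntactic cues or which occurrence statistics the system uses, so the sketch assumes capitalization as the cue and a capitalized-to-total occurrence ratio as the classifier. The function names (suggest_candidates, classify_candidates) and the thresholds (min_ratio, min_count) are hypothetical, and the counts are computed over a fixed batch of posts rather than incrementally over a stream.

```python
import re
from collections import Counter

# Words, plus single punctuation/digit characters so that they break candidate runs.
TOKEN = re.compile(r"[A-Za-z][A-Za-z'-]*|[^\sA-Za-z]")


def suggest_candidates(post):
    """Phase 1 (assumed cue): maximal runs of capitalized tokens."""
    candidates, run = [], []
    for tok in TOKEN.findall(post):
        if tok[0].isupper():
            run.append(tok)
        else:
            if run:
                candidates.append(" ".join(run))
            run = []
    if run:
        candidates.append(" ".join(run))
    return candidates


def count_ngram(tokens, ngram):
    """Count occurrences of the token sequence `ngram` inside `tokens`."""
    n = len(ngram)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == ngram)


def classify_candidates(posts, min_ratio=0.6, min_count=2):
    """Phase 2 (assumed statistic): keep candidates that are capitalized
    in at least `min_ratio` of their occurrences across all posts."""
    capitalized = Counter()
    for post in posts:
        for cand in suggest_candidates(post):
            capitalized[cand.lower()] += 1

    # Lowercased token lists, used to count every occurrence regardless of case.
    lowered = [[t.lower() for t in TOKEN.findall(post)] for post in posts]

    accepted = set()
    for cand, cap_count in capitalized.items():
        ngram = cand.split(" ")
        total = sum(count_ngram(tokens, ngram) for tokens in lowered)
        if total >= min_count and cap_count / total >= min_ratio:
            accepted.add(cand)
    return accepted
```

In this sketch, a sentence-initial word such as "The" is proposed in phase 1 but rejected in phase 2 because it mostly appears in lowercase elsewhere, while a phrase that is capitalized in most of its occurrences is kept. Both phases are simple passes over tokens and counters, which is the kind of low-computation processing the abstract argues for.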
