Abstract
In recent years there has been a surge of interest in using Twitter to detect real-world events. However, many state-of-the-art event detection approaches are either too slow for real-time application or can detect only specific types of events effectively. We examine the role of named entities and use them to enhance event detection. Specifically, we use a clustering technique that partitions documents based upon the entities they contain, together with burst detection and cluster selection techniques that extract clusters related to ongoing real-world events. We evaluate our approach on a large-scale corpus of 120 million tweets covering more than 500 events, and show that it is able to detect significantly more events than current state-of-the-art approaches whilst also improving precision and retaining low computational complexity. We find that nouns and verbs play different roles in event detection and that the use of hashtags and retweets leads to a decrease in effectiveness when using our entity-based approach.
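As a rough illustration of the two ideas named in the abstract, partitioning documents by the entities they contain and flagging bursts in entity activity, the following Python sketch may help. It is not the authors' implementation: the function names, the input layout, and the mean-plus-three-standard-deviations threshold are all assumptions made for the example, and entity extraction is assumed to have happened upstream.

```python
from collections import defaultdict
from statistics import mean, pstdev


def partition_by_entity(tweets):
    """Group tweet ids under every entity they mention.

    `tweets` is an iterable of (tweet_id, entities) pairs. Entity extraction
    is assumed to have been done upstream (hypothetical input layout).
    """
    partitions = defaultdict(list)
    for tweet_id, entities in tweets:
        # set() avoids counting a tweet twice for a repeated entity
        for entity in set(entities):
            partitions[entity.lower()].append(tweet_id)
    return dict(partitions)


def is_bursting(history, current, sigmas=3.0):
    """Return True if `current` exceeds mean + sigmas * stddev of `history`.

    `history` holds mention counts from earlier time windows. The threshold
    is an illustrative assumption, not a value stated in the abstract.
    """
    if len(history) < 2:
        return False  # too little history to estimate a baseline
    return current > mean(history) + sigmas * pstdev(history)


# Toy data: tweets without entities fall into no partition.
tweets = [
    (1, ["Obama"]),
    (2, ["Obama", "Chicago"]),
    (3, ["Chicago"]),
    (4, []),
]
partitions = partition_by_entity(tweets)

# Mention counts per window for one entity, then the current window.
obama_bursting = is_bursting([2, 3, 2, 3], 40)
steady = is_bursting([2, 3, 2, 3], 3)
```

In this sketch a tweet mentioning several entities is placed in every matching partition, which keeps each partition small enough to cluster independently; the cluster selection step mentioned in the abstract is not modelled here.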
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
McMinn, A.J., Jose, J.M. (2015). Real-Time Entity-Based Event Detection for Twitter. In: Mothe, J., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2015. Lecture Notes in Computer Science, vol 9283. Springer, Cham. https://doi.org/10.1007/978-3-319-24027-5_6
DOI: https://doi.org/10.1007/978-3-319-24027-5_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24026-8
Online ISBN: 978-3-319-24027-5
eBook Packages: Computer Science (R0)