DOI: 10.1145/3097983.3098027

TrioVecEvent: Embedding-Based Online Local Event Detection in Geo-Tagged Tweet Streams

Published: 04 August 2017

Abstract

Detecting local events (e.g., protests, disasters) at their onset is an important task for a wide spectrum of applications, ranging from disaster control to crime monitoring and place recommendation. Recent years have witnessed growing interest in leveraging geo-tagged tweet streams for online local event detection. Nevertheless, the accuracy of existing methods remains unsatisfactory for building reliable local event detection systems. We propose TrioVecEvent, a method that leverages multimodal embeddings to achieve accurate online local event detection. The effectiveness of TrioVecEvent is underpinned by its two-step detection scheme. First, it ensures high coverage of the underlying local events by dividing the tweets in the query window into coherent geo-topic clusters. To generate quality geo-topic clusters, we capture short-text semantics by learning multimodal embeddings of location, time, and text, and then perform online clustering with a novel Bayesian mixture model. Second, TrioVecEvent treats the geo-topic clusters as candidate events and extracts a set of features for classifying the candidates. Leveraging the multimodal embeddings as background knowledge, we introduce discriminative features that characterize local events well, which enables pinpointing true local events in the candidate pool with a small amount of training data. We used crowdsourcing to evaluate TrioVecEvent and found that it improves on the performance of the state-of-the-art method by a large margin.
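
The two-step scheme described in the abstract (cluster first, then classify the clusters as candidates) can be illustrated with a minimal sketch. This is not the authors' implementation: the Bayesian mixture model is replaced by a simple greedy grouping rule, the classifier is replaced by a size threshold, and every name and parameter here (`Tweet`, `cluster_tweets`, `is_candidate_event`, `max_deg`, `min_cos`, `min_size`) is a hypothetical chosen for illustration. The text vectors stand in for learned embeddings.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Tweet:
    lat: float
    lon: float
    vec: List[float]  # stand-in for a learned text embedding (assumption)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def geo_dist(a: Tweet, b: Tweet) -> float:
    """Crude distance in degrees; a real system would use a geodesic distance."""
    return math.hypot(a.lat - b.lat, a.lon - b.lon)


def cluster_tweets(tweets: List[Tweet], max_deg: float = 0.05,
                   min_cos: float = 0.8) -> List[List[Tweet]]:
    """Step 1 (stand-in): greedy geo-topic grouping.

    A tweet joins the first cluster whose seed tweet is both nearby and
    semantically similar; otherwise it starts a new cluster. The paper uses
    a Bayesian mixture model here, which this rule does not reproduce.
    """
    clusters: List[List[Tweet]] = []
    for t in tweets:
        for c in clusters:
            seed = c[0]
            if geo_dist(t, seed) <= max_deg and cosine(t.vec, seed.vec) >= min_cos:
                c.append(t)
                break
        else:
            clusters.append([t])
    return clusters


def is_candidate_event(cluster: List[Tweet], min_size: int = 3) -> bool:
    """Step 2 (stand-in): a size threshold in place of a trained classifier."""
    return len(cluster) >= min_size
```

With a handful of tweets, `cluster_tweets` separates posts that are close in space but differ in topic, and `is_candidate_event` keeps only the groups large enough to be worth inspecting. A trained classifier over richer features would replace the threshold in practice.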

Supplementary Material

MP4 File (zhang_triovecevent.mp4)

Published In

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2017
2240 pages
ISBN:9781450348874
DOI:10.1145/3097983
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. event detection
  2. multimodal embedding
  3. representation learning
  4. social media
  5. spatiotemporal data mining
  6. topic modeling

Qualifiers

  • Research-article

Conference

KDD '17

Acceptance Rates

KDD '17 Paper Acceptance Rate 64 of 748 submissions, 9%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
