DOI: 10.1145/2380718.2380720
research-article

Topical anomaly detection from Twitter stream

Published: 22 June 2012

Abstract

In this paper, we spot topically anomalous tweets in Twitter streams by analyzing the content of the documents that the URLs in the tweets point to, in preference to the tweets' own text. Existing approaches to anomaly detection ignore such URLs and thereby miss opportunities to detect off-topic tweets. Specifically, we measure the divergence between the claimed topic of a tweet, as reflected by its hashtags, and its actual topic, as reflected by the content of the referenced document. Our approach avoids the need for labeled samples by selecting documents from reliable sources gleaned from the URLs present in the tweets. These documents are then compared against the documents associated with unknown URLs in incoming tweets, which improves reliability, scalability, and adaptability to rapidly changing topics. We evaluate our approach on three events and show that it can find topical inconsistencies not detectable by existing approaches.
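
The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the features or the divergence measure, so everything here is an assumption made for illustration: bag-of-words term counts, cosine similarity, a fixed threshold of 0.2, and the helper names `topic_profile` and `is_off_topic`. The short strings stand in for documents fetched from URLs. Pages from sources assumed reliable define a profile for a hashtag's topic, and a page linked from an incoming tweet is flagged when it is too dissimilar to that profile.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase bag-of-words counts; a crude stand-in for real text features."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topic_profile(reference_docs):
    """Sum the term counts of documents drawn from sources assumed reliable."""
    profile = Counter()
    for doc in reference_docs:
        profile.update(term_counts(doc))
    return profile


def is_off_topic(linked_doc, profile, threshold=0.2):
    """Flag a linked document whose similarity to the profile is below threshold.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    return cosine(term_counts(linked_doc), profile) < threshold


# Hypothetical data for one hashtag; the texts stand in for fetched pages.
reference_docs = [
    "The earthquake struck the coast and rescue teams searched for survivors.",
    "Aftershocks followed the earthquake as rescue crews reached the coast.",
]
profile = topic_profile(reference_docs)

on_topic = "Rescue teams reported survivors near the coast after the earthquake."
spam = "Buy cheap watches online today with free shipping and discounts."

print(is_off_topic(on_topic, profile))
print(is_off_topic(spam, profile))
```

With these toy inputs the first document is kept and the second is flagged. A real pipeline would also need URL expansion, page fetching, boilerplate removal, and a principled way to choose the reliable sources, none of which this sketch attempts.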

    Published In

    WebSci '12: Proceedings of the 4th Annual ACM Web Science Conference
    June 2012
    531 pages
    ISBN:9781450312288
    DOI:10.1145/2380718

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Twitter stream analysis
    2. anomaly detection
    3. binary classification
    4. spam and off-topic content detection

    Conference

    WebSci '12: Web Science 2012
    June 22 - 24, 2012
    Evanston, Illinois

    Acceptance Rates

    Overall Acceptance Rate 245 of 933 submissions, 26%

    Cited By

    • (2023) Real-Time Anomaly Detection and Popularity Prediction for Emerging Events on Twitter. Proceedings of the International Conference on Advances in Social Networks Analysis and Mining, 300-304. DOI: 10.1145/3625007.3627517. Online publication date: 6-Nov-2023
    • (2019) An innovative user-attentive framework for supporting real-time detection and mining of streaming microblog posts. Soft Computing. DOI: 10.1007/s00500-019-04478-2. Online publication date: 9-Dec-2019
    • (2019) Detecting Spam Tweets in Trending Topics Using Graph-Based Approach. Proceedings of the Future Technologies Conference (FTC) 2019, 526-546. DOI: 10.1007/978-3-030-32520-6_39. Online publication date: 13-Oct-2019
    • (2018) Event detection from Twitter - a survey. International Journal of Web Information Systems 14:3, 262-280. DOI: 10.1108/IJWIS-11-2017-0075. Online publication date: 20-Aug-2018
    • (2016) Real-time timeline summarisation for high-impact events in twitter. Proceedings of the Twenty-second European Conference on Artificial Intelligence, 1158-1166. DOI: 10.3233/978-1-61499-672-9-1158. Online publication date: 29-Aug-2016
    • (2015) Semantics-Empowered Big Data Processing with Applications. AI Magazine 36:1, 39-54. DOI: 10.1609/aimag.v36i1.2566. Online publication date: 1-Mar-2015
    • (2015) Real-Time Top-R Topic Detection on Twitter with Topic Hijack Filtering. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 417-426. DOI: 10.1145/2783258.2783402. Online publication date: 10-Aug-2015
    • (2014) Increasing the veracity of event detection on social media networks through user trust modeling. 2014 IEEE International Conference on Big Data (Big Data), 636-643. DOI: 10.1109/BigData.2014.7004286. Online publication date: Oct-2014
    • (2014) Comparative trust management with applications. Future Generation Computer Systems 31, 182-199. DOI: 10.1016/j.future.2013.05.006. Online publication date: 1-Feb-2014
    • (2014) Detecting anomalies in social network data consumption. Social Network Analysis and Mining 4:1. DOI: 10.1007/s13278-014-0231-3. Online publication date: 29-Aug-2014
