
SNAF: Observation filtering and location inference for event monitoring on twitter

Published in: World Wide Web

Abstract

Twitter has recently emerged as a popular microblogging service with 284 million monthly active users around the world. Some of the 500 million tweets posted on Twitter every day are personal observations of the user's immediate environment. If provided with time and location information, these observations can be seen as sensory readings for monitoring and localizing objects and events of interest. Location information on Twitter, however, is scarce, with fewer than 1% of tweets having associated GPS coordinates. Current research on Twitter location inference mostly focuses on city-level or coarser inference, and cannot provide accurate results for fine-grained locations. We propose an event monitoring system for Twitter that emphasizes local events, called SNAF (Sense and Focus). The system filters personal observations posted on Twitter and infers the location of each report. Our extensive experiments with real Twitter data show that the proposed observation filtering approach improves on existing filtering techniques by about 22%, and that our location inference approach can increase location accuracy by up to 36% within a 3 km error range. By aggregating the observation reports with location information, our prototype event monitoring system can detect real-world events, in many cases earlier than news reports.
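The abstract reports location accuracy "within the 3km error range". As a rough illustration of how such a figure could be computed, the sketch below counts the fraction of inferred coordinates that fall within a given great-circle distance of the true coordinates. This is an assumption about the metric, not the authors' evaluation code: the function names `haversine_km` and `accuracy_within`, the spherical-Earth radius, and the sample coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation (assumption)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accuracy_within(inferred, actual, max_error_km=3.0):
    """Fraction of inferred points lying within max_error_km of the matching actual point."""
    if len(inferred) != len(actual):
        raise ValueError("inferred and actual must have the same length")
    if not inferred:
        return 0.0
    hits = sum(
        1
        for (ilat, ilon), (alat, alon) in zip(inferred, actual)
        if haversine_km(ilat, ilon, alat, alon) <= max_error_km
    )
    return hits / len(inferred)


# Hypothetical sample: three reports whose true location is the same point.
actual = [(-34.9285, 138.6007)] * 3
inferred = [
    (-34.9285, 138.6007),  # exact match
    (-34.9385, 138.6007),  # about 1.1 km away (0.01 degrees of latitude)
    (-35.0285, 138.6007),  # about 11 km away (0.1 degrees of latitude)
]
score = accuracy_within(inferred, actual, max_error_km=3.0)
```

With this sample, two of the three inferred points fall inside the 3 km radius. A relative improvement such as the reported 36% would then be a comparison of this score between two inference methods on the same data.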


Notes

  1. https://about.twitter.com/company

  2. As (1 − 0.009)^10 ≈ 0.91

  3. http://time.com/3024078/foursquare-swarm/

  4. https://dev.twitter.com/streaming/public

  5. http://wiki.dbpedia.org/Datasets

  6. https://dev.twitter.com/rest/public/timelines

  7. https://cran.r-project.org/package=e1071

  8. http://crisislex.org/

  9. http://www.crowdflower.com/

  10. http://news.google.com


Author information

Corresponding author

Correspondence to Yihong Zhang.

About this article

Cite this article

Zhang, Y., Szabo, C., Sheng, Q.Z. et al. SNAF: Observation filtering and location inference for event monitoring on twitter. World Wide Web 21, 311–343 (2018). https://doi.org/10.1007/s11280-017-0453-1

  • DOI: https://doi.org/10.1007/s11280-017-0453-1
