Abstract
Twitter is a popular microblogging service with 284 million monthly active users around the world. Some of the 500 million tweets posted on Twitter every day are personal observations of the poster's immediate environment. When paired with time and location information, these observations can be treated as sensor readings for monitoring and localizing objects and events of interest. Location information on Twitter, however, is scarce: fewer than 1% of tweets have associated GPS coordinates. Existing research on Twitter location inference focuses mostly on city-level or coarser granularity and cannot provide accurate results for fine-grained locations. We propose SNAF (Sense and Focus), an event monitoring system for Twitter that emphasizes local events. The system filters personal observations posted on Twitter and infers the location of each report. Our extensive experiments with real Twitter data show that the proposed observation filtering approach achieves about a 22% improvement over existing filtering techniques, and that our location inference approach increases location accuracy by up to 36% within the 3 km error range. By aggregating observation reports with location information, our prototype event monitoring system can detect real-world events, in many cases earlier than news reports.
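To make the "accuracy within the 3 km error range" measure concrete, the following is a minimal sketch of how such a score could be computed. It is an illustration only, not the evaluation code used for SNAF: the function names `haversine_km` and `accuracy_within`, the use of great-circle (haversine) distance, and the sample coordinates are all assumptions introduced here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant for this sketch


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def accuracy_within(predicted, actual, radius_km=3.0):
    """Fraction of predictions that fall within radius_km of the true location.

    Hypothetical helper: `predicted` and `actual` are equal-length lists of
    (lat, lon) pairs, one pair per tweet.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        return 0.0
    hits = sum(
        1 for p, t in zip(predicted, actual)
        if haversine_km(p, t) <= radius_km
    )
    return hits / len(predicted)


if __name__ == "__main__":
    # Made-up coordinates: the first prediction is about 1.1 km from the truth,
    # the second is about 111 km away.
    predicted = [(0.0, 0.0), (0.0, 0.0)]
    actual = [(0.0, 0.01), (0.0, 1.0)]
    print(accuracy_within(predicted, actual, radius_km=3.0))
```

Under this definition, an improvement in accuracy at the 3 km range means that a larger share of inferred locations land within 3 km of the ground-truth GPS coordinates.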








Notes
As $(1 - 0.009)^{10} \approx 0.91$
Zhang, Y., Szabo, C., Sheng, Q.Z. et al. SNAF: Observation filtering and location inference for event monitoring on twitter. World Wide Web 21, 311–343 (2018). https://doi.org/10.1007/s11280-017-0453-1
DOI: https://doi.org/10.1007/s11280-017-0453-1