Abstract
Social media provides a vast amount of data about users and their societal interactions, which offers computer scientists, among others, many new opportunities for research. Arguably, one of the most interesting areas of work is predicting events and developments from social media data and trends. This has recently been done in politics, finance, entertainment, market demand, health, and many other areas. There has also been considerable interest in predicting a user's location from their online activity, given that a large amount of online social interaction takes place behind usernames and anonymous titles. This area of research is known as geolocation inference. In this paper, we propose a novel model for geolocation inference of social media users with the aid of a discrete event: the Solar Eclipse of 2017. Using the path of the eclipse and the timing of its travel to infer a user's location is an approach unique to this paper. We apply this model to Twitter data gathered from users during the Solar Eclipse of 2017 and attempt to determine whether certain features of the data are indicative of users viewing the eclipse or similar events. Taking advantage of Stanford's natural language processing software, we also consider the proportions and existence of many words, part-of-speech tags, and relations between users found in our sample data, in an attempt to find key features of users who are viewing the eclipse. We present the results obtained with our model and conclude by discussing its strengths and weaknesses and the resulting potential for future work.
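The abstract mentions using the proportions and existence of part-of-speech tags as features. The following is a minimal sketch of what such a feature computation might look like, not the authors' implementation. The function name `tag_features`, the input format (a list of `(word, tag)` pairs for one tweet), and the hand-tagged example are all assumptions made for illustration; in practice the tags would come from a tagger such as the Stanford software the abstract refers to.

```python
from collections import Counter


def tag_features(tagged_tokens, tags_of_interest):
    """Return proportion and presence features for each tag of interest.

    tagged_tokens: list of (word, pos_tag) pairs for one tweet
    (hypothetical input format, assumed for this sketch).
    tags_of_interest: list of tag strings to build features for.
    """
    counts = Counter(tag for _, tag in tagged_tokens)
    total = len(tagged_tokens)
    features = {}
    for tag in tags_of_interest:
        n = counts.get(tag, 0)
        # Proportion of tokens carrying this tag (0.0 for an empty tweet).
        features[f"prop_{tag}"] = n / total if total else 0.0
        # Existence flag: does the tag occur at all?
        features[f"has_{tag}"] = n > 0
    return features


# Hand-tagged example tweet with Penn Treebank style tags; illustrative only.
example = [("Watching", "VBG"), ("the", "DT"), ("eclipse", "NN"), ("now", "RB")]
feats = tag_features(example, ["NN", "VBG", "JJ"])
```

Features of this kind could then be compared between users presumed to be viewing the eclipse and those who are not, which is the sort of comparison the abstract describes.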
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Clarkson, K., Srivastava, G., Meawad, F., Dwivedi, A.D. (2019). Where’s @Waldo?: Finding Users on Twitter. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J. (eds) Artificial Intelligence and Soft Computing. ICAISC 2019. Lecture Notes in Computer Science, vol 11509. Springer, Cham. https://doi.org/10.1007/978-3-030-20915-5_31
DOI: https://doi.org/10.1007/978-3-030-20915-5_31
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20914-8
Online ISBN: 978-3-030-20915-5
eBook Packages: Computer Science, Computer Science (R0)