Abstract
Internet users increasingly depend on the Internet for information about their daily lives, and most of them are connected to each other through social networks. Social networking sites help users not only to connect and talk to each other but also to share information. Twitter [1] users can attach location information to a post, or tweet, to show their presence at a location. However, not all users tag or embed location information in their posts. A person who wants the latest updates about an event has to go through all the tweets about that event, which is impractical because nearly 500 million tweets are posted on Twitter daily. Each tweet can contain up to 140 characters. Moreover, tweets that originate from the location of the event tend to be the most recent and contain new facts, while the remaining tweets mostly repeat that information. Traditional systems discard non-geo-tagged tweets. This paper presents a method to tag non-geo-tagged tweets with a location, so that the user can obtain the latest information by including the newly tagged tweets. The proposed method yields better results than previous methods.
References
Twitter (2017) Tweet object—Twitter developers. https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object
Han B, Cook P, Baldwin T (2012) Geolocation prediction in social media data by finding location indicative words. Proc COLING 2012:1045–1062
Twitter Usage Statistics (2018). http://www.internetlivestats.com/twitter-statistics/#ref-3
Yamaguchi Y, Amagasa T, Kitagawa H, Ikawa Y (2014) Online user location inference exploiting spatiotemporal correlations in social streams. In: Proceedings of the 23rd ACM international conference on conference on information and knowledge management. ACM, pp 1139–1148
Khan MAH, Bollegala D, Liu G, Sezaki K (2013) Multi-tweet summarization of real-time events. In: 2013 international conference on social computing (SocialCom). IEEE, pp 128–133
Mansouri T, Ravasan AZ, Gholamian MR (2014) A novel hybrid algorithm based on k-means and evolutionary computations for real time clustering. Int J Data Warehous Min (IJDWM) 10(3):1–14
Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. In: Advances in neural information processing systems. ACM, pp 556–562
Kleinberg J (2003) Bursty and hierarchical structure in streams. Data Min Knowl Discov 7(4):373–397
Peetz M-H, Meij E, de Rijke M, Weerkamp W (2012) Adaptive temporal query modeling. In: Advances in information retrieval. Springer, pp 455–458
Lavrenko V, Allan J, DeGuzman E, LaFlamme D, Pollard V, Thomas S (2002) Relevance models for topic detection and tracking. In: Proceedings of the second international conference on human language technology research. Morgan Kaufmann Publishers Inc., pp 115–121
Li X, Croft WB (2003) Time-based language models. In: Proceedings of the twelfth international conference on information and knowledge management. ACM, pp 469–475
Kaleel SB, Abhari A (2015) Cluster-discovery of Twitter messages for event detection and trending. J Comput Sci 6:47–57
Diaz F, Jones R (2004) Using temporal profiles of queries for precision prediction. In: Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval. ACM, pp 18–24
Eisenstein J, O’Connor B, Smith NA, Xing EP (2010) A latent variable model for geographic lexical variation. In: Proceedings of the 2010 conference on empirical methods in natural language processing. Association for Computational Linguistics, pp 1277–1287
De Vries CM, Geva S, Trotman A (2012) Document clustering evaluation: divergence from a random baseline. CoRR. arXiv:1208.5654
Dakka W, Gravano L, Ipeirotis PG (2012) Answering general time-sensitive queries. Knowl Data Eng IEEE Trans 24(2):220–235
Eisenstein J, O’Connor B, Smith NA, Xing EP (2010) A latent variable model for geographic lexical variation. In: Proceedings of the 2010 conference on empirical methods in natural language processing, EMNLP 2010. ACL, pp 1277–1287
Keikha M, Gerani S, Crestani F (2011) Time-based relevance models. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval. ACM, pp 1087–1088
Doulamis ND, Doulamis AD, Kokkinos P, Varvarigos E (2018) Event detection in twitter microblogging. IEEE Trans Cybern 46(12):2810–2824
Jones R, Diaz F (2007) Temporal profiles of queries. ACM Trans Inf Syst (TOIS) 25(3):14
Kwak H, Lee C, Park H, Moon S (2010) What is twitter, a social network or a news media? In: Proceedings of the 19th international conference on World Wide Web. ACM, pp 591–600
Kumar R, Mahadevan U, Sivakumar D (2004) A graph-theoretic approach to extract storylines from search results. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 216–225
Lavrenko V, Croft WB (2001) Relevance based language models. In: Proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval. ACM, pp 120–127
Zheng X, Han J, Sun A (2018) A survey of location prediction on Twitter. CoRR. https://doi.org/10.1109/TKDE.2018.2807840
Sakai T, Tamura K (2014) Identifying bursty areas of emergency topics in geotagged tweets using density-based spatiotemporal clustering algorithm. In: 2014 IEEE 7th international workshop on computational intelligence and applications (IWCIA). IEEE, pp 95–100
Vincenty T (1975) Direct and inverse solutions of geodesics on the ellipsoid with application of nested equations. Surv Rev 23(176):88–93
Sugitani T, Shirakawa M, Hara T, Nishio S (2013) Detecting local events by analyzing spatiotemporal locality of tweets. In: 2013 27th international conference on advanced information networking and applications workshops (WAINA). IEEE, pp 191–196
Li J, Li L, Li T (2011) MSSF: a multi-document summarization framework based on submodularity. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval. ACM, pp 1247–1248
Pelleg D, Moore AW et al (2000) X-means: extending k-means with efficient estimation of the number of clusters. In: ICML, vol 1. ACM
Yih W-t, Goodman J, Vanderwende L, Suzuki H (2007) Multi-document summarization by maximizing informative content-words. In: IJCAI, vol 7. ACM, pp 1776–1782
Hawking D, Jones T (2012) Reordering an index to speed query processing without loss of effectiveness. In: Proceedings of the seventeenth Australasian document computing symposium. ACM, pp 17–24
Internet and Mobile Association of India (2018). http://www.iamai.in/
Global Positioning System (2018). https://www.gps.gov/
Sloan L, Morgan J (2015) Who tweets with their location? Understanding the relationship between demographic characteristics and the use of geoservices and geotagging on twitter. PLoS One 10(11):e0142209
Efron M, Golovchinsky G (2011) Estimation methods for ranking recent information. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval. ACM, pp 495–504
Richtarik P, Takac M (2016) Parallel coordinate descent methods for big data optimization. Math Program 156(1):433–484
Coordinate Descent (2018). https://en.wikipedia.org/wiki/Coordinate_descent
Adams B, Janowicz K (2012) On the geo-indicativeness of non-georeferenced text. In: ICWSM. AAAI, pp 375–378
Compton R, Jurgens D, Allen D (2014) Geotagging one hundred million twitter accounts with total variation minimization. In: 2014 IEEE international conference on Big Data (Big Data). IEEE, pp 393–401
Middleton SE, Middleton L, Modafferi S (2014) Real-time crisis mapping of natural disasters using social media. IEEE Intell Syst 29(2):9–17
Maeda TN, Yoshida M, Toriumi F, Ohashi H (2016) Decision tree analysis of tourists’ preferences regarding tourist attractions using geotag data from social media. In: Proceedings of the second international conference on IoT in urban space. ACM, pp 61–64
Cite this article
Samuel, A., Sharma, D.K. Location estimation of non-geo-tagged tweets. Evol. Intel. 14, 205–216 (2021). https://doi.org/10.1007/s12065-018-0163-3
DOI: https://doi.org/10.1007/s12065-018-0163-3