Location estimation of non-geo-tagged tweets

  • Special Issue
Evolutionary Intelligence

Abstract

Internet users increasingly depend on the Internet for information about their daily lives. Most users are connected to one another through social networks. Social networking sites help users not only to connect and talk with each other but also to share information. Twitter [1] users attach location information to a post, or tweet, to show their presence at that location. However, not all users tag their posts with location information. A person who wants the latest updates about an event has to go through all the tweets about that event, which is impractical because nearly 500 million tweets are posted on Twitter every day. A tweet can contain up to 140 characters. Moreover, the tweets that originate from the location of an event are the most recent and contain new facts, while the remaining tweets only repeat that information. Traditional systems discard non-geo-tagged tweets. This paper presents a method for tagging non-geo-tagged tweets with a location, so that the user can obtain the latest information with the newly tagged tweets included. The proposed method performs better than previous methods.
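
As an illustration of the distinction the abstract draws, the following minimal Python sketch separates geo-tagged from non-geo-tagged tweets. It assumes tweets are available as dictionaries shaped like the Tweet object described in [1], where an exact location appears as a GeoJSON Point under the `coordinates` key. The helper names and the sample tweets are hypothetical and are not taken from the paper.

```python
from typing import Any, Dict, List, Tuple

Tweet = Dict[str, Any]


def is_geo_tagged(tweet: Tweet) -> bool:
    """Return True if the tweet carries an exact point location.

    Assumption: the location, when present, is a GeoJSON Point stored
    under the "coordinates" key, as in the Tweet object of [1].
    """
    coords = tweet.get("coordinates")
    return isinstance(coords, dict) and coords.get("type") == "Point"


def split_by_geotag(tweets: List[Tweet]) -> Tuple[List[Tweet], List[Tweet]]:
    """Split tweets into (geo-tagged, non-geo-tagged) lists."""
    tagged = [t for t in tweets if is_geo_tagged(t)]
    untagged = [t for t in tweets if not is_geo_tagged(t)]
    return tagged, untagged


# Made-up sample tweets, for illustration only.
sample = [
    {
        "id": 1,
        "text": "Water is rising near the bridge",
        "coordinates": {"type": "Point", "coordinates": [77.67, 27.49]},
    },
    {"id": 2, "text": "Hearing about the flood", "coordinates": None},
]

tagged, untagged = split_by_geotag(sample)
```

The second list is the kind of input a location estimation method would work on. How the paper estimates the location is not shown here.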

References

  1. Twitter (2017) Tweet object—Twitter developers. https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object

  2. Han B, Cook P, Baldwin T (2012) Geolocation prediction in social media data by finding location indicative words. Proc COLING 2012:1045–1062

  3. Twitter Usage Statistics (2018). http://www.internetlivestats.com/twitter-statistics/#ref-3

  4. Yamaguchi Y, Amagasa T, Kitagawa H, Ikawa Y (2014) Online user location inference exploiting spatiotemporal correlations in social streams. In: Proceedings of the 23rd ACM international conference on conference on information and knowledge management. ACM, pp 1139–1148

  5. Khan MAH, Bollegala D, Liu G, Sezaki K (2013) Multi-tweet summarization of real-time events. In: 2013 international conference on social computing (SocialCom). IEEE, pp 128–133

  6. Mansouri T, Ravasan AZ, Gholamian MR (2014) A novel hybrid algorithm based on k-means and evolutionary computations for real time clustering. Int J Data Warehous Min (IJDWM) 10(3):1–14

  7. Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. In: Advances in neural information processing systems. ACM, pp 556–562

  8. Kleinberg J (2003) Bursty and hierarchical structure in streams. Data Min Knowl Discov 7(4):373–397

  9. Peetz M-H, Meij E, de Rijke M, Weerkamp W (2012) Adaptive temporal query modeling. In: Advances in information retrieval. Springer, pp 455–458

  10. Lavrenko V, Allan J, DeGuzman E, LaFlamme D, Pollard V, Thomas S (2002) Relevance models for topic detection and tracking. In: Proceedings of the second international conference on human language technology research. Morgan Kaufmann Publishers Inc., pp 115–121

  11. Li X, Croft WB (2003) Time-based language models. In: Proceedings of the twelfth international conference on information and knowledge management. ACM, pp 469–475

  12. Kaleel SB, Abhari A (2015) Cluster-discovery of Twitter messages for event detection and trending. J Comput Sci 6:47–57

  13. Diaz F, Jones R (2004) Using temporal profiles of queries for precision prediction. In: Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval. ACM, pp 18–24

  14. Eisenstein J, O’Connor B, Smith NA, Xing EP (2010) A latent variable model for geographic lexical variation. In: Proceedings of the 2010 conference on empirical methods in natural language processing. Association for Computational Linguistics, pp 1277–1287

  15. De Vries CM, Geva S, Trotman A (2012) Document clustering evaluation: divergence from a random baseline. CoRR. arXiv:1208.5654

  16. Dakka W, Gravano L, Ipeirotis PG (2012) Answering general time-sensitive queries. Knowl Data Eng IEEE Trans 24(2):220–235

  17. Eisenstein J, O’Connor B, Smith NA, Xing EP (2010) A latent variable model for geographic lexical variation. In: Proceedings of the 2010 conference on empirical methods in natural language processing, EMNLP 2010. ACL, pp 1277–1287

  18. Keikha M, Gerani S, Crestani F (2011) Time-based relevance models. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval. ACM, pp 1087–1088

  19. Doulamis ND, Doulamis AD, Kokkinos P, Varvarigos E (2018) Event detection in twitter microblogging. IEEE Trans Cybern 46(12):2810–2824

  20. Jones R, Diaz F (2007) Temporal profiles of queries. ACM Trans Inf Syst (TOIS) 25(3):14

  21. Kwak H, Lee C, Park H, Moon S (2010) What is twitter, a social network or a news media? In: Proceedings of the 19th international conference on World Wide Web. ACM, pp 591–600

  22. Kumar R, Mahadevan U, Sivakumar D (2004) A graph-theoretic approach to extract storylines from search results. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 216–225

  23. Lavrenko V, Croft WB (2001) Relevance based language models. In: Proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval. ACM, pp 120–127

  24. Zheng X, Han J, Sun A (2018) A survey of location prediction on Twitter. CoRR. https://doi.org/10.1109/TKDE.2018.2807840

  25. Sakai T, Tamura K (2014) Identifying bursty areas of emergency topics in geotagged tweets using density-based spatiotemporal clustering algorithm. In: 2014 IEEE 7th international workshop on computational intelligence and applications (IWCIA). IEEE, pp 95–100

  26. Vincenty T (1975) Direct and inverse solutions of geodesics on the ellipsoid with application of nested equations. Surv Rev 23(176):88–93

  27. Sugitani T, Shirakawa M, Hara T, Nishio S (2013) Detecting local events by analyzing spatiotemporal locality of tweets. In: 2013 27th international conference on advanced information networking and applications workshops (WAINA). IEEE, pp 191–196

  28. Li J, Li L, Li T (2011) MSSF: a multi-document summarization framework based on submodularity. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval. ACM, pp 1247–1248

  29. Pelleg D, Moore AW et al (2000) X-means: extending k-means with efficient estimation of the number of clusters. In: ICML, vol 1. ACM

  30. Yih W-t, Goodman J, Vanderwende L, Suzuki H (2007) Multi-document summarization by maximizing informative content-words. In: IJCAI, vol 7. ACM, pp 1776–1782

  31. Hawking D, Jones T (2012) Reordering an index to speed query processing without loss of effectiveness. In: Proceedings of the seventeenth Australasian document computing symposium. ACM, pp 17–24

  32. Internet and Mobile Association of India (2018). http://www.iamai.in/

  33. Global Positioning System (2018). https://www.gps.gov/

  34. Sloan L, Morgan J (2015) Who tweets with their location? Understanding the relationship between demographic characteristics and the use of geoservices and geotagging on twitter. PLoS One 10(11):e0142209

  35. Efron M, Golovchinsky G (2011) Estimation methods for ranking recent information. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval. ACM, pp 495–504

  36. Richtarik P, Takac M (2016) Parallel coordinate descent methods for big data optimization. Math Program 156(1):433–484

  37. Coordinate Descent (2018). https://en.wikipedia.org/wiki/Coordinate_descent

  38. Adams B, Janowicz K (2012) On the geo-indicativeness of non-georeferenced text. In: ICWSM. AAAI, pp 375–378

  39. Compton R, Jurgens D, Allen D (2014) Geotagging one hundred million twitter accounts with total variation minimization. In: 2014 IEEE international conference on Big Data (Big Data). IEEE, pp 393–401

  40. Middleton SE, Middleton L, Modafferi S (2014) Real-time crisis mapping of natural disasters using social media. IEEE Intell Syst 29(2):9–17

  41. Maeda TN, Yoshida M, Toriumi F, Ohashi H (2016) Decision tree analysis of tourists’ preferences regarding tourist attractions using geotag data from social media. In: Proceedings of the second international conference on IoT in urban space. ACM, pp 61–64

Author information

Corresponding author

Correspondence to Avinash Samuel.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Samuel, A., Sharma, D.K. Location estimation of non-geo-tagged tweets. Evol. Intel. 14, 205–216 (2021). https://doi.org/10.1007/s12065-018-0163-3

  • DOI: https://doi.org/10.1007/s12065-018-0163-3
