Abstract
We propose and evaluate an unsupervised approach that identifies the location of a user based purely on that user's tweet history. We combine the location references found in a user's tweets with gazetteers such as DBPedia to identify the user's geolocation at the city level. This can support location-based personalization, such as targeted advertisements, recommendations, and other services, at a finer granularity. In this paper, we use convex hull and k-center clustering to narrow the location down to a city. The main contributions of this paper are: (i) reliance on only the contents of tweets, with no need for manual intervention or training data; (ii) a novel approach to handling ambiguous location entries; and (iii) a computational-geometry solution that narrows down the user's location from a set of points corresponding to location references. Experimental results show that the system identifies a location for each user with high accuracy within a tolerance range. We also study the effect of the tolerance on accuracy and on average error distance.
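The abstract names convex hull and k-center clustering but does not spell out how the two are combined, so the following is only a minimal sketch of the two building blocks under stated assumptions. It assumes that location references have already been resolved to (latitude, longitude) pairs, that the greedy farthest-first heuristic is an acceptable stand-in for k-center clustering, and that latitude and longitude can be treated as planar coordinates when computing the hull. The function names, the choice of k, and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def convex_hull(points):
    """Andrew's monotone chain; treats (lat, lon) as planar (an assumption)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def k_center(points, k):
    """Greedy farthest-first traversal, a 2-approximation for k-center."""
    centers = [points[0]]
    dist = [haversine_km(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        i = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[i])
        dist = [min(d, haversine_km(p, points[i]))
                for d, p in zip(dist, points)]
    return centers


def largest_cluster(points, centers):
    """Assign each point to its nearest center; return the biggest group."""
    clusters = {c: [] for c in centers}
    for p in points:
        nearest = min(centers, key=lambda c: haversine_km(p, c))
        clusters[nearest].append(p)
    return max(clusters.values(), key=len)


# Hypothetical location references: three close together, one far away.
refs = [(30.22, -92.02), (30.25, -92.10), (30.18, -91.95), (40.71, -74.00)]
dominant = largest_cluster(refs, k_center(refs, k=2))
print(convex_hull(dominant))
```

In this sketch the distant reference ends up in its own cluster, and the hull of the dominant cluster bounds the area in which a city-level estimate would be sought. How the paper actually selects a final city from such a region is not stated in the abstract.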
© 2014 Springer International Publishing Switzerland
Cite this paper
Katragadda, S., Jin, M., Raghavan, V. (2014). An Unsupervised Approach to Identify Location Based on the Content of User’s Tweet History. In: Ślȩzak, D., Schaefer, G., Vuong, S.T., Kim, YS. (eds) Active Media Technology. AMT 2014. Lecture Notes in Computer Science, vol 8610. Springer, Cham. https://doi.org/10.1007/978-3-319-09912-5_26
DOI: https://doi.org/10.1007/978-3-319-09912-5_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09911-8
Online ISBN: 978-3-319-09912-5
eBook Packages: Computer Science, Computer Science (R0)