End-to-End Neural Matching for Semantic Location Prediction of Tweets

Authors:

Lynda Tamine

ACM Transactions on Information Systems (TOIS), Volume 39, Issue 1

Article No.: 3, Pages 1 - 35

https://doi.org/10.1145/3415149

Published: 05 September 2020

Abstract

The rapidly growing availability of social media posts has given rise to considerable research challenges. This article addresses the problem of semantic location prediction of geotagged tweets. The underlying task is to associate a social media post with the focal spatial object, if any (e.g., a Place of Interest, POI), on which it topically focuses. Although relevant to a number of applications such as POI recommendation, this problem has so far not received the attention it deserves. In previous work, the problem has mainly been tackled by means of language models that rely on costly probability estimation of word relevance across spatial regions. We propose the Spatially-aware Geotext Matching (SGM) model, which relies on a neural network learning framework. The model combines exact local word-word interaction matching signals with semantic global tweet-POI interaction matching signals. The local interactions are built over kernel spatial word distributions that reveal spatially driven word-pair similarity patterns. The global interactions capture the strength of the interaction between the tweet and the POI from both the spatial and semantic perspectives. Experimental results on two real-world datasets demonstrate the effectiveness of the proposed SGM model compared with state-of-the-art baselines, including language models and traditional neural interaction-based models.
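
To make the notion of a kernel spatial word distribution more concrete, the following is a minimal illustrative sketch, not the SGM implementation. The abstract does not specify the kernel, the bandwidth, or the similarity measure, so the Gaussian kernel, the planar treatment of coordinates, the cosine similarity, and all function names and toy coordinates here are assumptions made purely for illustration.

```python
import math

def kernel_distribution(points, grid, bandwidth=0.01):
    """Gaussian kernel density of a word's occurrences on grid cells, normalized to sum to 1.

    Assumption: coordinates are treated as planar (no great-circle distance).
    """
    densities = []
    for gx, gy in grid:
        total = 0.0
        for px, py in points:
            sq_dist = (gx - px) ** 2 + (gy - py) ** 2
            total += math.exp(-sq_dist / (2 * bandwidth ** 2))
        densities.append(total)
    norm = sum(densities)
    return [d / norm for d in densities] if norm > 0 else densities

def cosine(u, v):
    """Cosine similarity between two equally sized distributions."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy data: locations where each word was observed in geotagged posts.
grid = [(0.0, 0.0), (0.05, 0.0), (0.1, 0.0)]
coffee = kernel_distribution([(0.0, 0.0), (0.001, 0.0)], grid)
espresso = kernel_distribution([(0.002, 0.0)], grid)
stadium = kernel_distribution([(0.1, 0.001)], grid)

# Words used in the same area get similar spatial distributions.
print(cosine(coffee, espresso) > cosine(coffee, stadium))
```

In this toy setting, words that occur around the same location receive a higher spatial similarity than words that occur far apart, which is the kind of spatially driven word-pair pattern the abstract refers to.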

Index Terms

End-to-End Neural Matching for Semantic Location Prediction of Tweets
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published In

ACM Transactions on Information Systems Volume 39, Issue 1

January 2021

329 pages

ISSN:1046-8188

EISSN:1558-2868

DOI:10.1145/3423044

Editor:
Min Zhang
Tsinghua University, China

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 September 2020

Accepted: 01 August 2020

Revised: 01 July 2020

Received: 01 December 2019

Published in TOIS Volume 39, Issue 1

Qualifiers

Research-article
Research
Refereed

Funding Sources

IRIT and ATOS Intégration research

Article Metrics

Total Citations: 7
Total Downloads: 390
Downloads (Last 12 months): 14
Downloads (Last 6 weeks): 2

Reflects downloads up to 25 Feb 2025

Cited By

Xu R, Huang W, Zhao J, Chen M, Nie L (2023). A Spatial and Adversarial Representation Learning Approach for Land Use Classification with POIs. ACM Transactions on Intelligent Systems and Technology 14:6 (1-25). Online publication date: 14-Nov-2023.
https://dl.acm.org/doi/10.1145/3627824
Qiu G, Tang G, Li C, Luo L, Guo D, Shen Y (2023). Differentiated Location Privacy Protection in Mobile Communication Services: A Survey from the Semantic Perception Perspective. ACM Computing Surveys 56:3 (1-36). Online publication date: 5-Oct-2023.
https://dl.acm.org/doi/10.1145/3617589
Rajaonarivo L, Mine T (2023). Few-Shot Learning-Based Lesser-Known POI Category Estimation Based on Syntactic and Semantic Information. IEEE Access 11 (141100-141111). Online publication date: 2023.
https://doi.org/10.1109/ACCESS.2023.3327636
Rizk R, Rizk D, Rizk F, Hsu S (2023). 280 characters to the White House: predicting 2020 U.S. presidential elections from twitter data. Computational & Mathematical Organization Theory 29:4 (542-569). Online publication date: 1-Dec-2023.
https://dl.acm.org/doi/10.1007/s10588-023-09376-5
Li M, Lim K, Guo T, Liu J (2023). A Transformer-Based Framework for POI-Level Social Post Geolocation. Advances in Information Retrieval (588-604). Online publication date: 2-Apr-2023.
https://dl.acm.org/doi/10.1007/978-3-031-28244-7_37
Kuculo T (2022). Comprehensive Event Representations using Event Knowledge Graphs and Natural Language Processing. Companion Proceedings of the Web Conference 2022 (359-363). Online publication date: 25-Apr-2022.
https://dl.acm.org/doi/10.1145/3487553.3524199
Li W, Li X, Deng J, Wang Y, Guo J (2021). Sentiment based multi-index integrated scoring method to improve the accuracy of recommender system. Expert Systems with Applications 179 (115105). Online publication date: Oct-2021.
https://doi.org/10.1016/j.eswa.2021.115105
