End-to-End Neural Matching for Semantic Location Prediction of Tweets

Published: 05 September 2020

Abstract

The rapidly growing availability of social media posts has given rise to considerable research challenges. This article addresses the problem of semantic location prediction of geotagged tweets. The underlying task is to associate a social media post with the focal spatial object, if any (e.g., a place of interest, POI), that it topically focuses on. Although relevant to a number of applications such as POI recommendation, this problem has so far not received the attention it deserves. In previous work, it has mainly been tackled by means of language models that rely on costly probability estimation of word relevance across spatial regions. We propose the Spatially-aware Geotext Matching (SGM) model, which relies on a neural network learning framework. The model combines exact local word-word interaction matching signals with semantic global tweet-POI interaction matching signals. The local interactions are built over kernel spatial word distributions that reveal spatially driven word-pair similarity patterns. The global interactions capture the strength of the interaction between the tweet and the POI from both the spatial and the semantic perspectives. Experimental results on two real-world datasets demonstrate the effectiveness of the proposed SGM model compared to state-of-the-art baselines, including language models and traditional neural interaction-based models.
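As a rough illustration of the general idea of combining word-level (local) matching signals with a spatial signal between a tweet and a POI, the sketch below scores a tweet-POI pair in plain Python. It is not the SGM model from the article: the function names, the max-pooled cosine interaction, the Gaussian distance kernel, the `bandwidth_km` parameter, and the fixed mixing weight `alpha` are all assumptions made for illustration, whereas the article describes a neural network that learns how to combine such signals.

```python
import math

def cosine(u, v):
    """Cosine similarity between two word vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def local_interactions(tweet_vecs, poi_vecs):
    """Word-word interaction matrix: one row per tweet word, one column per POI word."""
    return [[cosine(t, p) for p in poi_vecs] for t in tweet_vecs]

def spatial_kernel(tweet_loc, poi_loc, bandwidth_km=1.0):
    """Gaussian kernel over the tweet-POI distance (hypothetical bandwidth)."""
    d = haversine_km(tweet_loc, poi_loc)
    return math.exp(-(d * d) / (2 * bandwidth_km ** 2))

def match_score(tweet_vecs, tweet_loc, poi_vecs, poi_loc, alpha=0.5):
    """Fixed-weight mix of a pooled local signal and a spatial signal.

    Assumption: alpha is hand-set here; a learned model would fit it from data.
    """
    matrix = local_interactions(tweet_vecs, poi_vecs)
    # For each tweet word keep its best-matching POI word, then average.
    local = sum(max(row) for row in matrix) / len(matrix)
    return alpha * local + (1 - alpha) * spatial_kernel(tweet_loc, poi_loc)

# Toy 2-d "embeddings" and coordinates, invented for the example.
tweet = [[1.0, 0.0], [0.0, 1.0]]
near_poi = match_score(tweet, (43.60, 1.44), [[1.0, 0.0], [0.0, 1.0]], (43.60, 1.44))
far_poi = match_score(tweet, (43.60, 1.44), [[1.0, 0.0], [0.0, 1.0]], (48.85, 2.35))
```

With identical toy vectors, the POI located at the tweet's own coordinates receives a higher score than the distant one, because only the spatial term differs between the two calls.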

    Published In

    ACM Transactions on Information Systems, Volume 39, Issue 1
    January 2021
    329 pages
    ISSN:1046-8188
    EISSN:1558-2868
    DOI:10.1145/3423044
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 05 September 2020
    Accepted: 01 August 2020
    Revised: 01 July 2020
    Received: 01 December 2019
    Published in TOIS Volume 39, Issue 1

    Author Tags

    1. Semantic location prediction
    2. neural text matching
    3. point of interest
    4. tweet

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • IRIT and ATOS Intégration research

    Cited By

    • (2023) A Spatial and Adversarial Representation Learning Approach for Land Use Classification with POIs. ACM Transactions on Intelligent Systems and Technology 14:6 (1-25). DOI: 10.1145/3627824. Online publication date: 14-Nov-2023
    • (2023) Differentiated Location Privacy Protection in Mobile Communication Services: A Survey from the Semantic Perception Perspective. ACM Computing Surveys 56:3 (1-36). DOI: 10.1145/3617589. Online publication date: 5-Oct-2023
    • (2023) Few-Shot Learning-Based Lesser-Known POI Category Estimation Based on Syntactic and Semantic Information. IEEE Access 11 (141100-141111). DOI: 10.1109/ACCESS.2023.3327636. Online publication date: 2023
    • (2023) 280 characters to the White House: predicting 2020 U.S. presidential elections from twitter data. Computational & Mathematical Organization Theory 29:4 (542-569). DOI: 10.1007/s10588-023-09376-5. Online publication date: 1-Dec-2023
    • (2023) A Transformer-Based Framework for POI-Level Social Post Geolocation. Advances in Information Retrieval (588-604). DOI: 10.1007/978-3-031-28244-7_37. Online publication date: 2-Apr-2023
    • (2022) Comprehensive Event Representations using Event Knowledge Graphs and Natural Language Processing. Companion Proceedings of the Web Conference 2022 (359-363). DOI: 10.1145/3487553.3524199. Online publication date: 25-Apr-2022
    • (2021) Sentiment based multi-index integrated scoring method to improve the accuracy of recommender system. Expert Systems with Applications 179 (115105). DOI: 10.1016/j.eswa.2021.115105. Online publication date: Oct-2021
