Abstract
Timely, high-resolution estimates of the home locations of a sufficiently large subset of the population are critical for effective disaster response and public health intervention, yet obtaining them remains an open problem. Conventional data sources, such as censuses and surveys, have a substantial time lag and cannot capture seasonal trends. Recently, social media data have been used to address this problem because of their large user base and real-time nature. However, inherent sparsity and noise, along with large uncertainty in the estimated home locations, have limited their effectiveness. Consequently, much previous research has targeted only a coarse spatial resolution, and high-resolution methods have achieved limited accuracy. In this paper, we develop a consensus deep-learning solution that uses two deep neural networks to deal with sparse and noisy social media data. In the first step, high accuracy is achieved by implementing a deep neural network with more balanced home location candidates, using batch normalization, and duplicating home location records. We obtain over 92% accuracy for large subsets on a commonly used dataset. Compared with other high-resolution methods, our approach reduces the high-resolution home prediction error from over 21% to less than 8%, an error reduction of up to 60%. Systematic comparisons show that our method gives the highest accuracy both for the entire sample and for subsets. Evaluation on a real-world public health problem further validates the effectiveness of our approach.
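The abstract does not spell out how the outputs of the two networks are combined. As a minimal sketch only, the following Python example assumes one plausible consensus rule: a user's predicted home location is kept only when both models select the same candidate, which trades coverage (the size of the retained subset) for accuracy on that subset. The function names, the score lists, and the agreement rule itself are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Optional, Sequence, Tuple


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the highest score (first index wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def consensus_predict(
    scores_a: Sequence[Sequence[float]],
    scores_b: Sequence[Sequence[float]],
) -> List[Optional[int]]:
    """Assumed consensus rule: keep a prediction only if both models agree.

    Each row holds one user's scores over that user's home location
    candidates. A user with disagreeing models gets None (no prediction).
    """
    predictions: List[Optional[int]] = []
    for row_a, row_b in zip(scores_a, scores_b):
        pick_a, pick_b = argmax(row_a), argmax(row_b)
        predictions.append(pick_a if pick_a == pick_b else None)
    return predictions


def subset_accuracy(
    predictions: Sequence[Optional[int]],
    truth: Sequence[int],
) -> Tuple[float, float]:
    """Return (coverage, accuracy on the retained subset)."""
    kept = [(p, t) for p, t in zip(predictions, truth) if p is not None]
    if not kept:
        return 0.0, 0.0
    coverage = len(kept) / len(predictions)
    accuracy = sum(p == t for p, t in kept) / len(kept)
    return coverage, accuracy


# Hypothetical scores for four users with three candidates each.
scores_a = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.4, 0.5, 0.1], [0.2, 0.2, 0.6]]
scores_b = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
truth = [0, 1, 0, 1]

predictions = consensus_predict(scores_a, scores_b)
coverage, accuracy = subset_accuracy(predictions, truth)
print(predictions, coverage, accuracy)
```

Under this assumed rule, the third user is dropped because the two models disagree, so accuracy is reported on the remaining subset rather than on the entire sample.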
Cite this article
Ghaffari, M., Srinivasan, A., Liu, X. et al. High-resolution home location prediction from Twitter activities using consensus deep learning. Soc. Netw. Anal. Min. 11, 95 (2021). https://doi.org/10.1007/s13278-021-00808-1
DOI: https://doi.org/10.1007/s13278-021-00808-1