Abstract
Modeling social media activity has numerous practical applications, such as helping analyze strategic information operations, designing intervention techniques to mitigate disinformation, and delivering critical information during disaster relief operations. In this paper, we propose a modeling technique that forecasts the topic-specific daily volume of social media activity by multiplexing different exogenous signals, such as news reports and armed conflict records, with endogenous data from the social media platform we model. To this end, we train a collection of LSTM models, each leveraging a different exogenous source, and dynamically select one model for each topic. Empirical evaluations on real datasets from two social media platforms and two different contexts, each composed of multiple interrelated topics, demonstrate the effectiveness of our solution.
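The per-topic selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so this sketch assumes the model with the lowest mean absolute error on a recent validation window is chosen for each topic. The function names, the source labels (`news`, `conflict_records`), and the toy numbers are all hypothetical, and plain lists of forecasts stand in for the outputs of the trained LSTM models.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute gap between two equal-length daily-volume series."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def select_model_per_topic(validation_actuals, validation_forecasts):
    """Pick, for every topic, the source whose model had the lowest validation error.

    validation_actuals:   {topic: [observed daily volumes]}
    validation_forecasts: {source: {topic: [forecast daily volumes]}}
    The lowest-MAE criterion is an assumption, not the paper's stated rule.
    """
    selection = {}
    for topic, actual in validation_actuals.items():
        errors = {
            source: mean_absolute_error(actual, forecasts[topic])
            for source, forecasts in validation_forecasts.items()
        }
        selection[topic] = min(errors, key=errors.get)
    return selection


# Hypothetical toy data: two topics, two models keyed by exogenous source.
validation_actuals = {"topic_a": [10, 12, 15], "topic_b": [3, 4, 2]}
validation_forecasts = {
    "news": {"topic_a": [11, 12, 14], "topic_b": [6, 7, 5]},
    "conflict_records": {"topic_a": [5, 6, 7], "topic_b": [3, 5, 2]},
}

selected = select_model_per_topic(validation_actuals, validation_forecasts)
print(selected)
```

Re-running the selection as each new validation window becomes available is one way such a choice could be made dynamic; other criteria, such as a different error metric or a learned selector, would fit the same interface.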
Acknowledgements
This work is supported by the DARPA SocialSim Program and the Air Force Research Laboratory under contract FA8650-18-C-7825. The authors would like to thank Leidos for providing data.
Funding
This work was funded by the DARPA SocialSim Program and the Air Force Research Laboratory under contract FA8650-18-C-7825.
Author information
Contributions
All authors contributed equally to the final manuscript.
Ethics declarations
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Conflict of interest
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A. Iamnitchi: Work done while at University of South Florida.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ng, K.W., Horawalavithana, S. & Iamnitchi, A. Social media activity forecasting with exogenous and endogenous signals. Soc. Netw. Anal. Min. 12, 102 (2022). https://doi.org/10.1007/s13278-022-00927-3
DOI: https://doi.org/10.1007/s13278-022-00927-3