Abstract
Gasoline shortages are common at the onset of forecast disasters such as hurricanes. Predicting a shortage in advance can guide agencies in pushing supplies to the right regions and mitigating it. We demonstrate how to incorporate social media data into gasoline supply decision making by developing a systematic approach that examines social media posts such as tweets to sense future gasoline shortages. The approach is a four-stage shortage prediction methodology. In the first stage, we extract tweets related to gasoline. In the second stage, we use an SVM-based classifier to identify tweets about gasoline shortage, using unigrams and topics identified by topic modeling techniques as features. In the third stage, we predict the number of future tweets about gasoline shortage using a hybrid loss function built to combine ARIMA and Poisson regression methods. In the fourth stage, we employ Poisson regression to predict shortage from the number of tweets predicted in the third stage. To validate the methodology, we develop a case study that predicts gasoline shortage from tweets generated in Florida during the onset of Hurricane Irma and after its landfall. We compare the predictions with the ground truth about gasoline shortage during Irma and find them accurate under commonly used error measures.
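As a rough illustration of the first and fourth stages only, the following Python sketch filters tweets by keyword and fits a one-covariate Poisson regression with a log link by Newton-Raphson. It is a minimal sketch under stated assumptions, not the authors' implementation: the keyword list, the function names, and the choice of a single covariate are all hypothetical. The second and third stages are omitted because the abstract does not specify the classifier features or the hybrid loss in enough detail to reproduce them.

```python
import math

# Hypothetical keyword list (an assumption; the paper's filter terms are not listed here).
GAS_KEYWORDS = {"gas", "gasoline", "fuel"}


def is_gasoline_related(tweet):
    """Stage 1 stand-in: keep a tweet if any token matches a gasoline keyword."""
    tokens = (w.strip(".,!?#@:;\"'") for w in tweet.lower().split())
    return any(t in GAS_KEYWORDS for t in tokens)


def fit_poisson(x, y, iters=50, tol=1e-10):
    """Stage 4 stand-in: fit log E[y] = b0 + b1 * x by Newton-Raphson.

    x: predicted shortage-tweet counts per period (not all equal; ideally
       rescaled to a small range so exp() stays well behaved)
    y: observed shortage measure per period (non-negative, positive sum)
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H d = g in closed form.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def predict_shortage(b0, b1, tweets):
    """Expected shortage level for a given (predicted) tweet count."""
    return math.exp(b0 + b1 * tweets)
```

In use, the tweets passing `is_gasoline_related` would go to the stage-two classifier, and the stage-three forecast of shortage-tweet counts would be the `tweets` argument to `predict_shortage`. A plain Newton step is used here for brevity; a production fit would more likely rely on an established GLM routine.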
Acknowledgements
The authors would like to thank two anonymous referees who provided detailed comments that significantly enhanced our paper.
Funding
Funding was provided by National Science Foundation (Grant No. 1663101).
About this article
Cite this article
Khare, A., He, Q. & Batta, R. Predicting gasoline shortage during disasters using social media. OR Spectrum 42, 693–726 (2020). https://doi.org/10.1007/s00291-019-00559-8
DOI: https://doi.org/10.1007/s00291-019-00559-8