Research Article
Social media crowdsourcing for rapid damage assessment following a sudden-onset natural hazard event

https://doi.org/10.1016/j.ijinfomgt.2021.102378

Highlights

  • Develop a crowdsourcing approach using Twitter data for rapid earthquake damage assessment.

  • Build text classification models to parse the damage levels adapted from the MMI Scale.

  • Create an application-specific library for earthquake damage assessment.

  • The time of saturation (convergence) for post-event damage assessment was 14–16 h.

  • The social media derived geographic damage distribution appears consistent with the USGS MMI “Did You Feel It” map.

Abstract

Rapid appraisal of damages related to hazard events is of importance to first responders, government agencies, insurance industries, and other private and public organizations. While satellite monitoring, ground-based sensor systems, inspections and other technologies provide data to inform post-disaster response, crowdsourcing through social media is an additional and novel data source. In this study, the use of social media data, principally Twitter postings, is investigated to make approximate but rapid early assessments of damages following a disaster. The goal is to explore the potential utility of using social media data for rapid damage assessment after sudden-onset hazard events and to identify insights related to potential challenges. This study defines a text-based damage assessment scale for earthquake damages, and then develops a text classification model for rapid damage assessment. Although the accuracy remains a challenge compared to ground-based instrumental readings and inspections, the proposed damage assessment model features rapidity with large amounts of data at spatial densities that exceed those of conventional sensor networks. The 2019 Ridgecrest, California earthquake sequence is investigated as a case study.

Introduction

Natural disasters are costly (Coronese, Lamperti, Keller, Chiaromonte, & Roventini, 2019). The rapid appraisal of damages related to natural hazards is essential to event response and recovery efforts. Post-event environments are characterized by incomplete and rapidly evolving information (Rouhanizadeh, Kermanshachi, & Nipa, 2020). As a result, a significant hindrance to emergency response and efficient event recovery is a lack of knowledge of the spatial extent and magnitude of damage. While certain events (e.g., hurricanes) may be forecasted, other hazards (e.g., earthquakes) occur with little or no warning (Vieweg, Castillo, & Imran, 2014). These sudden-onset events present a particular challenge due to the inability to pre-deploy response and recovery resources.

First responders, government agencies, and private sector entities need information to support effective response and decision-making. While satellite monitoring, ground-based sensors, and other modern technologies have provided benefits in tracking post-disaster conditions (Haq, Akhtar, Muhammad, Paras, & Rahmatullah, 2012; Monfort, Negulescu, & Belvaux, 2019; Wang, Qu, Hao, Liu, & Stanturf, 2010), crowdsourcing through social media presents an additional and novel source of such information (Simon, Goldberg, & Adini, 2015). While social media is an inherently imperfect information source, it provides rapid and geographically distributed information that complements that from other sources. Specifically, social media data are available in large quantities and at spatial densities that may exceed conventional sensor networks. Moreover, social media data can be rapidly collected in near-real-time, without the time required to deploy reconnaissance technologies or personnel. However, they are of relatively low fidelity compared to data from conventional sensors, aerial imagery, or the observations of trained inspection teams.

The goal of this study is to explore the utility of social media data to provide rapid indications of damage following sudden-onset natural hazard events. Multiple studies have demonstrated the potential of applying social media for rapid damage assessment. These studies use the intensity of social media activities or the sentiment level as a metric to indicate the extent of damage in the affected areas rather than quantitatively parsing the damage levels (Resch, Usländer, & Havas, 2018; Wu & Cui, 2018; Yuan & Liu, 2020). Moreover, there has been minimal focus on leveraging text classification methods to estimate damage levels with social media data in the earthquake context. To fill this research gap, this study investigates the use of Twitter™ postings and builds text classification models based on the Modified Mercalli Intensity (MMI) Scale to make approximate but rapid early assessments of damages due to earthquakes. This approach can be extended to other natural hazards such as floods, coastal surges, wildfires, or tornadoes.

While assessments based on social media have neither the accuracy nor the precision of ground-based instruments and observations, they have advantages of rapidity, quantity, and spatial coverage. This study provides insights on useful natural language processing instruments and machine learning classifiers for textual analysis in the context of post-event rapid damage assessment. Through the exploration of results, this study further offers insights regarding potential challenges to using social media and offers opportunities for future research. Specifically, using relatively simple metrics of text characteristics, a great deal of information can be gathered about the patterns and timing of a hazard event. Combining these simple metrics with textual analysis allows a characterization, albeit rough for the present, of damage levels. The case of the Ridgecrest, CA earthquake sequence on 4 and 6 July 2019 is used to investigate the potential for rapid damage estimates from social media data.

Section snippets

Social media in disaster management

Social media data from Twitter™, Facebook™, Instagram™, and other web platforms are now used for emergency response and recovery, natural hazards planning, and risk mitigation. Several efforts to use social media data for natural hazards response and civil infrastructure planning have recently appeared (e.g., Kryvasheyeu et al., 2016; Leykin, Lahad, & Aharonson-Daniel, 2018; Niles, Emery, Reagan, Dodds, & Danforth, 2019). These studies have evaluated social media volume and the use of targeted

Hypothesis development

The intent of this study is to quantitatively explore the value of using social media data to provide rapid indications of damage following sudden-onset natural hazard events, specifically earthquakes. In particular, we seek to quantitatively parse the damage levels through the use of Twitter™ postings and text classification models. To assess the proposed approach, we establish a series of hypotheses to: (1) test whether social media users are shown to react significantly to earthquake events,

Data and methods

This study investigates the aforementioned hypotheses by parsing social media (Twitter) data to identify (1) indicators of various levels of damage and (2) the geospatial location of the damage (even when explicit geospatial information is not available). Given the engineering relevance of language related to damage, a special-purpose library is constructed, leveraging existing damage scales. A series of candidate models are developed and tested for identifying, processing, and analyzing tweets
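
The idea of a special-purpose damage library can be illustrated with a minimal sketch: tag each tweet with the highest damage level whose keywords it mentions. The four level numbers and the keyword sets below are hypothetical placeholders chosen for illustration; they are not the study's dictionaries, which are available from the repository listed under Data availability.

```python
import re

# Hypothetical lexicon: damage level -> keywords. These words are
# illustrative assumptions, not the dictionary built in the study.
DAMAGE_LEXICON = {
    1: {"felt", "shaking", "rattled"},
    2: {"fell", "broken", "cracked"},
    3: {"damaged", "injured", "fire"},
    4: {"collapsed", "destroyed", "killed"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def damage_level(text):
    """Return the highest level whose keywords appear in the text, or 0 if none do."""
    tokens = set(tokenize(text))
    matched = [level for level, words in DAMAGE_LEXICON.items() if tokens & words]
    return max(matched, default=0)
```

Under this placeholder lexicon, `damage_level("The old garage collapsed")` returns 4, while a tweet with no listed keyword returns 0. Exact keyword matching ignores negation and context, which is one reason a trained classifier may be preferable to a lexicon alone.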

Results

The subsections that follow explore the analysis of Twitter data associated with the Ridgecrest earthquake sequences from three perspectives, which are linked to the exploration of the two hypotheses identified in Section 3. First, Section 5.1 presents an initial descriptive analysis of tweet volumes and characteristics, which provides interesting insights regarding the nature of responses to earthquakes on Twitter and supports Hypothesis 1(a) and Hypothesis 1(b) related to user actions following

Discussion

Sudden-onset natural hazards, especially earthquakes, occur with little or no warning and may result in significant damage. The rapid appraisal of losses related to these natural hazard events is of significance to residents, government agencies, insurance companies, and other stakeholders. Damage can vary from things being broken and minor injuries to complete collapse of buildings and deaths. The spatial distribution of impact on large regions makes place-by-place, in-person assessment

Conclusions

This research uses the Ridgecrest, CA earthquakes of July 2019 as a test case to illustrate the potential utility of using social media data for rapid damage assessment and to provide insights regarding potential challenges. To verify the feasibility of the proposed methodology, this study leverages the insights from the MMI Scale and develops a simple four-step scale of earthquake damage. It then trains and tests a series of candidate models for parsing tweet text to extract information about the
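
To make the notion of training a model on labeled tweet text concrete, here is a toy sketch using multinomial Naive Bayes with add-one smoothing. Naive Bayes is a stand-in chosen because it fits in a few lines of standard-library Python; it is not claimed to be one of the study's candidate models. The training tweets and the three labels ("felt", "minor", "severe") are made up for illustration and do not reproduce the study's four-step scale.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training tweets with hypothetical labels (assumptions).
train_texts = [
    "felt the shaking for a few seconds",
    "whole room was shaking but nothing fell",
    "dishes fell and a window is broken",
    "shelves fell over and glass is broken everywhere",
    "building collapsed on main street",
    "the roof collapsed and the wall is destroyed",
]
train_labels = ["felt", "felt", "minor", "minor", "severe", "severe"]
model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
```

With this toy data, `model.predict("a window is broken")` returns "minor". A realistic evaluation would need a much larger labeled set and a held-out test split.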

Author statement

  • Lingyao Li: Conceptualization, Methodology, Data Curation, Formal Analysis, Writing – Original Draft, Writing – Review & Editing.

  • Michelle Bensi: Conceptualization, Validation, Writing – Original Draft, Writing – Review & Editing.

  • Qingbin Cui: Conceptualization, Writing – Review & Editing.

  • Gregory B. Baecher: Conceptualization, Writing – Original Draft, Writing – Review & Editing, Supervision.

  • You Huang: Data Curation.

Data availability

The full dictionaries of words used to filter the data, together with the temporal and spatial results, can be found at https://data.mendeley.com/datasets/z9xjcmg6s2/3, an open-source online data repository hosted at Mendeley Data (L. Li, 2020).

Acknowledgement

We acknowledge the help of Kaveh Faraji Najarkolaie with map data generation in Section 5. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References (78)

  • A. Majumdar et al.

    Do tweets create value? A multi-period analysis of Twitter use and content of tweets for manufacturing firms

    International Journal of Production Economics

    (2019)
  • S. Mangalathu et al.

    Deep learning-based classification of earthquake-impacted buildings using textual damage descriptions

    International Journal of Disaster Risk Reduction

    (2019)
  • D. Monfort et al.

    Remote sensing vs. field survey data in a post-earthquake context: Potentialities and limits of damaged building assessment datasets

    Remote Sensing Applications: Society and Environment

    (2019)
  • J. Osorio-Arjona et al.

    Social media and urban mobility: Using twitter to calculate home-work travel matrices

    Cities

    (2019)
  • R.A. Plunz et al.

    Twitter sentiment in New York City parks as measure of well-being

    Landscape and Urban Planning

    (2019)
  • J.R. Ragini et al.

    Big data analytics for disaster response and recovery through sentiment analysis

    International Journal of Information Management

    (2018)
  • A.A. Rajput et al.

    Temporal network analysis of inter-organizational communications on social media during disasters: A study of Hurricane Harvey in Houston

    International Journal of Disaster Risk Reduction

    (2020)
  • J.C. Reboredo et al.

    The impact of Twitter sentiment on renewable energy stocks

    Energy Economics

    (2018)
  • C. Rossi et al.

    Early detection and information extraction for weather-induced floods using social media streams

    International Journal of Disaster Risk Reduction

    (2018)
  • B. Rouhanizadeh et al.

    Exploratory analysis of barriers to effective post-disaster recovery

    International Journal of Disaster Risk Reduction

    (2020)
  • K.C. Roy et al.

    Understanding the efficiency of social media based crisis communication during hurricane Sandy

    International Journal of Information Management

    (2020)
  • S. Shan et al.

    Disaster management 2.0: A real-time disaster damage assessment model based on mobile social media data—A case study of Weibo (Chinese Twitter)

    Safety Science

    (2019)
  • K. Shoyama et al.

    Emergency flood detection using multiple information sources: Integrated analysis of natural hazard monitoring and social media data

    Science of the Total Environment

    (2021)
  • T. Simon et al.

    Socializing in emergencies—A review of the use of social media in emergency situations

    International Journal of Information Management

    (2015)
  • J. Son et al.

    Content features of tweets for effective communication during disasters: A media synchronicity theory perspective

    International Journal of Information Management

    (2019)
  • B. Takahashi et al.

    Communicating on Twitter during a disaster: An analysis of tweets during Typhoon Haiyan in the Philippines

    Computers in Human Behavior

    (2015)
  • W. Wang et al.

    Post-hurricane forest damage assessment using satellite remote sensing

    Agricultural and Forest Meteorology

    (2010)
  • D. Wu et al.

    Disaster early warning and damage assessment analysis using social media data and geo-location information

    Decision Support Systems

    (2018)
  • T. Yabe et al.

    Integrating information from heterogeneous networks on social media to predict post-disaster returning behavior

    Journal of Computational Science

    (2019)
  • D. Yates et al.

    Emergency knowledge management and social media technologies: A case study of the 2010 Haitian earthquake

    International Journal of Information Management

    (2011)
  • F. Yuan et al.

    Feasibility study of using crowdsourcing to identify critical affected areas for rapid damage assessment: Hurricane Matthew case study

    International Journal of Disaster Risk Reduction

    (2018)
  • C. Zhang et al.

    Social media for intelligent public information and warning in disasters: An interdisciplinary review

    International Journal of Information Management

    (2019)
  • J. Beel et al.

    Research-paper recommender systems: A literature survey

    International Journal on Digital Libraries

    (2016)
  • S. Bird et al.

    Natural language processing with Python

    (2009)
  • P. Bojanowski et al.

    Enriching word vectors with subword information

    Transactions of the Association for Computational Linguistics

    (2017)
  • F. Chollet

    Keras

    (2015)
  • M. Coronese et al.

    Evidence for sharp increase in the economic damages of extreme natural disasters

    Proceedings of the National Academy of Sciences

    (2019)
  • C. Cortes et al.

    Support-vector networks

    Machine Learning

    (1995)
  • P.S. Earle et al.

    Twitter earthquake detection: Earthquake monitoring in a social world

    Annals of Geophysics

    (2011)