Location Mention Recognition from Japanese Disaster-Related Tweets

  • Conference paper
Information Technology in Disaster Risk Reduction (ITDRR 2022)

Abstract

To minimize the damage caused by large-scale disasters, information must be collected and disseminated quickly and accurately. In recent years, national agencies and local municipalities have used Twitter and other real-time social media to help focus their disaster relief efforts. Because the volume of information circulating on social media rises sharply during a disaster, responders must be able to separate valuable posts from the rest quickly. On Twitter in particular, early responders need to identify the locations associated with relevant tweets in order to support decision making and direct their response. To this end, machine learning has been applied to tweets posted during disasters to classify them by genre, extract useful information, and identify locations and points of interest. However, preparing training data and building a model in the early stages of a disaster is extremely difficult, so reusing a model built on tweets from past disasters is a promising alternative. In this study, we focus on three heavy rain disasters that occurred in Japan and examine how well location mentions in tweets can be extracted using models trained on tweets posted during prior disasters.
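The evaluation setting described above (train on earlier disasters, test on a later one) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the tagging scheme, the model, or the metric, so the BIO labels `B-LOC`/`I-LOC`/`O`, the exact-match span F1, and the helper names `bio_to_spans`, `span_f1`, and `prior_event_splits` are all assumptions made for illustration.

```python
from typing import Iterator, List, Optional, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect location-mention spans from a BIO tag sequence.

    Assumes the hypothetical label set B-LOC / I-LOC / O.
    An I-LOC with no preceding B-LOC is ignored.
    """
    spans: Set[Span] = set()
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        if tag == "B-LOC":
            if start is not None:  # close the previous mention
                spans.add((start, i))
            start = i
        elif tag == "I-LOC" and start is not None:
            continue  # still inside the current mention
        else:
            if start is not None:
                spans.add((start, i))
            start = None
    if start is not None:  # mention runs to the end of the tweet
        spans.add((start, len(tags)))
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match span F1 over parallel lists of tag sequences."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


def prior_event_splits(events: List[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (training events, test event) pairs.

    `events` must be in chronological order; each event after the
    first is tested with a model trained only on the events before it.
    """
    for i in range(1, len(events)):
        yield events[:i], events[i]


if __name__ == "__main__":
    # Placeholder event names, not the disasters studied in the paper.
    for train, test in prior_event_splits(["event_A", "event_B", "event_C"]):
        print("train on", train, "-> test on", test)
```

Splitting by time rather than at random reflects the constraint named in the abstract: when a new disaster begins, only data from earlier disasters is available for training.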

Acknowledgments

This research was supported by JSPS KAKENHI Grant Number 18K11553.

Author information

Corresponding authors

Correspondence to Toshihiro Rokuse or Osamu Uchida.

Copyright information

© 2023 IFIP International Federation for Information Processing

About this paper

Cite this paper

Rokuse, T., Uchida, O. (2023). Location Mention Recognition from Japanese Disaster-Related Tweets. In: Gjøsæter, T., Radianti, J., Murayama, Y. (eds) Information Technology in Disaster Risk Reduction. ITDRR 2022. IFIP Advances in Information and Communication Technology, vol 672. Springer, Cham. https://doi.org/10.1007/978-3-031-34207-3_19

  • DOI: https://doi.org/10.1007/978-3-031-34207-3_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34206-6

  • Online ISBN: 978-3-031-34207-3

  • eBook Packages: Computer Science, Computer Science (R0)
