Text Processing Framework for Emergency Event Detection in the Arctic Zone

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 706)

Abstract

We present a text processing framework for detecting and analyzing events related to emergencies in a specified region, using the Arctic zone as a case study. The task is distinctive because the data are sparse and tools and language resources for processing such specific texts are scarce. The system performs focused crawling of texts related to emergencies in the Arctic region; information extraction, including named entity recognition, geotagging, vessel name recognition, and detection of emergency-related messages; and indexing of texts with their metadata for faceted search. The framework is designed to process both English and Russian text messages and documents. We report the results of an experimental evaluation of the framework components on Twitter data.

Notes

  1. http://aot.ru/

  2. http://maltparser.org/

  3. http://www.geonames.org/

  4. http://www.marinevesseltraffic.com/

  5. http://crisislex.org/

Acknowledgments

The project is supported by the Russian Foundation for Basic Research, project number: 15-29-06045 “ofi_m”.

Author information

Corresponding author

Correspondence to Artem Shelmanov.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Devyatkin, D., Shelmanov, A. (2017). Text Processing Framework for Emergency Event Detection in the Arctic Zone. In: Kalinichenko, L., Kuznetsov, S., Manolopoulos, Y. (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2016. Communications in Computer and Information Science, vol 706. Springer, Cham. https://doi.org/10.1007/978-3-319-57135-5_6

  • DOI: https://doi.org/10.1007/978-3-319-57135-5_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57134-8

  • Online ISBN: 978-3-319-57135-5

  • eBook Packages: Computer Science, Computer Science (R0)
