Abstract
We present a text processing framework for detecting and analyzing emergency-related events in a specified region, taking the Arctic zone as a case study. The task is complicated by sparse data and by the scarcity of tools and language resources for processing such specialized texts. The system performs focused crawling of texts related to emergencies in the Arctic region; information extraction, including named entity recognition, geotagging, vessel name recognition, and detection of emergency-related messages; and indexing of texts with their metadata for faceted search. The framework is designed to process text messages and documents in both English and Russian. We report the results of an experimental evaluation of the framework components on Twitter data.
Acknowledgments
The project is supported by the Russian Foundation for Basic Research, project number: 15-29-06045 “ofi_m”.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Devyatkin, D., Shelmanov, A. (2017). Text Processing Framework for Emergency Event Detection in the Arctic Zone. In: Kalinichenko, L., Kuznetsov, S., Manolopoulos, Y. (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2016. Communications in Computer and Information Science, vol 706. Springer, Cham. https://doi.org/10.1007/978-3-319-57135-5_6
DOI: https://doi.org/10.1007/978-3-319-57135-5_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57134-8
Online ISBN: 978-3-319-57135-5
eBook Packages: Computer Science, Computer Science (R0)