Abstract
This paper addresses the extraction of contexts for keywords found in text, in particular in Automatic Speech Recognition (ASR) output. We propose using a syntactic parser to find contexts by analysing sentence structure, rather than taking a fixed window of several words to the left and right of the keyword or using the whole sentence. This method yields concise but meaningful contexts that are easy for humans to read and can also be used in applications such as thematic clustering. We describe SemSin, a system for Russian that combines a syntactic dependency parser with elements of a semantic ontology. We demonstrate the use of SemSin for this task on both ordinary text and recognition output, and outline directions for future development of the method.
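The contrast between a fixed word window and a parse-based context can be illustrated with a minimal sketch. This is not the SemSin implementation: the `Token` class, both functions, the English example sentence and its hand-assigned dependency heads are all assumptions made for illustration, and selecting the keyword's head plus its direct dependents is only one simple selection rule, not necessarily the one used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Token:
    index: int  # position in the sentence
    form: str   # surface word
    head: int   # index of the syntactic head, -1 for the root


def window_context(tokens, keyword_index, size=2):
    """Baseline: the keyword plus `size` words on each side."""
    start = max(0, keyword_index - size)
    end = min(len(tokens), keyword_index + size + 1)
    return [t.form for t in tokens[start:end]]


def syntactic_context(tokens, keyword_index):
    """The keyword, its head and its direct dependents, in sentence order."""
    selected = {keyword_index}
    head = tokens[keyword_index].head
    if head >= 0:
        selected.add(head)
    # Direct dependents are the tokens whose head is the keyword.
    selected.update(t.index for t in tokens if t.head == keyword_index)
    return [tokens[i].form for i in sorted(selected)]


# Hypothetical, hand-annotated parse (illustrative only):
# "Yesterday the operator finally changed my tariff without any warning"
WORDS = ["Yesterday", "the", "operator", "finally", "changed",
         "my", "tariff", "without", "any", "warning"]
HEADS = [4, 2, 4, 4, -1, 6, 4, 4, 9, 7]
TOKENS = [Token(i, w, h) for i, (w, h) in enumerate(zip(WORDS, HEADS))]

KEYWORD = WORDS.index("tariff")
print(" ".join(window_context(TOKENS, KEYWORD)))
print(" ".join(syntactic_context(TOKENS, KEYWORD)))
```

On this toy parse the window cuts into the following prepositional phrase, while the parse-based selection returns only words attached to the keyword. With a real parser, the heads would come from its output rather than from a hand-written list.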
Acknowledgements
The work was financially supported by the Ministry of Education and Science of the Russian Federation, Contract 14.579.21.0008, ID RFMEFI57914X0008.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Khomitsevich, O., Boyarsky, K., Kanevsky, E., Bulusheva, A., Mendelev, V. (2017). Flexible Context Extraction for Keywords in Russian Automatic Speech Recognition Results. In: Ignatov, D., et al. Analysis of Images, Social Networks and Texts. AIST 2016. Communications in Computer and Information Science, vol 661. Springer, Cham. https://doi.org/10.1007/978-3-319-52920-2_14
DOI: https://doi.org/10.1007/978-3-319-52920-2_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52919-6
Online ISBN: 978-3-319-52920-2
eBook Packages: Computer Science (R0)