Abstract
This paper describes the evaluation of, and recent advances in, the application of a speech dictation system for the judicial domain. The dictation system incorporates Slovak speech recognition and uses a plugin for a widely used office suite. It was introduced recently, after a preliminary user evaluation in the Slovak courts. Compared to the previous version, the system was improved significantly using new acoustic databases for evaluation and acoustic modeling. The speaker adaptation procedure and gender-dependent models significantly improve the overall accuracy, bringing the word error rate (WER) below 5 % on a domain-specific test set. The language resources were extended and the language modeling techniques were improved, as described in the paper. An end-user questionnaire about the user interface was evaluated and new functionalities were introduced. According to the available feedback, the dictation system can significantly speed up court proceedings for every user willing to work with new technologies.
Acknowledgements
The research of the Technical University of Košice team presented in this paper was partially supported by the Ministry of Education, Science, Research and Sport of the Slovak Republic under project VEGA 1/0075/15 (50 %) and partially by the Research and Development Operational Programme funded by the ERDF, within the implementation of the project University Science Park TECHNICOM for Innovation Applications Supported by Knowledge Technology, ITMS project # 26220220182 (50 %). The research of the Slovak Academy of Sciences team presented in this paper was supported by the Ministry of Education, Science, Research and Sport of the Slovak Republic under research project VEGA 2/0197/15 (100 %).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Rusko, M. et al. (2016). Advances in the Slovak Judicial Domain Dictation System. In: Vetulani, Z., Uszkoreit, H., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2013. Lecture Notes in Computer Science, vol 9561. Springer, Cham. https://doi.org/10.1007/978-3-319-43808-5_5
DOI: https://doi.org/10.1007/978-3-319-43808-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43807-8
Online ISBN: 978-3-319-43808-5
eBook Packages: Computer Science, Computer Science (R0)