Abstract
Developing computerized information retrieval dialogue systems that communicate with the user in natural language requires an effective training procedure with which the main modules of the dialogue system can be developed partly automatically. This paper describes an attempt to create sentence templates automatically, using a program package that implements a specially developed method for the quantitative linguistic analysis of transcribed real dialogues. First, the program package generates a set of formulas (templates) that consist of elements of a special grammar and describe the syntactic structure of the required sentences. Second, it generates a large corpus of unique training sentences from the sentence templates and a stochastic context-free grammar. The experimentally created corpus was used to train the modules of a city information dialogue system.
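The second step, expanding sentence templates through a stochastic context-free grammar into a corpus of unique sentences, can be illustrated with a minimal sketch. It is not the authors' program package: the grammar `GRAMMAR`, the templates `TEMPLATES`, the rule probabilities, and the function names `expand` and `generate_corpus` are all hypothetical and chosen only to show the general idea on a toy city-information domain.

```python
import random

# Hypothetical toy grammar (an assumption, not taken from the paper).
# Each nonterminal maps to a list of (probability, expansion) pairs.
GRAMMAR = {
    "GREETING": [(0.6, ["hello"]), (0.4, ["good", "morning"])],
    "REQUEST": [(0.5, ["where", "is"]), (0.5, ["please", "show", "me"])],
    "PLACE": [(0.5, ["the", "PLACE_NAME"]),
              (0.5, ["the", "nearest", "PLACE_NAME"])],
    "PLACE_NAME": [(0.4, ["station"]), (0.3, ["theatre"]),
                   (0.3, ["town", "hall"])],
}

# Hypothetical sentence templates: sequences of grammar elements.
TEMPLATES = [
    ["GREETING", "REQUEST", "PLACE"],
    ["REQUEST", "PLACE"],
]


def expand(symbol, rng):
    """Recursively expand a symbol into a list of terminal words."""
    if symbol not in GRAMMAR:
        # Anything without a rule is treated as a terminal word.
        return [symbol]
    weights = [prob for prob, _ in GRAMMAR[symbol]]
    expansions = [exp for _, exp in GRAMMAR[symbol]]
    # Pick one expansion according to the rule probabilities.
    chosen = rng.choices(expansions, weights=weights, k=1)[0]
    words = []
    for child in chosen:
        words.extend(expand(child, rng))
    return words


def generate_corpus(templates, n_sentences, seed=0, max_attempts=10000):
    """Sample sentences from the templates until n_sentences unique ones
    are collected or max_attempts samples have been drawn."""
    rng = random.Random(seed)
    corpus = set()  # a set keeps only unique sentences
    attempts = 0
    while len(corpus) < n_sentences and attempts < max_attempts:
        attempts += 1
        template = rng.choice(templates)
        words = []
        for symbol in template:
            words.extend(expand(symbol, rng))
        corpus.add(" ".join(words))
    return sorted(corpus)


if __name__ == "__main__":
    for sentence in generate_corpus(TEMPLATES, 10, seed=1):
        print(sentence)
```

Collecting sentences in a set is one simple way to guarantee uniqueness; the `max_attempts` bound is needed because a small grammar can only produce a limited number of distinct sentences.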
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schwarz, J., Matoušek, V. (2001). Creation of a Corpus of Training Sentences Based on Automated Dialogue Analysis. In: Matoušek, V., Mautner, P., Mouček, R., Taušer, K. (eds) Text, Speech and Dialogue. TSD 2001. Lecture Notes in Computer Science, vol 2166. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44805-5_56
DOI: https://doi.org/10.1007/3-540-44805-5_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42557-1
Online ISBN: 978-3-540-44805-1
eBook Packages: Springer Book Archive