Abstract
The article presents the results of an experiment in automatic concept annotation of transliterated spontaneous human-human dialogues in the city transportation domain. The data source was a corpus of dialogues collected at a Warsaw call center and annotated with about 200 concept types. The machine learning technique used is linear-chain Conditional Random Fields (CRF) sequence labeling. A model based on word lemmas in a window of length 5 achieved concept recognition with an F-measure of 0.85.
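To make the "word lemmas in a window of length 5" feature set concrete, the following is a minimal sketch of how such features might be built for each token before being passed to a linear-chain CRF trainer. It is not the authors' implementation: the abstract does not say how the window is positioned, so the sketch assumes it is centered on the current token (two lemmas to the left, two to the right). The function name `window_features`, the feature keys, the `<PAD>` symbol, and the example lemmas are all hypothetical.

```python
from typing import Dict, List


def window_features(lemmas: List[str], size: int = 5) -> List[Dict[str, str]]:
    """Build one feature dict per token from the lemmas in a centered window.

    Assumption: the window is centered on the current token, so size must be odd.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd number")
    half = size // 2
    features = []
    for i in range(len(lemmas)):
        feats = {}
        for offset in range(-half, half + 1):
            j = i + offset
            # Positions outside the utterance get a padding symbol (an assumption).
            inside = 0 <= j < len(lemmas)
            feats[f"lemma[{offset:+d}]"] = lemmas[j] if inside else "<PAD>"
        features.append(feats)
    return features


# Invented example utterance, already lemmatized.
example = window_features(["autobus", "do", "centrum"])
```

Each token ends up with five lemma features keyed by relative position, so a sequence labeler can condition the concept label of a word on its immediate context.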
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mykowiecka, A., Waszczuk, J. (2009). Semantic Annotation of City Transportation Information Dialogues Using CRF Method. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2009. Lecture Notes in Computer Science, vol 5729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04208-9_56
DOI: https://doi.org/10.1007/978-3-642-04208-9_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04207-2
Online ISBN: 978-3-642-04208-9
eBook Packages: Computer Science, Computer Science (R0)