Direct and Wordgraph-Based Confidence Measures in Dialogue Annotation with N-Gram Transducers

Martínez-Hinarejos, Carlos-D.; Tamarit, Vicent; Benedí, José-Miguel

doi:10.1007/978-3-319-08958-4_22

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8387)

Included in the following conference series:

Language and Technology Conference

Abstract

Dialogue annotation is a necessary step in the development of dialogue systems, especially for data-based dialogue strategies. Manual annotation is hard and time-consuming, and automatic techniques can be used to obtain a draft annotation and speed up the process. Presenting the draft annotation with confidence levels on the correctness of every part of the hypothesis can make the supervision process even faster. In this paper we propose two methods to calculate confidence measures for an automatic dialogue annotation model, and test them on the annotation of a task-oriented human-computer corpus on railway information. The results show that our proposals behave similarly and that they are a good starting point for incorporating confidence measures into the dialogue annotation process.

Notes

1. Grammatical Inference and Alignment for Transducer Infer.
2. Available at http://www.dsic.upv.es/~cmartine/research/resources.html

Acknowledgments

This work was supported by the EC under FP7 project CasMaCat (FP7-28757), by the Spanish MINECO under projects STraDA (TIN2012-37475-C02-01) and Active2Trans (TIN2012-31723), and by GVA under project AMIIS (ISIC/2012/004).

Author information

Authors and Affiliations

PRHLT Research Center, Universitat Politècnica de València, Camino de Vera s/n, 46022, Valencia, Spain
Carlos-D. Martínez-Hinarejos, Vicent Tamarit & José-Miguel Benedí

Corresponding author

Correspondence to Carlos-D. Martínez-Hinarejos.

Editor information

Editors and Affiliations

Adam Mickiewicz University, Poznań, Poland
Zygmunt Vetulani
IMMI-CNRS, Orsay, France
Joseph Mariani

About this paper

Cite this paper

Martínez-Hinarejos, CD., Tamarit, V., Benedí, JM. (2014). Direct and Wordgraph-Based Confidence Measures in Dialogue Annotation with N-Gram Transducers. In: Vetulani, Z., Mariani, J. (eds) Human Language Technology Challenges for Computer Science and Linguistics. LTC 2011. Lecture Notes in Computer Science, vol 8387. Springer, Cham. https://doi.org/10.1007/978-3-319-08958-4_22

DOI: https://doi.org/10.1007/978-3-319-08958-4_22
Published: 26 July 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08957-7
Online ISBN: 978-3-319-08958-4
eBook Packages: Computer Science, Computer Science (R0)
