Active Learning to Speed-Up the Training Process for Dialogue Act Labelling

  • Conference paper
Human Language Technology Challenges for Computer Science and Linguistics (LTC 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8387)

Abstract

Dialogue act labelling is the task of splitting a dialogue into meaningful dialogue units and annotating them; it can be performed semi-automatically by using statistical models trained on previously annotated dialogues. An appropriate selection of training dialogues can make the process faster, and Active Learning is one suitable strategy for this selection. In this work, Active Learning based on two different criteria (Weighted Number of Hypothesis and Entropy) is tested for dialogue act labelling with the N-gram Transducers model. The framework was tested on two heterogeneous corpora, DIHANA and SwitchBoard. The results confirm the effectiveness of this kind of selection strategy.
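As an illustration of the Entropy criterion named in the abstract, the following is a minimal sketch of entropy-based sample selection in general, not the authors' implementation. It assumes each unlabelled dialogue comes with a list of scores for its competing labelling hypotheses; the dialogue names, the scores, and the helper functions `entropy`, `normalise`, and `select_most_uncertain` are all hypothetical. The N-gram Transducers model itself is not reproduced here.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a normalised probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def normalise(scores):
    """Rescale non-negative hypothesis scores so that they sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]


def select_most_uncertain(pool, k):
    """Return the ids of the k dialogues with the highest hypothesis entropy.

    `pool` maps a dialogue id to the scores of its competing labelling
    hypotheses (a hypothetical stand-in for real model output).
    """
    ranked = sorted(
        pool,
        key=lambda dialogue_id: entropy(normalise(pool[dialogue_id])),
        reverse=True,
    )
    return ranked[:k]


# Made-up scores for three unlabelled dialogues (illustrative only).
pool = {
    "dialogue_a": [0.97, 0.02, 0.01],  # model is confident: low entropy
    "dialogue_b": [0.40, 0.35, 0.25],  # model is unsure: high entropy
    "dialogue_c": [0.70, 0.20, 0.10],
}

# The two dialogues the model is least certain about would be sent for
# manual annotation and added to the training set.
selected = select_most_uncertain(pool, 2)
print(selected)
```

In an Active Learning loop of this kind, the selected dialogues are annotated, the model is retrained, and the remaining pool is scored again; how the hypothesis scores are obtained depends on the model and is left open here.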

Acknowledgements

Work supported by the EC under FP7 project CasMaCat (FP7-28757), by Spanish MINECO under projects STraDA (TIN2012-37475-C02-01) and Active2Trans (TIN2012-31723), by Spanish MED/MICINN under FPI scholarship BES-2009-028965, and by GVA under project AMIIS (ISIC/2012/004).

Corresponding author

Correspondence to Fabrizio Ghigi.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ghigi, F., Martínez-Hinarejos, C.D., Benedí, J.M. (2014). Active Learning to Speed-Up the Training Process for Dialogue Act Labelling. In: Vetulani, Z., Mariani, J. (eds) Human Language Technology Challenges for Computer Science and Linguistics. LTC 2011. Lecture Notes in Computer Science, vol 8387. Springer, Cham. https://doi.org/10.1007/978-3-319-08958-4_21

  • DOI: https://doi.org/10.1007/978-3-319-08958-4_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08957-7

  • Online ISBN: 978-3-319-08958-4

  • eBook Packages: Computer Science, Computer Science (R0)
