The 2007 AMI(DA) System for Meeting Transcription

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4625)

Abstract

Meeting transcription is one of the main tasks for large-vocabulary automatic speech recognition (ASR) and is supported by several large international projects in the area. The conversational nature of the speech, the difficult acoustics, and the need for high-quality transcripts for higher-level processing make ASR of meeting recordings an interesting challenge. This paper describes the development and system architecture of the 2007 AMIDA meeting transcription system, the third such system developed in a collaboration of six research sites. Different variants of the system participated in all speech-to-text transcription tasks of the 2007 NIST RT evaluations and showed very competitive performance. The best result, a final word error rate of 24.9%, was obtained on close-talking microphone data.
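For readers unfamiliar with the metric quoted in the abstract, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the recogniser output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that definition only; the function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration, and the sketch does not reproduce the text normalisation or scoring rules used in the NIST RT evaluations.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokens are split on whitespace, with no text
    normalisation of the kind an official scoring tool would apply.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER = {wer:.1%}")
```

Because insertions count as errors while the denominator is the reference length, the value can exceed 100% when the hypothesis is much longer than the reference.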



Editor information

Rainer Stiefelhagen Rachel Bowers Jonathan Fiscus

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hain, T. et al. (2008). The 2007 AMI(DA) System for Meeting Transcription. In: Stiefelhagen, R., Bowers, R., Fiscus, J. (eds) Multimodal Technologies for Perception of Humans. RT CLEAR 2007 2007. Lecture Notes in Computer Science, vol 4625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68585-2_39

  • DOI: https://doi.org/10.1007/978-3-540-68585-2_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68584-5

  • Online ISBN: 978-3-540-68585-2

  • eBook Packages: Computer Science, Computer Science (R0)
