Abstract
This paper considers a number of issues in current annotation science and corpus analysis and presents the Corpus Analysis Toolkit, a bespoke software suite for processing and analysing multilevel annotations of time-aligned linguistic data. The toolkit provides a range of specialised tools for temporal analysis of such data. It is independent of both feature set and corpus, and supports a number of commonly used annotation formats.
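To make the idea of temporal analysis over multilevel annotations concrete, the following is a minimal sketch, not the toolkit's own code or API. It assumes each annotation tier is a list of labelled time intervals in seconds; the names `Interval` and `overlaps` and the sample data are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One labelled, time-aligned annotation (hypothetical structure)."""
    start: float  # seconds
    end: float    # seconds
    label: str


def overlaps(tier_a, tier_b):
    """Return (label_a, label_b, shared_seconds) for each overlapping pair."""
    result = []
    for a in tier_a:
        for b in tier_b:
            # Shared time is the gap between the later start and the earlier end.
            shared = min(a.end, b.end) - max(a.start, b.start)
            if shared > 0:
                result.append((a.label, b.label, round(shared, 3)))
    return result


# Two illustrative tiers over the same signal (invented data).
phones = [Interval(0.00, 0.08, "p"), Interval(0.08, 0.20, "a")]
voicing = [Interval(0.00, 0.06, "voiceless"), Interval(0.06, 0.20, "voiced")]

pairs = overlaps(phones, voicing)
for pair in pairs:
    print(pair)
```

In this invented example the voicing boundary falls inside the "p" interval, so "p" overlaps both voicing labels; asynchrony of this kind between tiers is what a temporal analysis of multilevel annotations would surface.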
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wilson, S., Carson-Berndsen, J. (2011). The Corpus Analysis Toolkit - Analysing Multilevel Annotations. In: Vetulani, Z. (ed.) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2009. Lecture Notes in Computer Science, vol 6562. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20095-3_9
DOI: https://doi.org/10.1007/978-3-642-20095-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20094-6
Online ISBN: 978-3-642-20095-3
eBook Packages: Computer Science, Computer Science (R0)