Abstract
The NITE XML Toolkit (NXT) is open source software for working with language corpora, with particular strengths for multimodal and heavily cross-annotated data sets. In NXT, annotations are described by types and attribute-value pairs, and can relate to signal via start and end times, to representations of the external environment, and to each other via either an arbitrary graph structure or a multi-rooted tree structure characterized by both temporal and structural orderings. Simple queries in NXT express variable bindings for n-tuples of objects, optionally constrained by type, and give a set of conditions on the n-tuples combined with boolean operators. The defined operators for the condition tests allow full access to the timing and structural properties of the data model. A complex query facility passes variable bindings from one query to another for filtering, returning a tree structure. In addition to describing NXT’s core data handling and search capabilities, we explain the stand-off XML data storage format that it employs and illustrate its use with examples from an early adopter of the technology.
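To make the query model described above more concrete, the following is a minimal Python sketch of the general idea only. It is not NXT's API and does not use NXT's query syntax: the `Annotation` class, the `simple_query` and `complex_query` functions, the `overlaps` test, and the sample data are all hypothetical names invented for illustration. A simple query binds one variable per declared type, enumerates the candidate n-tuples, and keeps those that satisfy a boolean condition; a complex query feeds each outer binding into an inner query and returns the nested result.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-in for a typed annotation with signal timing."""
    type: str
    start: float
    end: float
    attrs: tuple = ()  # attribute-value pairs, e.g. (("pos", "NN"),)

    def attr(self, name):
        return dict(self.attrs).get(name)


def overlaps(a, b):
    """One illustrative timing test: the two intervals share some time."""
    return a.start < b.end and b.start < a.end


def simple_query(corpus, var_types, condition):
    """Bind one variable per entry of var_types (None means any type)
    and return the n-tuples for which condition(*tuple) is true."""
    candidates = [
        [a for a in corpus if t is None or a.type == t]
        for t in var_types
    ]
    return [combo for combo in product(*candidates) if condition(*combo)]


def complex_query(corpus, outer_types, outer_cond, inner_types, inner_cond):
    """Pass each outer binding into an inner query; in this sketch, outer
    matches with no inner matches are dropped. Returns a two-level tree
    as a list of (outer_tuple, [inner_tuples]) pairs."""
    tree = []
    for outer in simple_query(corpus, outer_types, outer_cond):
        children = simple_query(
            corpus, inner_types,
            lambda *inner: inner_cond(*outer, *inner),
        )
        if children:
            tree.append((outer, children))
    return tree


# Invented sample data: two words and two gestures on a shared timeline.
corpus = [
    Annotation("word", 0.0, 0.4, (("pos", "DT"),)),
    Annotation("word", 0.4, 0.9, (("pos", "NN"),)),
    Annotation("gesture", 0.3, 0.8, (("kind", "point"),)),
    Annotation("gesture", 1.5, 2.0, (("kind", "nod"),)),
]

# Pairs (w, g): w is a noun and overlaps gesture g in time.
matches = simple_query(
    corpus,
    ["word", "gesture"],
    lambda w, g: w.attr("pos") == "NN" and overlaps(w, g),
)

# For each pointing gesture, the words that overlap it.
tree = complex_query(
    corpus,
    ["gesture"], lambda g: g.attr("kind") == "point",
    ["word"], lambda g, w: overlaps(g, w),
)
```

Enumerating the full Cartesian product keeps the sketch short but scales poorly with the number of variables; it is meant only to show how bindings, type constraints, and conditions fit together.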
Acknowledgements
NXT has been developed under European Commission funding (NITE, IST-2000-26095 and AMI, FP6-506811) and was based heavily upon the successes and failures of an earlier EC-funded prototype (MATE, LE4-8370). We are grateful for this funding and also for permission to use sample data from the EPSRC (UK) funded MONITOR project and from the Ph.D. work of Hannele Nicholson for our worked example in “Example corpus treatment”. We are also grateful to those who support Sourceforge and the Open Source Community.
Cite this article
Carletta, J., Evert, S., Heid, U. et al. The NITE XML Toolkit: Data Model and Query Language. Lang Resources & Evaluation 39, 313–334 (2005). https://doi.org/10.1007/s10579-006-9001-9
DOI: https://doi.org/10.1007/s10579-006-9001-9