Creating an interoperable language resource for interoperable linguistic studies

  • Original Paper
  • Language Resources and Evaluation

Abstract

There are two levels of interoperability for language resources: operational interoperability and conceptual interoperability. The former refers to the standardization of the formal aspects of language resources so that different resources can work together. The latter refers to the standardization of the notional representation of the semantic content of the analysis. This article addresses both issues but focuses on the latter through a description of the annotation and analysis of the International Corpus of English (ICE), a corpus for the study of English as a global language. The project is parameterised by component, regional sub-corpora and a set of pre-defined textual categories. The one-million-word British component (ICE–GB) has been constructed, grammatically tagged, and syntactically parsed. The article first describes the steps taken to ensure conformity within the project: corpus design, part-of-speech tagging, and syntactic parsing. It then presents a study of the use of adverbial clauses across speech and writing, illustrating the pressing need for interoperable analysis of linguistic data.

Fig. 1
Fig. 2
Fig. 3

Notes

  1. The labels on the X axis of Fig. 3 denote the following groups of samples in ICE–GB, for each of which the proportion of adverbial clauses is shown:

    • Spon: spontaneous conversations

    • Speech: complete spoken samples

    • Scripted: scripted broadcast news and talks

    • Timed: timed university essays

    • Writing: complete written samples

    • Untimed: untimed university essays.

Acknowledgments

This work was supported in part by research grants from City University of Hong Kong (Project Nos 7002387, 7008002, 9610126 and 9610053).

Author information

Corresponding author

Correspondence to Alex Chengyu Fang.

About this article

Cite this article

Fang, A.C. Creating an interoperable language resource for interoperable linguistic studies. Lang Resources & Evaluation 46, 327–340 (2012). https://doi.org/10.1007/s10579-012-9189-9

  • DOI: https://doi.org/10.1007/s10579-012-9189-9
