Creating an interoperable language resource for interoperable linguistic studies

Fang, Alex Chengyu

doi:10.1007/s10579-012-9189-9

Original Paper
Published: 24 May 2012

Volume 46, pages 327–340, (2012)

Language Resources and Evaluation

Alex Chengyu Fang¹

Abstract

There are two different levels of interoperability for language resources: operational interoperability and conceptual interoperability. The former refers to the standardization of the formal aspects of language resources so that different resources can work together. The latter refers to the standardization of the notional representation of the semantic content of the analysis. This article addresses both issues but focuses on the latter through a description of the annotation and analysis of the International Corpus of English, which is a corpus for the study of English as a global language. The project is parameterised by component, regional sub-corpora and a set of pre-defined textual categories. The one-million-word British component has been constructed, grammatically tagged, and syntactically parsed. This article is first of all a description of steps taken to ensure conformity within the project. These include corpus design, part-of-speech tagging, and syntactic parsing. The article will then present a study that examines the use of adverbial clauses across speech and writing, illustrating the imminent necessity for interoperable analysis of linguistic data.

Notes

The X-axis labels in Fig. 3 refer to the following groups of samples in ICE-GB, for which the proportion of adverbial clauses is reported:
- Spon: spontaneous conversations
- Speech: complete spoken samples
- Scripted: scripted broadcast news and talks
- Timed: timed university essays
- Writing: complete written samples
- Untimed: untimed university essays.
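
As a rough illustration of how a per-group proportion of this kind could be computed, the following is a minimal sketch and not the author's procedure. It assumes that the proportion is the number of adverbial clauses divided by the total number of clauses in a group; the note does not state the denominator, so that choice, the `GroupCounts` type, the function names, and the example counts are all hypothetical placeholders rather than ICE-GB figures.

```python
from dataclasses import dataclass
from typing import Dict

# Group labels taken from the Fig. 3 legend described in the note above.
GROUPS = ["Spon", "Speech", "Scripted", "Timed", "Writing", "Untimed"]


@dataclass
class GroupCounts:
    """Raw counts for one group of samples (hypothetical structure)."""
    adverbial_clauses: int
    total_clauses: int  # assumed denominator; the source does not specify it


def adverbial_proportion(counts: GroupCounts) -> float:
    """Return adverbial clauses as a fraction of all clauses in the group."""
    if counts.total_clauses == 0:
        return 0.0  # avoid division by zero for an empty group
    return counts.adverbial_clauses / counts.total_clauses


def proportions_by_group(data: Dict[str, GroupCounts]) -> Dict[str, float]:
    """Compute the proportion for each known group, in legend order."""
    return {g: adverbial_proportion(data[g]) for g in GROUPS if g in data}


# Placeholder counts for illustration only; these are NOT ICE-GB results.
example = {
    "Untimed": GroupCounts(adverbial_clauses=30, total_clauses=200),
    "Spon": GroupCounts(adverbial_clauses=10, total_clauses=200),
}
result = proportions_by_group(example)
```

Because "Speech" and "Writing" cover the complete spoken and written samples, they overlap with the other groups; the sketch therefore computes each proportion independently and does not sum across groups.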

Acknowledgments

This work was supported in part by research grants from City University of Hong Kong (Project Nos 7002387, 7008002, 9610126 and 9610053).

Author information

Authors and Affiliations

Department of Chinese, Translation and Linguistics, City University of Hong Kong, Hong Kong, China
Alex Chengyu Fang

Corresponding author

Correspondence to Alex Chengyu Fang.

About this article

Cite this article

Fang, A.C. Creating an interoperable language resource for interoperable linguistic studies. Lang Resources & Evaluation 46, 327–340 (2012). https://doi.org/10.1007/s10579-012-9189-9

Published: 24 May 2012
Issue Date: June 2012
DOI: https://doi.org/10.1007/s10579-012-9189-9
