Abstract
There is a great deal of variation in the encoding of spoken texts in electronic form, both with respect to the types of features represented and the way particular features are rendered. This paper surveys problems in the electronic representation of speech and presents the solutions proposed by the Text Encoding Initiative. The special tags needed for the encoding of spoken texts are discussed, including a mechanism for temporal alignment. Further work is needed on phonological aspects, on parallel representation, and on the development of software that connects the systematic underlying representation with a workable format for input and display.
Author information
Stig Johansson is Professor of English Language at the Department of British and American Studies, University of Oslo. He is co-ordinating secretary of the International Computer Archive of Modern English (ICAME) and editor of the ICAME Journal. Recent publications include Frequency Analysis of English Vocabulary and Grammar (with Knut Hofland, Clarendon Press, 1989) and English Computer Corpora (with Anna-Brita Stenström, Mouton de Gruyter, 1991).
Cite this article
Johansson, S. The encoding of spoken texts. Comput Hum 29, 149–158 (1995). https://doi.org/10.1007/BF01830708
DOI: https://doi.org/10.1007/BF01830708