Abstract
One recurring theme in the TEI project has been the need to represent non-hierarchical information in a natural way — or at least in a way that is acceptable to those who must use it — using a technical tool that assumes a single hierarchical representation. This paper proposes solutions to a variety of such problems: the encoding of segments which do not reflect a document's primary hierarchy; relationships among non-adjacent segments of texts; ambiguous content; overlapping structures; parallel structures; cross-references; vague locations.
Author information
Additional information
David T. Barnard is Professor of Computing and Information Science at Queen's University. His research interests are in structured text processing and the compilation of programming languages. His recent publications include “Tree-to-tree Correction for Document Trees”, Queen's Technical Report, and “Error Handling in a Parallel LR Substring Parser”, Computer Languages, 19, 4 (1993) 247–59.
Lou Burnard is Director of the Oxford Text Archive at Oxford University Computing Services, with interests in electronic text and database technology. He is European Editor of the Text Encoding Initiative's Guidelines.
Jean-Pierre Gaspart is with Associated Consultants and Software Engineers.
Lynne A. Price (Ph.D., computer sciences, University of Wisconsin-Madison) is a senior software engineer at Frame Technology Corp. Her main area of research has been representing text structure for automatic processing. She has served on both the US and international SGML standards committees for several years and is the editor of International Standard ISO/IEC 13673 on Conformance Testing for Standard Generalized Markup Language (SGML) Systems.
C. M. Sperberg-McQueen is a Senior Research Programmer at the academic computer center of the University of Illinois at Chicago; his interests include medieval Germanic languages and literatures and the theory of electronic text markup. Since 1988 he has been editor in chief of the ACH/ACL/ALLC Text Encoding Initiative.
Giovanni Battista Varile works for the Commission of the European Communities.
This paper is derived from a working paper of the Metalanguage Committee entitled “Notes on SGML Solutions to Markup Problems” which was produced following a meeting of the committee in Luxembourg. The co-authors all participated in that meeting and provided input to this paper. Others serving on the committee at other times included David Durand (Boston University), Nancy Ide (Vassar College) and Frank Tompa (University of Waterloo).
Cite this article
Barnard, D.T., Burnard, L., Gaspart, JP. et al. Hierarchical encoding of text: Technical problems and SGML solutions. Comput Hum 29, 211–231 (1995). https://doi.org/10.1007/BF01830617
DOI: https://doi.org/10.1007/BF01830617