Marking up in TATOE and exporting to SGML

Computers and the Humanities

Abstract

This paper presents a method for developing limited-context grammar rules that mark up text automatically by assigning specific text segments to a small number of well-defined, application-determined semantic categories. The Text Analysis Tool with Object Encoding (TATOE) was used to support the iterative development of the rule set and to construct and manage the lexical resources. The work reported here is part of a real-world application scenario: the automatic semantic markup of German news messages, as provided by a German press agency, according to the SGML-based standard News Industry Text Format (NITF), to facilitate their further exchange. The paper also describes the mechanism implemented for exporting the semantic markup to NITF.
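
To make the idea of a limited-context rule concrete, the following is a minimal sketch in Python, not the rule formalism used in TATOE. It assumes a small, hypothetical lexicon of trigger words (TRIGGERS) and a single rule shape: a trigger word followed by one or two capitalised tokens. The function name mark_up and the choice of the inline element names person and org are illustrative assumptions about how such segments might be tagged in NITF-style output.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical lexicon: trigger word -> element name (assumed, not from the paper).
TRIGGERS = {
    "Bundeskanzler": "person",
    "Firma": "org",
}

# One capitalised token, e.g. "Helmut", "Siemens", "AG".
CAP = r"[A-ZÄÖÜ][\w-]+"

# Limited-context rule: a trigger word followed by one or two capitalised tokens.
RULE = re.compile(
    r"\b(%s)\s+(%s(?:\s+%s)?)"
    % ("|".join(re.escape(t) for t in TRIGGERS), CAP, CAP)
)

def mark_up(text):
    """Wrap the segment after each trigger word in an SGML-style element."""
    out = []
    pos = 0
    for m in RULE.finditer(text):
        tag = TRIGGERS[m.group(1)]
        # Copy everything up to the start of the segment, escaping markup characters.
        out.append(escape(text[pos:m.start(2)]))
        out.append("<%s>%s</%s>" % (tag, escape(m.group(2)), tag))
        pos = m.end(2)
    out.append(escape(text[pos:]))
    return "".join(out)

print(mark_up("Bundeskanzler Helmut Kohl sprach in Bonn."))
```

A rule of this kind looks only at the immediate neighbourhood of a trigger word, which is what keeps the context limited; a realistic rule set would need many more triggers and rule shapes, refined iteratively against the corpus.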



About this article

Cite this article

Rostek, L., Alexa, M. Marking up in TATOE and exporting to SGML. Computers and the Humanities 31, 311–326 (1997). https://doi.org/10.1023/A:1001070608920
