An Experiment with Theme–Rheme Identification

  • Conference paper
Text, Speech and Dialogue (TSD 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8655)

Abstract

In this paper we start from the theory of Functional Sentence Perspective (FSP), developed primarily by Firbas [1] and Svoboda [12], and later also by Sgall et al. [9].

We attempt to formulate and implement a procedure for Czech that automatically recognizes which sentence constituents carry contextually dependent information already known to the addressee (theme), which carry new information (rheme), and which carry information that is neither thematic nor rhematic (transition).

The experimental implementation of the procedure uses tools developed at the NLP Centre, FI MU, in particular the morphological analyzer Majka [17], the disambiguator DESAMB [16] and the parser SET [5].

As a starting data resource we use a small corpus of 120 Czech sentences, which at the moment does not include any free continuous text. The reason is that we do not use syntactically pre-tagged text but perform the syntactic analysis directly with the parser SET. We therefore offer only a very basic evaluation, which captures the main FSP phenomena and shows that the task is feasible.

The toolset developed for the experiment consists of two parts. The first is a chunker, which determines word-order positions from the parse tree of a sentence. The second is an FSP tagger, which implements the procedure: it labels the chunks with tags for what we further call functional elements (e.g. theme proper, transition, rheme proper). An experimental version is available at http://nlp.fi.muni.cz/~xsvobo15/fsp/fsp.html.
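The two-stage design described above (chunks in word order, then functional tags) can be illustrated with a minimal sketch. This is not the authors' procedure: the rule used here (the finite verb is the transition, chunks before it are thematic, chunks after it are rhematic) is a deliberately naive assumption, the names `Chunk` and `tag_fsp` are hypothetical, and the intermediate labels "theme" and "rheme" are assumed for illustration. The input is assumed to be already chunked, so no parser is called.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    """One word-order position of a sentence (hypothetical structure)."""
    text: str
    is_finite_verb: bool = False
    tag: Optional[str] = None


def tag_fsp(chunks: List[Chunk]) -> List[Chunk]:
    """Label chunks with a naive word-order heuristic (an assumption,
    not the rule set used in the paper)."""
    # Find the first chunk marked as the finite verb.
    verb_idx = next(
        (i for i, c in enumerate(chunks) if c.is_finite_verb), None
    )
    if verb_idx is None:
        # Without a finite verb the heuristic does not apply.
        return chunks

    last = len(chunks) - 1
    for i, chunk in enumerate(chunks):
        if i == verb_idx:
            chunk.tag = "transition"
        elif i < verb_idx:
            # The first pre-verbal chunk is treated as theme proper.
            chunk.tag = "theme proper" if i == 0 else "theme"
        else:
            # The final post-verbal chunk is treated as rheme proper.
            chunk.tag = "rheme proper" if i == last else "rheme"
    return chunks


if __name__ == "__main__":
    sentence = [
        Chunk("Petr"),
        Chunk("koupil", is_finite_verb=True),
        Chunk("včera"),
        Chunk("nové auto"),
    ]
    for chunk in tag_fsp(sentence):
        print(f"{chunk.text}\t{chunk.tag}")
```

A real tagger would need contextual information in addition to word order, since a purely positional rule cannot tell whether a constituent is actually known to the addressee.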

References

  1. Firbas, J.: On the problem of non-thematic subjects in contemporary English (English summary of “K otázce nezákladových podmětů v současné angličtině”, ib. pp. 22–42 and 165–173). Časopis pro moderní filologii 39, 171–173 (1957)

  2. Firbas, J.: Functional sentence perspective in written and spoken communication. Cambridge University Press (1992) (reprinted 1995)

  3. Hajičová, E., Sgall, P., Skoumalová, H.: An automatic procedure for topic-focus identification. Journal of Computational Linguistics 21(1), 81–94 (1995)

  4. Karlík, P., Svoboda, A.: Skladba češtiny pro cizince (Czech Syntax for Foreigners). Univerzita J.E. Purkyně, Faculty of Arts, Brno (1982)

  5. Kovář, V., Horák, A., Jakubíček, M.: Syntactic analysis using finite patterns: A new parsing system for Czech. In: Human Language Technology: Challenges for Computer Science and Linguistics, pp. 161–171 (2011)

  6. Mathesius, V.: O tak zvaném aktuálním členění větném (On the so-called functional sentence perspective). Slovo a slovesnost 5, 171–174 (1939)

  7. Mikulová, M., Bémová, A., Hajič, J., Hajičová, E., Havelka, J., Kolářová-Řezníčková, V., Kučová, L., Lopatková, M., Pajas, P., Panevová, J., Razímová, M., Sgall, P., Štěpánek, J., Urešová, Z., Veselá, K., Žabokrtský, Z.: Annotation on the tectogrammatical layer in the Prague Dependency Treebank. Tech. rep., ÚFAL MFF UK, Prague, Czech Republic (2005), http://ufal.mff.cuni.cz/pdt2.0/doc/manuals/en/t-layer/html/index.html

  8. Pala, K., Svoboda, O.: Semi-automatic theme-rheme identification. In: Proceedings of the Raslan Workshop, pp. 39–48. Karlova Studánka (2013)

  9. Sgall, P.: Towards a definition of focus and topic. Prague Bulletin of Mathematical Linguistics 31, 32, 3–25, 24–32 (1979, 1980)

  10. Steinberger, R., Bennett, P.: Automatic recognition of theme, focus and contrastive stress. In: Proceedings of the Conference Focus and NLP (1994)

  11. Svoboda, A.: České slovosledné pozice z pohledu aktuálního členění. Slovo a slovesnost 45, 22–34, 88–103 (1984), http://kramerius.lib.cas.cz/search/i.jsp?pid=uuid:c9de3a32-530d-11e1-1418-001143e3f55c

  12. Svoboda, A.: Kapitoly z funkční syntaxe. In: Spisy pedagogické fakulty v Ostravě. vol. 66 (1989)

  13. Veselá, K., Havelka, J.: Anotování aktuálního členění věty v pražském závislostním korpusu, ÚFAL/CKL TR-2003-20 (2003), http://ufal.mff.cuni.cz/pdt2.0/publications/VeselaHavelkaTR2003.pdf

  14. Zikánová, Š., Týnovský, M.: Identification of topic and focus in Czech: Comparative evaluation on Prague Dependency Treebank. In: Studies in Formal Slavic Phonology, Morphology, Syntax, Semantics and Information Structure (Formal Description of Slavic Languages 7), pp. 343–353. Peter Lang, Frankfurt am Main (2009)

  15. Zikánová, Š., Týnovský, M., Havelka, J.: Identification of topic and focus in Czech: Evaluation of manual parallel annotations. The Prague Bulletin of Mathematical Linguistics (87), 61–70 (2007)

  16. Šmerk, P.: Unsupervised learning of rules for morphological disambiguation. In: Sojka, P., Kopeček, I., Pala, K. (eds.) TSD 2004. LNCS (LNAI), vol. 3206, pp. 211–216. Springer, Heidelberg (2004)

  17. Šmerk, P.: Majka – fast morphological analyzer. In: Proceedings of the Raslan Workshop, pp. 13–16. Masarykova Univerzita, Brno (2009)

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Pala, K., Svoboda, O. (2014). An Experiment with Theme–Rheme Identification. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2014. Lecture Notes in Computer Science, vol 8655. Springer, Cham. https://doi.org/10.1007/978-3-319-10816-2_34

  • DOI: https://doi.org/10.1007/978-3-319-10816-2_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10815-5

  • Online ISBN: 978-3-319-10816-2

  • eBook Packages: Computer Science, Computer Science (R0)
