ABSTRACT
Over the centuries, texts of all genres have been connected by quotations, allusions, idioms, stylistic imitations and much more. Understanding literature means understanding these kinds of intertextual relations. The goal of the Göttingen sub-project of eTRACES, an interdisciplinary project of humanists and computer scientists, is to enable research on this essential part of literary studies. We are creating a digital working environment, a tool called GERTRUDE (Göttingen E-Research: Text Re-Use for Digital Editions), in an effort to determine whether and to what degree such a tool can support a researcher in finding, marking and annotating intertextual relations:
- Especially in large text corpora, looking for intertextual relations can be very time-consuming. Text-mining algorithms are therefore integrated to detect so-called "textual re-use", in the hope of also finding interesting textual relations that are not yet known or expected (serendipity effect).
- For referencing an exact text passage, the TextGrid Citation Schema is used, because it enables us to mark up a segment of text down to the granularity of letters and to address different editions of a text.
- The possibilities and limitations of annotating or even evaluating a text passage in its relation to others, its form, function and "degree" of intertextuality will be researched by building this tool as a crowd-sourcing environment: it is usable by everyone interested and is also integrated into university courses, where students are encouraged to use it. This makes it possible to compare and discuss the results as well as the usability, possibilities and limitations of the tool.
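To make the first two points more concrete, the following is a minimal sketch of how textual re-use might be detected and how a reused passage might be addressed at character level. It is not the algorithm used in GERTRUDE or eTRACES: it swaps in a simple word n-gram overlap measure, and the function names `word_ngrams`, `reuse_score` and `char_span` are hypothetical. The offsets it returns are plain character positions, not the TextGrid Citation Schema. The candidate sentences are invented for illustration.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of lowercased word n-grams in text (hypothetical helper)."""
    tokens = re.findall(r"\w+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def reuse_score(source, candidate, n=3):
    """Share of n-grams of the shorter text that also occur in the other text.

    Assumption: n-gram overlap stands in for a real text re-use algorithm.
    """
    src = word_ngrams(source, n)
    cand = word_ngrams(candidate, n)
    if not src or not cand:
        return 0.0
    return len(src & cand) / min(len(src), len(cand))


def char_span(text, passage):
    """Return (start, end) character offsets of passage in text, or None."""
    start = text.find(passage)
    return None if start < 0 else (start, start + len(passage))


# Psalm 23:1 in the Luther Bible, and two invented candidate sentences.
verse = "Der HERR ist mein Hirte; mir wird nichts mangeln."
quoting = "Er sprach leise: Der Herr ist mein Hirte, mir wird nichts mangeln, und schwieg."
unrelated = "Die Sonne schien hell über dem stillen Tal."

print(reuse_score(verse, quoting))    # high overlap: the verse is quoted
print(reuse_score(verse, unrelated))  # no shared trigrams
print(char_span(quoting, "Der Herr ist mein Hirte"))
```

Normalizing case and punctuation before comparison lets the sketch match a quotation despite spelling differences such as "HERR" versus "Herr". Looser forms of intertextuality, such as allusions or stylistic imitation, would not be found this way.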
Our approach is based on German literature from the 1500s to the 1900s that is part of a BMBF-sponsored text corpus available online under a Creative Commons license.
In the future it will be possible to use other corpora, even in languages other than German, if the algorithms are adapted.
Because of its great influence, we chose the Luther Bible and its re-use in German literature as a use case.
Index Terms
- Biblical intertextuality in a digital world: the tool GERTRUDE