The Multilingual Student Translation corpus: a resource for translation teaching and research

Granger, Sylviane; Lefer, Marie-Aude

doi:10.1007/s10579-020-09485-6

Project Notes
Published: 25 January 2020

Language Resources and Evaluation, Volume 54, pages 1183–1199 (2020)

Abstract

The Multilingual Student Translation (MUST) corpus is a corpus of translations produced by foreign language learners or trainee translators collected collaboratively by a large number of partner teams internationally. The corpus represents a prime example of community sourcing, as the data are collected and shared by the members of the MUST network. Two key characteristics of the corpus are that it involves a large number of language pairs and that each text is accompanied by a rich set of standardized metadata related to the source texts, the translation tasks and the students. The web interface on which the corpus is stored allows the data to be aligned and annotated with a purpose-built translation annotation system. The resulting corpus data lend themselves to a range of applications (translator training, materials design, pedagogical lexicography) and can also be used to advance empirical research in corpus-based translation studies.

Notes

https://uclouvain.be/en/research-institutes/ilc/cecl/must.html.
For a list of partners, see https://uclouvain.be/en/research-institutes/ilc/cecl/must-partners.html.
The project-specific interface, Hypal4MUST, is not available outside the MUST project but the generic Hypal interface can be used by researchers outside MUST (see https://hypal.eu).
The initial basis for the MUST genre taxonomy is Lee (2001)’s categorization of the genres included in the British National Corpus, supplemented with genres identified by the MUST community as being relevant to the project.
For this reason, student translations collected outside the MUST project cannot be included in MUST.

References

Alfuraih, R. F. (2019). The undergraduate learner translator corpus: a new resource for translation studies and computational linguistics. Language Resources & Evaluation. https://doi.org/10.1007/s10579-019-09472-6.
Baker, M. (1993). Corpus linguistics and translation studies. Implications and applications. In M. Baker, G. Francis, & E. Tognini-Bonelli (Eds.), Text and Technology. In Honour of John Sinclair (pp. 233–250). Amsterdam: John Benjamins.
Baker, M. (1995). Corpora in translation studies: An overview and some suggestions for future research. Target, 7(2), 223–243.
Bowker, L., & Bennison, P. (2002). Translation tracking system: A tool for managing translation archives. Proceedings of the Third International Conference on Language Resources and Evaluation (pp. 503–507). Las Palmas, Canary Islands, 29–31 May 2002.
Bowker, L., & Bennison, P. (2003). Student translation archive: design, development and application. In F. Zanettin, S. Bernardini, & D. Stewart (Eds.), Corpora in Translator Education (pp. 103–117). London & New York: Routledge.
Branzov, T. (2016). Community-sourcing in virtual societies. Serdica Journal of Computing, 10(3–4), 263–284.
Castagnoli, S. (2009). Regularities and variations in learner translations: A corpus-based study of conjunctive explicitation. Unpublished PhD Thesis. Pisa University.
Castagnoli, S., Ciobanu, D., Kunz, K., Kübler, N., & Volanschi, A. (2011). Designing a learner translator corpus for training purposes. In N. Kübler (Ed.), Corpora, Language, Teaching, and Resources: From Theory to Practice (pp. 221–248). Bern: Peter Lang.
Chesterman, A. (2007). Similarity analysis and the translation profile. Belgian Journal of Linguistics, 21, 53–66.
Cosme, C. (2008). Participle clauses in learner English: the role of transfer. In G. Gilquin, S. Papp, & M. B. Díez-Bedmar (Eds.), Linking Up Contrastive and Learner Corpus Research (pp. 177–198). Amsterdam & New York: Rodopi.
Dagneaux, E., Denness, S., & Granger, S. (1998). Computer-aided error analysis. System: An International Journal of Educational Technology and Applied Linguistics, 26(2), 163–174.
Díaz-Negrillo, A., & Fernández-Domínguez, J. (2006). Error tagging systems for learner corpora. Revista Española de Lingüística Aplicada, 19, 83–102.
Espunya, A. (2014). The UPF learner translation corpus as a resource for translator training. Language Resources and Evaluation, 48, 33–43.
Fictumova, J., Obrusník, A., & Stepankova, K. (2017). Teaching specialized translation: Error-tagged translation learner corpora. Sendebar, 28, 209–241.
Florén, C. (2006). ENTRAD, an English Spanish parallel corpus created for the teaching of translation. Paper presented at the 7th Teaching and Language Corpora Conference (TALC 2006).
Gaspari, F., & Bernardini, S. (2010). Comparing non-native and translated language: Monolingual comparable corpora with a twist. In R. Xiao (Ed.), Using Corpora in Contrastive and Translation Studies (pp. 215–234). Newcastle: Cambridge Scholars Publishing.
Gillard, P., & Gadsby, A. (1998). Using a learners’ corpus in compiling ELT dictionaries. In S. Granger (Ed.), Learner English on Computer (pp. 159–171). London & New York: Addison Wesley Longman.
Graedler, A.-L. (2013). NEST—A corpus in the brooding box. Studies in Variation, Contacts and Change in English, 13. http://www.helsinki.fi/varieng/series/volumes/13/graedler/.
Granger, S. (1993). The international corpus of learner English. In J. Aarts, P. de Haan, & N. Oostdijk (Eds.), English Language Corpora: Design, Analysis and Exploitation (pp. 57–69). Amsterdam & Atlanta: Rodopi.
Granger, S. (1994). The learner corpus: A revolution in applied linguistics. English Today, 10(3), 25–33.
Granger, S. (1996). From CA to CIA and back: An integrated contrastive approach to computerized bilingual and learner corpora. In K. Aijmer, B. Altenberg, & M. Johansson (Eds.), Languages in Contrast: Text-based Cross-Linguistic Studies. Lund Studies in English 88 (pp. 37–51). Lund: Lund University Press.
Granger, S. (2003). Error-tagged learner corpora and CALL: A promising synergy. CALICO, 20(3), 465–480.
Granger, S., & Lefer, M.-A. (2016). From general to learners’ bilingual dictionaries: Towards a more effective fulfilment of advanced learners’ phraseological needs. International Journal of Lexicography, 29(3), 279–295.
Halverson, S. (2017). Gravitational pull in translation: Testing a revised model. In G. De Sutter, M.-A. Lefer, & I. Delaere (Eds.), Empirical Translation Studies: New Methodological and Theoretical Traditions. Trends in Linguistics. Studies and Monographs (pp. 9–45). Berlin: De Gruyter Mouton.
Hasselgård, H., & Johansson, S. (2011). Learner corpora and contrastive interlanguage analysis. In F. Meunier, S. De Cock, G. Gilquin, & M. Paquot (Eds.), A Taste for Corpora. In honour of Sylviane Granger (pp. 33–62). Amsterdam: John Benjamins.
Johansson, S. (2007). Seeing through Multilingual Corpora: On the use of corpora in contrastive studies. Amsterdam and Philadelphia: John Benjamins.
Kruger, H. (2018). Expanding the third code: Corpus-based studies of constrained communication and language mediation. In S. Granger, M.-A. Lefer, & L. Penha-Marion (Eds.), Book of Abstracts. Using Corpora in Contrastive and Translation Studies Conference (5th edition). CECL Papers 1 (pp. 9–12). Louvain-la-Neuve: Centre for English Corpus Linguistics/Université catholique de Louvain.
Kübler, N. (2008). A comparable Learner Translator Corpus: Creation and use. LREC 2008 Workshop on Comparable Corpora (pp. 73–78).
Kutuzov, A., & Kunilovskaya, M. (2014). Russian learner translator corpus: design, research potential and applications. In P. Sojka, A. Horák, I. Kopeček, & K. Pala (Eds.), Text, Speech and Dialogue. Lecture Notes in Computer Science (pp. 315–323). Berlin: Springer.
Lanstyák, I., & Heltai, P. (2012). Universals in language contact and translation. Across Languages and Cultures, 13(1), 99–121.
Lapshinova-Koltunski, E. (2013). VARTRA: A comparable corpus for analysis of translation variation. Proceedings of the 6th Workshop on Building and Using Comparable Corpora (pp. 77–86). Sofia, Bulgaria, 8 August 2013.
Laviosa, S. (1998). The English comparable corpus: A resource and a methodology. In L. Bowker, M. Cronin, D. Kenny, & J. Pearson (Eds.), Unity in Diversity? Current Trends in Translation Studies. Manchester: St. Jerome Publishing.
Lee, D. Y. W. (2001). Genres, registers, text types, domains, and styles: Clarifying the concepts and navigating a path through the BNC jungle. Language Learning & Technology, 5(3), 37–72.
Lefer, M.-A. (forthcoming). Parallel corpora. In M. Paquot, & S. Th. Gries (Eds.), Practical Handbook of Corpus Linguistics. Berlin: Springer.
Lüdeling, A., & Hirschmann, H. (2015). Error annotation systems. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge Handbook of Learner Corpus Research (pp. 135–157). Cambridge: Cambridge University Press.
Macken, L., De Clercq, O., & Paulussen, H. (2011). Dutch Parallel Corpus: A Balanced Copyright-cleared Parallel Corpus. Meta, 56(2), 374–390.
Maingay, S., & Rundell, M. (1987). Anticipating learners’ errors—implications for dictionary writers. In A. P. Cowie (Ed.), The Dictionary and the Language Learner (pp. 128–135). Tübingen: Niemeyer.
Obrusník, A. (2013). A hybrid approach to parallel text alignment. Bachelor thesis. Masaryk University.
Obrusník, A. (2014). Hypal: A User-Friendly Tool for Automatic Parallel Text Alignment and Error Tagging. Eleventh International Conference Teaching and Language Corpora (pp. 67–69), Lancaster, 20–23 July 2014.
Štěpánková, K. (2014). Learner Translation Corpus: CELTraC (Czech-English Learner Translation Corpus). Bachelor’s Diploma Thesis. Masaryk University.
Uzar, R. S. (2002). A corpus methodology for analysing translation. In S.E.O. Tagnin (Ed.), Cadernos de Tradução: Corpora e Tradução (pp. 235–263). Florianópolis: NUT, 1(9).
Uzar, R., & Waliński, J. (2001). Analysing the fluency of translators. International Journal of Corpus Linguistics, 6, 155–166.
Wible, D., Kuo, C.-H., Chien, F.-Y., Liu, A., & Tsao, N.-L. (2001). A Web-based EFL writing environment: Integrating information for learners, teachers, and researchers. Computers & Education, 37, 297–315.
Wurm, A. (2016). Presentation of the KOPTE Corpus and Research Project. https://www.academia.edu/24012369/Presentation_of_the_KOPTE_Corpus_and_Research_Project.

Acknowledgements

We would like to thank the MUST local coordinators—Silvia Bernardini, Łucja Biel, Mario Cal Varela, Cem Can, Sara Castagnoli, Madalina Chitez, Elisa Corino, Julie Deconinck, Gert De Sutter, Margherita Dore, Gaetano Falco, Jonė Grigaliūnienė, Sandra Louise Halverson, Ruska Ivanovska-Naskova, Marlen Izquierdo, Xu Jiajin, Gurgen Karapetyan, Natalie Kübler, Efi Lamprou, Magnus Levin, Adriana Mezeg, Christine Michaux, Marina Morbiducci, Adriane Orenha Ottaiano, Adriana Orlandi, Heloísa Orsi Koch Delgado, Jun Pan, Anastasia Parianou, Gill Philip, Éric Poirier, Juan Pedro Rica Peromingo, Carola Strobl, Jenny Ström Herold, Olympia Tsaknaki, Jurgita Vaičenonienė, Susana Valdez, Heidi Verplaetse, Andrea Wurm—for contributing their translation data to the MUST project as well as for their helpful and enthusiastic support.

We would also like to thank the two anonymous reviewers for their helpful suggestions and comments.

Author information

Authors and Affiliations

Centre for English Corpus Linguistics, University of Louvain, Louvain-la-Neuve, Belgium
Sylviane Granger & Marie-Aude Lefer

Corresponding author

Correspondence to Sylviane Granger.

About this article

Cite this article

Granger, S., Lefer, MA. The Multilingual Student Translation corpus: a resource for translation teaching and research. Lang Resources & Evaluation 54, 1183–1199 (2020). https://doi.org/10.1007/s10579-020-09485-6

Published: 25 January 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s10579-020-09485-6
