Constructions in Parallel Corpora: A Quantitative Approach

  • Conference paper
Computational and Corpus-Based Phraseology (EUROPHRAS 2017)

Abstract

The primary goal of the present study is to find an adequate method for the quantitative analysis of empirical data obtained from parallel corpora. Such a method is particularly important for fixed constructions that possess some degree of idiomaticity and language specificity. Our data consist of the Russian construction дело в том, что and its parallels in English, German and Swedish. This construction, which appears to present no difficulty for translation, is in fact language-specific: it displays a large number of different parallels (translation equivalents) in other languages and possesses a complex semantic structure. The configuration of semantic elements comprising the content plane of this construction is unique. The empirical data have been collected via the corpus query system Sketch Engine (subcorpus OPUS2 Russian) and from the Russian National Corpus (RNC). We propose to use the Herfindahl index as a tool for quantitative analysis in order to measure the degree of uniformity in the frequency distribution of the various translations of the construction under investigation. This tool is not universal and does not enable us to answer all the questions that arise in connection with determining the specificity of language units. However, it helps to obtain more objective results and to refine the quantitative analysis of idiomatic constructions on the basis of corpus data.
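As an illustration of the measure named in the abstract, the following is a minimal sketch of the Herfindahl index in its standard form, the sum of squared shares, applied to translation frequencies. The counts are invented for demonstration and are not taken from the study. The normalized variant is an added assumption for comparing languages with different numbers of equivalents, and the abstract does not say whether the authors use it.

```python
def herfindahl_index(frequencies):
    """Standard Herfindahl index: the sum of squared shares.

    Equals 1/n when n equivalents are equally frequent and
    approaches 1 when a single equivalent dominates.
    """
    counts = [f for f in frequencies if f > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one positive frequency is required")
    return sum((f / total) ** 2 for f in counts)


def normalized_herfindahl(frequencies):
    """Rescale the index to [0, 1] so that inventories of different
    sizes can be compared (an assumption, not the paper's stated method).
    """
    counts = [f for f in frequencies if f > 0]
    n = len(counts)
    if n < 2:
        return 1.0  # a single equivalent is maximally concentrated
    h = herfindahl_index(counts)
    return (h - 1 / n) / (1 - 1 / n)


# Hypothetical hit counts for three English equivalents (illustrative only).
hypothetical_hits = {
    "the thing is that": 50,
    "the fact is that": 30,
    "the point is that": 20,
}

h = herfindahl_index(hypothetical_hits.values())
h_norm = normalized_herfindahl(hypothetical_hits.values())
print(f"H = {h:.4f}, normalized H = {h_norm:.4f}")
```

With shares of 0.5, 0.3 and 0.2 the index is 0.25 + 0.09 + 0.04 = 0.38. A lower value means the translations are spread more evenly across many equivalents, which is the pattern the abstract associates with a language-specific construction.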

This paper is based on work supported by the Russian Science Foundation (RSF) under Grant 16-48-03006 “Semantic Analysis of Translated Texts for Comparative Cultural Studies and Cultural Specificity in Language Learning”.

Notes

  1. Literally: "the thing is that".

  2. Literally: "the problem is that" and "the truth is that".

  3. For more detail see Sitchinava (2016).

  4. Quantitative methods for analyzing fixed expressions in monolingual corpora are used in studies such as Zhu and Fellbaum (2015) and Steyer (2015).

  5. Figures in brackets indicate the total number of hits.

References

  • Buntman, N.V., Zaliznjak, A.A., Zatsman, I.M., Kruzhkov, M.G., Loshchilova, E.J., Sitchinava, D.V.: Informacionnye texnologii korpusnyx issledovanij: principy postroenija kross-lingvističeskix baz dannyx (Informational technology in corpus-based studies: towards a cross-linguistic database). Inf. Appl. 8(2), 98–110 (2014)

  • Dobrovol’skij, D., Pöppel, L.: Diskursivnaja konstrukcija N в том, что i ee paralleli v drugix jazykax: kontrastivnoe korpusnoe issledovanie (The discursive construction N в том, что and its correlates in other languages: A contrastive corpus analysis). Novosibirsk State Pedagogical Univ. Bull. 6, 164–175 (2016a)

  • Dobrovol’skij, D.O., Pöppel, L.: The discursive construction дело в том, что and its parallels in other languages: A contrastive corpus study. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue 2016”, issue 15(22), pp. 126–137. RGGU, Moscow (2016b)

  • Günthner, S.: Die “die Sache/das Ding ist”-Konstruktion im gesprochenen Deutsch – eine interaktionale Perspektive auf Konstruktionen im Gebrauch. In: Stefanowitsch, A., Fischer, K. (eds.) Konstruktionsgrammatik II. Von der Konstruktion zur Grammatik, pp. 157–177. Stauffenburg, Tübingen (2008)

  • Sitchinava, D.: Parallel corpora as a source of defining language-specific lexical items. In: Margalitadze, T., Meladze, G. (eds.) Proceedings of the XVII EURALEX International Congress: Lexicography and Linguistic Diversity, pp. 394–401. Ivane Javakhishvili Tbilisi University Press, Tbilisi (2016)

  • Šmelev, A.D.: Russkaja jazykovaja model’ mira. Materialy k slovarju (The Russian language picture of the world). Jazyki slavjanskoj kul’tury, Moscow (2002)

  • Šmelev, A.D.: Jazyk i kul’tura: est’ li točki soprikosnovenija? (Language and culture: do they have points of interaction?). In: Proceedings of the V.V. Vinogradov Institute of Russian Language, issue 1, pp. 36–116. Russian Language Institute, Moscow (2014)

  • Šmelev, A.D.: Russkie lingvospecifičnye leksičeskie edinicy v parallel’nyx korpusax: vozmožnosti issledovanija i “podvodnye kamni” (Russian language-specific lexical units in parallel corpora: prospects of investigation and “pitfalls”). In: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue 2015”, issue 14(21), vol. 1, pp. 584–594. RGGU, Moscow (2015)

  • Steyer, K.: Patterns. Phraseology in a state of flux. Int. J. Lexicogr. 28(3), 279–298 (2015)

  • Wierzbicka, A.: Semantics, Culture, and Cognition. Universal Human Concepts in Culture-Specific Configurations. Oxford University Press, Oxford (1992)

  • Wierzbicka, A.: Semantics: Primes and Universals. Oxford University Press, Oxford (1996)

  • Zaliznjak, A.A.: Lingvospecifičnye edinicy russkogo jazyka v svete kontrastivnogo korpusnogo analiza (Russian language-specific words as an object of contrastive corpus analysis). In: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue 2015”, issue 14(21), vol. 1, pp. 683–695. RGGU, Moscow (2015)

  • Zaliznjak, A.A., Levontina, I.B., Šmelev, A.D.: Ključevye idei russkoj jazykovoj kartiny mira (Key ideas of the Russian language picture of the world). Jazyki slavjanskoj kul’tury, Moscow (2005)

  • Zaliznjak, A.A., Levontina, I.B., Šmelev, A.D.: Konstanty i peremennye russkoj jazykovoj kartiny mira (Constants and variables of the Russian language picture of the world). Jazyki slavjanskoj kul’tury, Moscow (2012)

  • Zhu, F., Fellbaum, C.: Quantifying fixedness and compositionality in Chinese idioms. Int. J. Lexicogr. 28(3), 338–350 (2015)

Author information

Corresponding author

Correspondence to Dmitrij Dobrovol’skij.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Dobrovol’skij, D., Pöppel, L. (2017). Constructions in Parallel Corpora: A Quantitative Approach. In: Mitkov, R. (ed.) Computational and Corpus-Based Phraseology. EUROPHRAS 2017. Lecture Notes in Computer Science, vol 10596. Springer, Cham. https://doi.org/10.1007/978-3-319-69805-2_4

  • DOI: https://doi.org/10.1007/978-3-319-69805-2_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69804-5

  • Online ISBN: 978-3-319-69805-2

  • eBook Packages: Computer Science (R0)
