Prof. Dr. Roland Meyer

Humboldt-Universität zu Berlin

Institut für Slawistik und Hungarologie


  • West and East Slavic synchronic linguistics
  • Corpus linguistics and language technology; the relationship between theory and empirical research
  • Diachronic syntax
  • Case and grammatical functions; argument structure
  • Intonation
  • Information structure and sentence mood
  • Theories of grammar
  • German/Slavic language contact, areal linguistics


A03 "Expressive" dislocation and register in Czech vs. Russian


Unter den Linden 6, 10099 Berlin

030 / 2093-73322


Publications and Presentations


  • Meyer, Roland  (2023) Control Constructions  In: Grenoble, L. et al. (eds.): Brill’s Encyclopedia of Slavic Languages and Linguistics [ViVo]
  • Meyer, Roland  (2023) Raising Constructions  In: Grenoble, L. et al. (eds.): Brill’s Encyclopedia of Slavic Languages and Linguistics [ViVo]
  • Meyer, Roland  (2023) Detecting Authorship, Hands, and Corrections in Historical Manuscripts. A Mixed-Methods Approach Towards the Unpublished Writings of an 18th Century Czech Emigré Community in Berlin  In: Schneider, B. et al. (eds.). Mixed Methods in the Humanities. Digital Humanities Research, transcript. [ViVo]
  • Demian, Christoph; Buchmüller, Olga; Meyer, Roland; Szucsich, Luka  (2022) Syntactic Complexity and Register in Russian [ViVo]
    Syntactic complexity is often thought to interact systematically with register (Biber and Gray 2010), ultimately because both are closely connected to processing load (e.g., Liu 2008). At the same time, it is far from clear how exactly syntactic complexity should be defined, and which aspect of it is most sensitive to register distinctions. The present paper compares two basic measures of syntactic complexity as applied to a corpus of Russian: (a) a simple frequency measure of clausal subordination, and (b) a measure of the internal complexity of dependency trees, the average dependency distance (Liu 2008; Proisl et al. 2019). We show that traditional preconceptions about the amount of clausal subordination per register are often unwarranted, and that the frequency of clausal subordination shows a very different register profile from dependency-based complexity measures. The classification of registers in Russian is itself a matter of debate. The traditional and still most widespread approach relies on an inventory of so-called functional styles, which are distinguished by a mixture of situational, content-based, and communicative-intentional characteristics (e.g., Warditz 2019). These found their way into tagged corpora such as the one we used, the 1.25 M-token Russkij standart (“Russian National Corpus” 2003), which has been hand-corrected for part-of-speech and grammatical tagging. (Note that functional styles are called spheres in this corpus and in what follows.) While a methodologically sound register analysis is still in preparation, we can already use some uncontroversial register-related distinctions, such as spoken vs. written mode, fictional vs. technical prose, and scientific prose vs. official announcements. Several methodological precautions must be mentioned: (i) We systematically excluded punctuation from token counts and from the leaves of dependency trees, which greatly improved the adequacy of our measures.
    (ii) Since text lengths differ dramatically across the spheres under consideration, we could not simply compare relative frequencies of types (Evert 2006, among many others). Instead, we approximated the frequencies of (lexical and part-of-speech) types by drawing a large number (50) of equally sized random samples of 4K tokens per sphere. The figures below depict the distribution of frequencies over these samples. (iii) The correctness of tagging and parsing by UDPipe (Straka and Straková 2017) was inspected systematically; about 94% of the parses were found to be correct. First, consider the distribution of the subordinating complementizer čto ‘that’ across spheres (fig. 1). In line with Biber’s (1988) observations for English and contrary to Kožina (2011) for Russian, the spoken subcorpora, especially oral public communication, show the highest relative frequency of this most widespread subordinator. This finding generalizes to all subordinating sentence connectors (tagged SCONJ; cf. fig. 2), which makes čto more or less prototypical. Second, official and business communication, technical documents, and scientific prose are located at the other end of the scale, containing relatively few hypotactic structures. Fictional and private spoken communication take a medium position. We tentatively attribute this distribution to the net effect of a factor [±spoken] and a factor [±narrative], with nonfictional texts being less narrative in character than fictional texts. It is also plausible that fictional texts and public oral statements tend to contain more expressions of attitude and perception, which serve as embedding predicates for čto-clauses. Third, we observed that the frequency of čto-complementizers across spheres formed an almost perfect mirror image of the frequency of nouns (fig. 3). We attribute this to the two acting, at least partially, as complementary variants of a variable.
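The sampling step described in (ii) can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name `sampled_relative_frequencies`, the fixed seed, and the choice to sample individual tokens without replacement are assumptions. The abstract itself only specifies 50 equally sized random samples of 4K tokens per sphere.

```python
import random
from typing import List, Sequence


def sampled_relative_frequencies(
    tags: Sequence[str],
    target: str,
    n_samples: int = 50,      # number of samples per sphere, as in the abstract
    sample_size: int = 4000,  # tokens per sample, as in the abstract
    seed: int = 0,            # assumption: fixed seed for reproducibility
) -> List[float]:
    """Relative frequency of `target` in each of several equally sized samples.

    `tags` holds one tag (or lemma) per token of a single sphere,
    with punctuation already removed.
    """
    rng = random.Random(seed)
    freqs = []
    for _ in range(n_samples):
        # Assumption: tokens are drawn individually, without replacement.
        sample = rng.sample(tags, sample_size)
        freqs.append(sample.count(target) / sample_size)
    return freqs
```

Comparing the resulting lists of 50 values per sphere (for example as box plots) keeps the comparison independent of how much text each sphere contributes to the corpus.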
    As a dependency-based measure of syntactic complexity, we chose average dependency distance (Liu 2008; cf. Proisl et al. 2019 for evaluation), which averages the length of the dependency links in each sentence. Interestingly, this measure showed a profile across spheres that differed clearly from the clausal subordination measure: here, all three spoken subcorpora are at the low end of the scale, while all written subcorpora more or less pool in the middle (fig. 4). Educational/scientific texts are among the highest in dependency-based complexity. This finding supports Biber and Gray’s (2010) conclusion that English academic writing is structurally complex, but not in the sense of frequent clausal subordination. By contrast, they found subordinate clauses to be more common in conversation, which is in line with our finding above on public oral communication in Russian. A potential problem for most dependency-based measures is their close correlation with sentence length (Proisl et al. 2019). This is especially problematic here, because sentence length itself varies across registers and thus confounds structural complexity. Furthermore, spoken subcorpora rely on (loosely defined) communicative units rather than on sentences delimited by punctuation. To strengthen our conclusions, we therefore plan to run a careful comparison of samples of equally long sentences across spheres, in order to reveal the effect of structural complexity proper.
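For illustration, average dependency distance can be computed per sentence roughly as below. This is a sketch under stated assumptions rather than the authors' implementation: the `(id, head, upos)` token tuples mimic three CoNLL-U columns, the function name is hypothetical, and renumbering the tokens after punctuation is dropped is one possible reading of "excluding punctuation from the leaves of dependency trees".

```python
from typing import List, Tuple

Token = Tuple[int, int, str]  # (id, head, upos); head == 0 marks the root


def average_dependency_distance(sentence: List[Token]) -> float:
    """Mean absolute linear distance between each dependent and its head."""
    # Drop punctuation, then renumber the remaining tokens
    # (assumption: punctuation does not count towards distance).
    kept = [tok for tok in sentence if tok[2] != "PUNCT"]
    position = {tok_id: i for i, (tok_id, _, _) in enumerate(kept, start=1)}
    distances = [
        abs(position[tok_id] - position[head])
        for tok_id, head, _ in kept
        if head != 0 and head in position  # skip the root (it has no head)
    ]
    return sum(distances) / len(distances) if distances else 0.0


# "Ja znaju, čto on prišël." ('I know that he came.')
example = [
    (1, 2, "PRON"),   # Ja      -> znaju
    (2, 0, "VERB"),   # znaju   (root)
    (3, 6, "PUNCT"),  # ,
    (4, 6, "SCONJ"),  # čto     -> prišël
    (5, 6, "PRON"),   # on      -> prišël
    (6, 2, "VERB"),   # prišël  -> znaju
    (7, 2, "PUNCT"),  # .
]
print(average_dependency_distance(example))  # (1 + 2 + 1 + 3) / 4 = 1.75
```

Because longer sentences allow longer links, values from this function are only comparable across spheres once sentence length is controlled for, which is the confound the abstract points out.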

    Biber, D. (1988). Variation across speech and writing. Cambridge UP.
    Biber, D., & Gray, B. (2010). Challenging stereotypes about academic writing: Complexity, elaboration, explicitness. Journal of English for Academic Purposes, 9(1), 2–20.
    Evert, S. (2006). How random is a corpus? The library metaphor. Zeitschrift für Anglistik und Amerikanistik, 54(2), 177–190.
    Kožina, M. N. (2011). Stilistika russkogo jazyka (4th ed.). Nauka.
    Liu, H. (2008). Dependency distance as a metric of language comprehension difficulty. Journal of Cognitive Science, 9, 159–191.
    Proisl, T., Konle, L., Evert, S., & Jannidis, F. (2019). Dependenzbasierte syntaktische Komplexitätsmaße. In P. Sahle (Ed.), DHd 2019 Digital Humanities: multimedial & multimodal. Konferenzabstracts (pp. 270–273).
    Russian national corpus: Offline disambiguated version of the corpus. (2003). http://ruscorpora.ru/new/
    Straka, M., & Straková, J. (2017). Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe. Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 88–99.
    Warditz, V. (2019). Varianz im Russischen: von funktionalstilistischer zur soziolinguistischen Perspektive. Peter Lang.
  • Alexiadou, Artemis; Lüdeling, Anke; Adli, Aria; Donhauser, Karin; Dreyer, Malte; Egg, Markus; Feulner, Anna Helene; Gagarina, Natalia; Hock, Wolfgang; Jannedy, Stefanie; Kammerzell, Frank; Knoeferle, Pia; Krause, Thomas; Krifka, Manfred; Kutscher, Silvia; Lütke, Beate; McFadden, Thomas; Meyer, Roland; Mooshammer, Christine; Müller, Stefan; Maquate, Katja; Norde, Muriel; Sauerland, Uli; Szucsich, Luka; Verhoeven, Elisabeth; Waltereit, Richard; Wolfsgruber, Anne; Zeige, Lars Erik  (2020) Register: Language Users’ Knowledge of Situational-Functional Variation  In: REALIS: Register Aspects of Language in Situation [DOI] [PDF] [ViVo]
    The Collaborative Research Center 1412 “Register: Language Users’ Knowledge of Situational-Functional Variation” (CRC 1412) investigates the role of register in language, focusing in particular on what constitutes a language user’s register knowledge and which situational-functional factors determine a user’s choices. This paper is an extract from the frame text of the proposal for CRC 1412, which was submitted to the Deutsche Forschungsgemeinschaft in 2019 and passed a successful on-site evaluation in the same year. CRC 1412 then started its work on January 1, 2020. The theoretical part of the frame text gives an extensive overview of the theoretical and empirical perspectives on register knowledge from the viewpoint of 2019. Owing to the highly collaborative effort of all PIs involved, the frame text is unique in the scope of its coverage of register research, encompassing register-relevant aspects from variationist approaches, psycholinguistics, grammatical theory, acquisition theory, historical linguistics, phonology, phonetics, typology, corpus linguistics, and computational linguistics, as well as qualitative and quantitative modeling. Although our positions and hypotheses have developed further since its submission, the frame text is still a vital resource as a compilation of state-of-the-art register research and a documentation of the start of CRC 1412. The theoretical part without administrative components therefore presents an ideal starter publication to kick off the CRC’s publication series REALIS. For an overview of the projects and more information on the CRC, see
  • Meyer, Roland  (2020) Die tschechischen Wenkerbögen: Deutsch und seine Kontaktsprachen in der Dokumentation der Wenker-Materialien  In: Minderheitensprachen und Sprachminderheiten [ViVo]
  • Tikhonov, Aleksej; Meyer, Roland; Müller, K.  (2020) LiViTo: Linguistic and Visual features Tool for assisted analysis of historic manuscripts  In: Proceedings of the 12th Language Resources and Evaluation Conference [ViVo]
  • Presentations