A06
Modeling register variation across languages

Drawing on a language sample encompassing German, Yucatec Maya, Persian, Southern Kurdish, and Javanese, A06 investigates how typological properties shape the possibilities for register variation and gathers insights from multilingual contact situations, where contact-induced grammatical variants as well as code-switching can differ by register. To this end, A06 analyzes object drop, forms of address and politeness, plural marking, and TAM marking. We draw empirical results primarily from the newly built Lang*Reg corpus, to which we will add results from perception studies and semantic fieldwork, working towards a cross-linguistically and cross-culturally informed model of register.

Members

Project leader

Prof. Dr. Aria Adli

Romanisches Seminar
Universität zu Köln

aria.adli@uni-koeln.de


Former members

Vahid Mortezapour

Romanisches Seminar
Universität zu Köln



Project in Phase I

Project Title for Phase I

Disentangling cross-linguistic and language-specific aspects of register variation

Project Description for Phase I

Overall question

This project aims to better understand how register knowledge relates to grammatical aspects of linguistic knowledge. It does so by taking a comparative perspective and addressing the universal and language-specific aspects of register variation. We will test our hypotheses by directly comparing three typologically different languages (Persian, German, Yucatec Maya) with widely different register distinctions through the parallel application of the same methods.

Languages vary greatly in register diversity. Some languages (e.g. Persian) show more salient differences between registers than others (e.g. German), even though both are used in spoken as well as written form. Still other languages (e.g. Yucatec Maya) are mainly used for oral communication, with only incipient written use.

Research goals

First, we will investigate cross-linguistic vs. language-specific properties of register (Research goal 1), tackling the question of which aspects of syntactic variation are cross-linguistically associated with register and which aspects are language-specific.

Second, we will concentrate on the impact of differences in register diversity and normative aspects on cross-linguistic similarities and differences in register variation (Research goal 2). Third, in order to disentangle language-specific and cross-linguistic components of register variation, we will focus on syntactic phenomena related to the encoding of information structure that are likely to be register-sensitive (Research goal 3).

The phenomena to be considered for these three research goals include (a) word order operations reducing syntactic compactness, such as right- and left-dislocations, and (b) the choice of referential expressions as pronominal or null. More specifically, we will investigate (i) whether there is a cross-linguistic association of structural devices that reduce syntactic compactness. To this end, we will compare informal spontaneous speech with formal speech and written language, where we expect existing word order variation to be more clearly restricted to positions within clausal boundaries. We will also investigate (ii) to what extent variability in the use of referential expressions differs by register within and across languages.

Fourth, since cross-linguistic studies on register variation are still rare, we will develop methods for the parallel investigation of register variation across languages involving both language production and perception (Research goal 4). As a first step, we will build the Lang*Reg corpus based on guided naturalistic (spontaneous) language production in different situations. In a second step, the production data will be complemented with perception data collected through a gradient judgement study and a situative classification task on the association of syntactic variants with specific situative contexts.

Project Leaders in Phase I

Prof. Dr. Aria Adli

Romanisches Seminar
Universität zu Köln

aria.adli@uni-koeln.de

Publications & Presentations

    Publications

    2025

  • Adli, Aria; Lehmann, Nico; Mortezapour, Vahid; Vander Klok, Jozina; Farokhnejad, Zahra; Müller, David; Verhoeven, Elisabeth  (2025) Lang*Reg corpus: Documenting intraspeaker variation across languages and registers In:  Language Documentation & Conservation [ViVo]
    2024

  • Adli, Aria; Verhoeven, Elisabeth; Lehmann, Nico; Mortezapour, Vahid; Vander Klok, Jozina  (2024) Lang*Reg: A multi-lingual corpus of intra-speaker variation across situations [DOI] [ViVo]

    The Lang*Reg corpus records intra-speaker variation across languages and different situational-functional contexts, presumed to result in different registers. It has been prepared in the SFB1412 Register with data collections taking place in 2021-2022 for the following languages included in this version: German, Persian, Kurdish, Javanese. The data sets for each language comprise the speech of the same language users in a variety of spoken conversations and one written interaction. A minimum of 12 participants per language traversed a course of 6 situations in which they were asked to produce language in three types of activities: telling a story to a friend, talking freely with various interlocutors (friend, stranger, taxi driver) and engaging in an interview with a (university) professor. Moreover, our design included the storytelling in two modes, which allows for the comparison between spoken and written modes of the same language user. 

    Lang*Reg has a basic syntactic segmentation (one matrix clause and all its dependent clauses per segment). v0.2.0 includes the data sets with transcriptions, normalizations and tokens for each language as well as additional language-specific annotations such as glosses and syntactic annotations. We prepared each data set also for use with the browser-based search and visualization architecture ANNIS. For further language-specific morpho-syntactic and sociolinguistic annotations, refer to the respective data set description. For an overview of all data set characteristics, please see the corpus documentation in each data set.

  • Lüdeling, Anke; Szucsich, Luka; Zeige, Lars Erik; Adli, Aria; Alexiadou, Artemis; Belz, Malte; Bouzouita, Miriam; Dreyer, Malte; Egg, Markus; Feulner, Anna Helene; Fleischer, Jürg; Gagarina, Natalia; Hirschmann, Hagen; Jannedy, Stefanie; Knoeferle, Pia; Krause, Thomas; Kutscher, Silvia; Liu, Mingya; Lütke, Beate; Machicao y Priemer, Antonio; Meyer, Roland; Mooshammer, Christine; Müller, Stefan; Sauerland, Uli; Sauermann, Antje; Schmitt, Viola; Schumacher, Nicole; Serova, Dina; Solt, Stephanie; Vander Klok, Jozina; Verhoeven, Elisabeth; Waltereit, Richard; Weirich, Melanie  (2024) Register: Language Users’ Knowledge of Situational-Functional Variation. Frame text of the Second Phase Proposal for the CRC 1412 [DOI] [ViVo]
  • Adli, Aria; Lüdeling, Anke; Szucsich, Luka; Zeige, Lars Erik; Alexiadou, Artemis; Belz, Malte; Bouzouita, Miriam; Bunk, Oliver; Dreyer, Malte; Egg, Markus; Feulner, Anna Helene; Fleischer, Jürg; Gagarina, Natalia; Hirsch, Aron; Jannedy, Stefanie; Knoeferle, Pia; Krause, Thomas; Kutscher, Silvia; Liu, Mingya; Lütke, Beate; Machicao y Priemer, Antonio; Maquate, Katja; Merino Hernández, Laura; Meyer, Roland; Mooshammer, Christine; Müller, Stefan; Sauerland, Uli; Sauermann, Antje; Schmitt, Viola; Schumacher, Nicole; Serova, Dina; Solt, Stephanie; Vander Klok, Jozina; Verhoeven, Elisabeth; Waltereit, Richard; Weirich, Melanie; Wiese, Heike  (2024) Register: Language Users’ Knowledge of Situational-Functional Variation. Frame text of the Second Phase Proposal for the CRC 1412 In:  Register Aspects of Language in Situation [DOI] [ViVo]
  • Varaschin, Giuseppe; Machicao y Priemer, Antonio; Lu, Yanru  (2024) Topic drop in German: Grammar and usage In:  Proceedings of the International Conference on Head-Driven Phrase Structure Grammar [DOI] [ViVo]
    German topic drop clauses are a subtype of declarative clauses where the initial position (usually filled by an overt constituent) is left empty. It is often noted that topic drop appears mainly in specific registers (e.g. spoken dialogues), but this claim has neither been previously experimentally validated, nor formally implemented. In this paper, we report the results of a matched-guise study which indicate that the syntactic variation between topic drop and regular V2 declaratives in fact correlates with different social meanings, leading to the register variation postulated in the literature. In order to model German speakers' grammatical and register knowledge about topic drop in HPSG, we propose (i) a unified grammatical constraint that licenses topic drop structures and (ii) a formal theory of register that treats social meanings as a type of use-conditional content subject to compositional rules.
    2023

  • Adli, Aria; Verhoeven, Elisabeth; Lehmann, Nico; Mortezapour, Vahid; Vander Klok, Jozina  (2023) Lang*Reg: A multi-lingual corpus of intra-individual variation across situations [DOI] [ViVo]
    Language: German, Persian, Yucatec Maya, Kurdish, Javanese
    Size: 36 hours
    Description: same speakers varied by mode, acquaintance, professionalism, and expertise
    Features: transcription, syntactic segmentation, normalization, token, glossing or POS-tags, some syntax
    Access: transcription or annotation in progress; CC-BY-NC-ND
  • Pescuma, Valentina Nicole; Serova, Dina; Lukassek, Julia; Sauermann, Antje; Schäfer, Roland; Adli, Aria; Bildhauer, Felix; Egg, Markus; Hülk, Kristina; Ito, Aine; Jannedy, Stefanie; Kordoni, Valia; Kühnast, Milena; Kutscher, Silvia; Lange, Robert; Lehmann, Nico; Liu, Mingya; Lütke, Beate; Maquate, Katja; Mooshammer, Christine; Mortezapour, Vahid; Müller, Stefan; Norde, Muriel; Pankratz, Elizabeth; Patarroyo, Angela Giovanna; Plesca, Ana-Maria; Ronderos, Camilo R.; Rotter, Stephanie; Sauerland, Uli; Schulte, Britta; Schüppenhauer, Gediminas; Sell, Bianca Maria; Solt, Stephanie; Terada, Megumi; Tsiapou, Dimitra; Verhoeven, Elisabeth; Weirich, Melanie; Wiese, Heike; Zaruba, Kathy; Zeige, Lars Erik; Lüdeling, Anke; Knoeferle, Pia; Schnelle, Gohar  (2023) Situating language register across the ages, languages, modalities, and cultural aspects: Evidence from complementary methods In:  Frontiers in Psychology [DOI] [ViVo]
    In the present review paper by members of the collaborative research center ‘Register: Language Users’ Knowledge of Situational-Functional Variation’ (CRC 1412), we assess the pervasiveness of register phenomena across different time periods, languages, modalities, and cultures. We define ‘register’ as recurring variation in language use depending on the function of language and on the social situation. Informed by rich data, we aim to better understand and model the knowledge involved in situation- and function-based use of language register. In order to achieve this goal, we are using complementary methods and measures. In the review, we start by clarifying the concept of ‘register’, by reviewing the state of the art, and by setting out our methods and modeling goals. Against this background, we discuss three key challenges, two at the methodological level and one at the theoretical level: 1. To better uncover registers in text and spoken corpora, we propose changes to established analytical approaches. 2. To tease apart between-subject variability from the linguistic variability at issue (intra-individual situation based register variability), we use within-subject designs and the modeling of individuals’ social, language, and educational background. 3. We highlight a gap in cognitive modeling, viz. modeling the mental representations of register (processing), and present our first attempts at filling this gap. We argue that the targeted use of multiple complementary methods and measures supports investigating the pervasiveness of register phenomena and yields comprehensive insights into the cross-methodological robustness of register-related language variability. These comprehensive insights in turn provide a solid foundation for associated cognitive modeling.
  • Lehmann, Nico; Serova, Dina; Lukassek, Julia; Döring, Sophia; Goymann, Frank; Lüdeling, Anke; Akbari, Roodabeh  (2023) Guidelines for the annotation of parameters of narration. In:  REALIS: Register Aspects of Language in Situation [DOI] [ViVo]
    The present guidelines describe the annotation of narrative phenomena on the clause level, using a combination of ideas and methods from linguistics and literary studies. The main categories marking the discourse strategy “narration” in stretches of text have been narrowed down to mediacy, i.e. involving a narrator, and sequentiality of events. This document specifies how to define mediacy, and in turn determine whether a narrator is present, as well as how to identify events and their sequential ordering. Lastly, a functional layer annotation is proposed which allows researchers to compare different types of narrative instances. This offers a basis for investigating a potential narrative register which is said to be important for many kinds of register studies.
  • Varaschin, Giuseppe; Culicover, Peter W.; Winkler, Susanne  (2023) In pursuit of Condition C: (Non-)coreference in grammar, discourse and processing In:  Information Structure and Discourse in Generative Grammar [ViVo]
    2022

  • Adli, Aria; Lüdeling, Anke; Alexiadou, Artemis; Donhauser, Karin; Dreyer, Malte; Egg, Markus; Feulner, Anna Helene; Gagarina, Natalia; Hock, Wolfgang; Jannedy, Stefanie; Kammerzell, Frank; Knoeferle, Pia; Krause, Thomas; Krifka, Manfred; Kutscher, Silvia; Lütke, Beate; McFadden, Thomas; Meyer, Roland; Mooshammer, Christine; Müller, Stefan; Maquate, Katja; Norde, Muriel; Sauerland, Uli; Solt, Stephanie; Szucsich, Luka; Verhoeven, Elisabeth; Waltereit, Richard; Wolfsgruber, Anne; Zeige, Lars Erik  (2022) Register: Language Users’ Knowledge of Situational-Functional Variation. Frame text of the First Phase Proposal for the CRC 1412 In:  REALIS: Register Aspects of Language in Situation [DOI] [ViVo]
  • Lehmann, Nico; Verhoeven, Elisabeth  (2022) Discourse-Independent Variation in V-Initial Constituent Order: The Yucatec Mayan Preverbal Domain Revisited In:  ProcLingEvi2020, Universität Tübingen [DOI] [ViVo]
    Contribution to Linguistic Evidence 2020
  • Adli, Aria  (2022) Coherence and implicational hierarchies in the speech of the very old In:  The Coherence of Linguistic Communities Orderly Heterogeneity and Social Meaning [ViVo]
    2021

  • Machicao y Priemer, Antonio; Müller, Stefan  (2021) NPs in German: Locality, theta roles, possessives, and genitive arguments In:  Glossa: a journal of general linguistics [DOI] [ViVo]
    Since Abney (1987), the DP-analysis has been the standard analysis for nominal complexes, but in the last decade, the NP analysis has experienced a revival. In this spirit, we provide an NP analysis for German nominal complexes in HPSG. Our analysis deals with the fact that relational nouns assign case and theta role to their arguments. We develop an analysis in line with selectional localism (Sag 2012: 149), accounting for the asymmetry between prenominal and postnominal genitives, as well as for the complementarity between higher arguments and possessives, providing a syntactic and semantic analysis.
    2020

  • Lüdeling, Anke; Alexiadou, Artemis; Adli, Aria; Donhauser, Karin; Dreyer, Malte; Egg, Markus; Feulner, Anna Helene; Gagarina, Natalia; Hock, Wolfgang; Jannedy, Stefanie; Kammerzell, Frank; Knoeferle, Pia; Krause, Thomas; Krifka, Manfred; Kutscher, Silvia; Lütke, Beate; McFadden, Thomas; Meyer, Roland; Mooshammer, Christine; Müller, Stefan; Maquate, Katja; Norde, Muriel; Sauerland, Uli; Szucsich, Luka; Verhoeven, Elisabeth; Waltereit, Richard; Wolfsgruber, Anne; Zeige, Lars Erik  (2020) Register: Language Users’ Knowledge of Situational-Functional Variation In:  REALIS: Register Aspects of Language in Situation [DOI] [ViVo]
    The Collaborative Research Center 1412 “Register: Language Users’ Knowledge of Situational-Functional Variation” (CRC 1412) investigates the role of register in language, focusing in particular on what constitutes a language user’s register knowledge and which situational-functional factors determine a user’s choices. The following paper is an extract from the frame text of the proposal for the CRC 1412, which was submitted to the Deutsche Forschungsgemeinschaft in 2019, followed by a successful onsite evaluation that took place in 2019. The CRC 1412 then started its work on January 1, 2020. The theoretical part of the frame text gives an extensive overview of the theoretical and empirical perspectives on register knowledge from the viewpoint of 2019. Due to the high collaborative effort of all PIs involved, the frame text is unique in its scope on register research, encompassing register-relevant aspects from variationist approaches, psycholinguistics, grammatical theory, acquisition theory, historical linguistics, phonology, phonetics, typology, corpus linguistics, and computational linguistics, as well as qualitative and quantitative modeling. Although our positions and hypotheses since its submission have developed further, the frame text is still a vital resource as a compilation of state-of-the-art register research and a documentation of the start of the CRC 1412. The theoretical part without administrative components therefore presents an ideal starter publication to kick off the CRC’s publication series REALIS. For an overview of the projects and more information on the CRC, see https://sfb1412.hu-berlin.de/.
  • Machicao y Priemer, Antonio; Fritz-Huechante, Paola  (2020) Boundaries at play In:  Interfaces in Romance [DOI] [ViVo]
    In this paper, we model the left-bounded state reading and the true reflexive reading of the se clitic in the Spanish psychological domain. We argue that a lexical analysis of se provides us with a more accurate description of the different classes of psychological verbs that occur with the clitic. We provide a unified analysis where the use of the two readings of se is modeled by means of lexical rules. We take the morphologically simple but semantically more complex basic items (e.g. asustar ‘frighten’) as input of the lexical rules, getting as the output a morphologically more complex but semantically simpler verb (e.g. asustarse ‘get frightened’). The analysis for psych verbs correctly allows only those verbs assigning accusative to the experiencer or the stimulus to combine with se, hence preventing dative verbs from entering the lexical rules. The analysis also demonstrates how to account for punctual and non-punctual readings of psych verbs with se incorporating ‘boundaries’ into the type hierarchy of eventualities.
    2018

  • Verhoeven, Elisabeth; Lehmann, Nico  (2018) Self-embedding and complexity in oral registers In:  Glossa: a journal of general linguistics [DOI] [ViVo]
    This article reports the results of a study on the self-embedding depth of nominal, verbal and clausal projections in spoken corpora of German. We compared two spoken registers featuring public and non-public (i.e. private) conversation by measuring the depth of self-embedding in C, V, and N projections. The findings confirm the hypothesis that the familiarity of the speech situation (public vs. non-public speech) has a significant impact on complexity in terms of self-embedding: speakers use more self-embedding in public speech production in different syntactic projections. In addition, we examined previous assumptions about the differences between right, left, and center embedding in C projections. The results confirm a preference against center embedding in non-public texts, which reflects the complexity of center embedding. Finally, we find evidence that the depth of self-embedding in V and C projections is correlated. This finding suggests that self-embedding depth is part of a general strategy, i.e., speakers select more or less complex structures (of different types) depending on factors of the speech situation.
    Presentations

    2024

  • Adli, Aria  (2024) Code-switching into the dominant language in multilingual societies? Pronominal forms as markers of politeness and register In:  DGfS 2024 (Ruhr-Universität Bochum) [ViVo]
    2023

  • Adli, Aria  (2023) The formal and functional distribution of right-peripheral arguments in German and Persian across registers In:  56th Annual Meeting of the Societas Linguistica Europaea [ViVo]
  • Haig, Geoffrey  (2023) Which domains of morphosyntax are sensitive to register variation? Thoughts from Iranian languages. In:  Humboldt-Universität zu Berlin: Kolloquium Syntax und Semantik (2023) [ViVo]
  • Sailer, Manfred  (2023) Explicit or redundant: The social meaning of multiple exponence In:  Humboldt-Universität zu Berlin: Kolloquium Syntax und Semantik (2023) [ViVo]
  • Sailer, Manfred  (2023) Explicit or redundant: The social meaning of multiple exponence In:  Kolloquium SFB1412 (2023) [ViVo]
    2022

  • Adli, Aria  (2022) Local person referents and the role of politeness: Comparing variable subject pronouns in Spanish and Persian In:  11th International Conference on Language Variation in Europe - ICLaVE|11 [ViVo]
  • Engel, Eric; Adli, Aria  (2022) Complexity and fluency at the end of the life span In:  Kolloquium SFB1412 (2022) [ViVo]
  • Farokhnejad, Zahra  (2022) A general outlook of Kurdish register data: focusing on code-switching and post-predicate constituents In:  Humboldt-Universität zu Berlin: Kolloquium Syntax und Semantik (2022) [ViVo]
  • Lehmann, Nico  (2022) Register and the function puzzle: Why register competence is not the whole story In:  Kolloquium SFB1412 (2022) [ViVo]
  • Varaschin, Giuseppe; Machicao y Priemer, Antonio  (2022) Agreement mismatches and register-driven variation in Brazilian Portuguese In:  Oberseminar Syntax and Semantics, Institut für England- und Amerikastudien, Goethe-Universität Frankfurt am Main [ViVo]
    2020

  • Adli, Aria  () Languages as registers: Language choice and language mixing as register markers In:  12th International Conference on Language Variation in Europe – ICLaVE|12 [ViVo]
  • Adli, Aria  () Linguistic and situational context in multiple and flexible plural marking: some data and directions for further research In:  Workshop Flexible and Multiple Plural Marking in Language Contact and Creolization: Social and Situational Correlates [ViVo]
  • Fritz-Huechante, Paola; Varaschin, Giuseppe; Verhoeven, Elisabeth; Machicao y Priemer, Antonio  () Spanish Experiencer Object Verbs: Different Case Marking of the Experiencer and Different Levels of Politeness In:  XXIV. Deutscher Hispanistiktag 2025: Dinámicas de transferencia e hibridación [ViVo]