A06
Modeling register variation across languages

Starting from a language sample comprising German, Yucatec Maya, Persian, Southern Kurdish, and Javanese, A06 investigates the role of typological properties in register variation. A06 also studies multilingual contact situations in which contact-induced grammatical variants and code-switching differ by register. To this end, A06 analyzes object drop, forms of address and politeness, and plural and TAM marking. The basis is the newly created Lang*Reg corpus, supplemented by perception studies and semantic fieldwork, with the aim of developing a register model that is adequate for different languages and cultural contexts.

Team

Principal investigators

Prof. Dr. Aria Adli

Romanisches Seminar
Universität zu Köln

aria.adli@uni-koeln.de

Dr. Jozina Vander Klok

Institut für deutsche Sprache und Linguistik

jozina.vander.klok@hu-berlin.de

Prof. Dr. Elisabeth Verhoeven

Institut für deutsche Sprache und Linguistik
Humboldt-Universität zu Berlin

Staff

Nico Lehmann

Institut für deutsche Sprache und Linguistik
Humboldt-Universität zu Berlin

nico.lehmann@hu-berlin.de

Maryam Bahmani

Romanisches Seminar

mbahmani@uni-koeln.de

Student assistants

Lara Lempp

Institut für deutsche Sprache und Linguistik

lara.lempp@hu-berlin.de

Former staff

Vahid Mortezapour

Romanisches Seminar
Universität zu Köln

Zahra Farokhnejad

Sprach- und literaturwissenschaftliche Fakultät
Humboldt-Universität zu Berlin

David Müller

Institut für deutsche Sprache und Linguistik
Humboldt-Universität zu Berlin

Contact

Nico Lehmann

Humboldt-Universität zu Berlin

(030) 2093-9718

nico.lehmann@hu-berlin.de

Project in funding phase I

Title in phase I

Disentangling cross-linguistic and language-specific aspects of register variation

Description for phase I

Research question

The project aims to better understand how register knowledge relates to grammatical aspects of linguistic knowledge. It takes a comparative perspective and addresses both universal and language-specific aspects of register variation. We will test our hypotheses by directly comparing three typologically different languages (Persian, German, Yucatec Maya) with very different register distinctions, applying the same methods to each in parallel.

Languages differ considerably in their register diversity. Some languages (Persian) show more striking differences between registers than others (German), although both are used in speech and in writing. Still other languages (Yucatec Maya) are used mainly for oral communication, with written use only just beginning.

Research objectives

First, we will investigate cross-linguistic vs. language-specific properties of registers (research objective 1), asking which aspects of syntactic variation are associated with registers across languages and which are language-specific.

Second, we consider how differences in register diversity and normative aspects influence cross-linguistic similarities and differences in register variation (research objective 2).

Third, we will focus on syntactic phenomena that are related to the encoding of information structure and that are in all likelihood register-sensitive, in order to disentangle the language-specific and cross-linguistic components of register variation (research objective 3).

The phenomena considered for these three research objectives comprise (a) word order operations that reduce syntactic compactness, such as right and left dislocation, and (b) the choice between pronominal and null realization of referential expressions. More specifically, we will examine (i) whether structural devices that reduce syntactic compactness are associated with register across languages. We will compare informal spontaneous speech with formal speech and written language, expecting that the existing word order variation becomes more clearly restricted to positions within clause boundaries. We will also examine (ii) the extent to which register-related variability in the use of referential expressions differs within and across languages.

Since cross-linguistic studies of register variation are still rare, we will develop methods for the parallel investigation of register variation across languages, covering both language production and perception (research objective 4). First, we will build a Lang*Reg corpus based on guided naturalistic (spontaneous) language production in different situations. In a second step, the production data will be complemented by perception data collected through gradient judgment studies and a situational classification task that associates syntactic variants with specific situational contexts.

Publications and presentations

Publications

2025

Lehmann, Nico; Mortezapour, Vahid; Sameri, Motahareh; Verhoeven, Elisabeth; Adli, Aria (2025) Right-peripheral subjects in German and Persian across registers In: Folia Linguistica [DOI] [ViVo]
Lehmann, Nico; Mortezapour, Vahid; Vander Klok, Jozina; Farokhnejad, Zahra; Müller, David; Verhoeven, Elisabeth; Adli, Aria (2025) Lang*Reg corpus: Documenting intra-speaker variation across languages and registers In: Language Documentation & Conservation [ViVo]

2024

Adli, Aria; Verhoeven, Elisabeth; Lehmann, Nico; Mortezapour, Vahid; Vander Klok, Jozina (2024) Lang*Reg: A multi-lingual corpus of intra-speaker variation across situations [DOI] [ViVo]
The Lang*Reg corpus records intra-speaker variation across languages and different situational-functional contexts, presumed to result in different registers. It has been prepared in the SFB1412 Register with data collections taking place in 2021-2022 for the following languages included in this version: German, Persian, Kurdish, Javanese. The data sets for each language comprise the speech of the same language users in a variety of spoken conversations and one written interaction. A minimum of 12 participants per language traversed a course of 6 situations in which they were asked to produce language in three types of activities: telling a story to a friend, talking freely with various interlocutors (friend, stranger, taxi driver) and engaging in an interview with a (university) professor. Moreover, our design included the storytelling in two modes, which allows for the comparison between spoken and written modes of the same language user.

Lang*Reg has a basic syntactic segmentation (one matrix clause and all its dependent clauses per segment). v0.2.0 includes the data sets with transcriptions, normalizations and tokens for each language as well as additional language-specific annotations such as glosses and syntactic annotations. We prepared each data set also for use with the browser-based search and visualization architecture ANNIS. For further language-specific morpho-syntactic and sociolinguistic annotations, refer to the respective data set description. For an overview of all data set characteristics, please see the corpus documentation in each data set.
Lüdeling, Anke; Szucsich, Luka; Zeige, Lars Erik; Alexiadou, Artemis; Adli, Aria; Belz, Malte; Bouzouita, Miriam; Bunk, Oliver; Dreyer, Malte; Egg, Markus; Feulner, Anna Helene; Fleischer, Jürg; Gagarina, Natalia; Hirsch, Aron; Jannedy, Stefanie; Knoeferle, Pia; Krause, Thomas; Kutscher, Silvia; Liu, Mingya; Lütke, Beate; Machicao y Priemer, Antonio; Maquate, Katja; Merino Hernández, Laura; Meyer, Roland; Mooshammer, Christine; Müller, Stefan; Sauerland, Uli; Sauermann, Antje; Schmitt, Viola; Serova, Dina; Solt, Stephanie; Waltereit, Richard; Weirich, Melanie; Wiese, Heike; Verhoeven, Elisabeth; Schumacher, Nicole (2024) Register: Language Users’ Knowledge of Situational-Functional Variation. Frame text of the Second Phase Proposal for the CRC 1412 In: Register Aspects of Language in Situation [DOI] [ViVo]
Lehmann, Nico (2024) The intricacies of register variation across languages [ViVo]

2023

Adli, Aria; Verhoeven, Elisabeth; Lehmann, Nico; Mortezapour, Vahid; Vander Klok, Jozina (2023) Lang*Reg: A multi-lingual corpus of intra-individual variation across situations [DOI] [ViVo]
Language: German, Persian, Yucatec Maya, Kurdish, Javanese
Size: 36 hours
Description: same speakers varied by mode, acquaintance, professionalism, and expertise
Features: transcription, syntactic segmentation, normalization, token, glossing or POS-tags, some syntax
Access: transcription or annotation in progress; CC-BY-NC-ND
Pescuma, Valentina Nicole; Serova, Dina; Lukassek, Julia; Sauermann, Antje; Schäfer, Roland; Adli, Aria; Bildhauer, Felix; Egg, Markus; Hülk, Kristina; Ito, Aine; Jannedy, Stefanie; Kordoni, Valia; Kühnast, Milena; Kutscher, Silvia; Lange, Robert; Lehmann, Nico; Liu, Mingya; Lütke, Beate; Maquate, Katja; Mooshammer, Christine; Mortezapour, Vahid; Müller, Stefan; Norde, Muriel; Pankratz, Elizabeth; Patarroyo, Angela Giovanna; Plesca, Ana-Maria; Ronderos, Camilo R.; Rotter, Stephanie; Sauerland, Uli; Schulte, Britta; Schüppenhauer, Gediminas; Sell, Bianca Maria; Solt, Stephanie; Terada, Megumi; Tsiapou, Dimitra; Verhoeven, Elisabeth; Weirich, Melanie; Wiese, Heike; Zaruba, Kathy; Zeige, Lars Erik; Lüdeling, Anke; Knoeferle, Pia; Schnelle, Gohar (2023) Situating language register across the ages, languages, modalities, and cultural aspects: Evidence from complementary methods In: Frontiers in Psychology [DOI] [ViVo]
In the present review paper by members of the collaborative research center ‘Register: Language Users’ Knowledge of Situational-Functional Variation’ (CRC 1412), we assess the pervasiveness of register phenomena across different time periods, languages, modalities, and cultures. We define ‘register’ as recurring variation in language use depending on the function of language and on the social situation. Informed by rich data, we aim to better understand and model the knowledge involved in situation- and function-based use of language register. In order to achieve this goal, we are using complementary methods and measures. In the review, we start by clarifying the concept of ‘register’, by reviewing the state of the art, and by setting out our methods and modeling goals. Against this background, we discuss three key challenges, two at the methodological level and one at the theoretical level: 1. To better uncover registers in text and spoken corpora, we propose changes to established analytical approaches. 2. To tease apart between-subject variability from the linguistic variability at issue (intra-individual situation-based register variability), we use within-subject designs and the modeling of individuals’ social, language, and educational background. 3. We highlight a gap in cognitive modeling, viz. modeling the mental representations of register (processing), and present our first attempts at filling this gap. We argue that the targeted use of multiple complementary methods and measures supports investigating the pervasiveness of register phenomena and yields comprehensive insights into the cross-methodological robustness of register-related language variability. These comprehensive insights in turn provide a solid foundation for associated cognitive modeling.
Lehmann, Nico; Serova, Dina; Lukassek, Julia; Döring, Sophia; Goymann, Frank; Lüdeling, Anke; Akbari, Roodabeh (2023) Guidelines for the annotation of parameters of narration. In: REALIS: Register Aspects of Language in Situation [DOI] [ViVo]
The present guidelines describe the annotation of narrative phenomena on the clause level, using a combination of ideas and methods from linguistics and literary studies. The main categories marking the discourse strategy “narration” in stretches of text have been narrowed down to mediacy, i.e. involving a narrator, and sequentiality of events. This document specifies how to define mediacy, and in turn determine whether a narrator is present, as well as how to identify events and their sequential ordering. Lastly, a functional layer annotation is proposed which allows researchers to compare different types of narrative instances. This offers a basis for investigating a potential narrative register which is said to be important for many kinds of register studies.

2022

Adli, Aria; Lüdeling, Anke; Alexiadou, Artemis; Donhauser, Karin; Dreyer, Malte; Egg, Markus; Feulner, Anna Helene; Gagarina, Natalia; Hock, Wolfgang; Jannedy, Stefanie; Kammerzell, Frank; Knoeferle, Pia; Krause, Thomas; Krifka, Manfred; Kutscher, Silvia; Lütke, Beate; McFadden, Thomas; Meyer, Roland; Mooshammer, Christine; Müller, Stefan; Maquate, Katja; Norde, Muriel; Sauerland, Uli; Solt, Stephanie; Waltereit, Richard; Wolfsgruber, Anne; Zeige, Lars Erik; Verhoeven, Elisabeth; Szucsich, Luka (2022) Register: Language Users’ Knowledge of Situational-Functional Variation. Frame text of the First Phase Proposal for the CRC 1412 In: REALIS: Register Aspects of Language in Situation [DOI] [ViVo]
Lehmann, Nico; Verhoeven, Elisabeth (2022) Discourse-Independent Variation in V-Initial Constituent Order: The Yucatec Mayan Preverbal Domain Revisited In: ProcLingEvi2020, Universität Tübingen [DOI] [ViVo]
Contribution to Linguistic Evidence 2020
Lüdeling, Anke; Alexiadou, Artemis; Adli, Aria; Donhauser, Karin; Dreyer, Malte; Egg, Markus; Feulner, Anna Helene; Gagarina, Natalia; Hock, Wolfgang; Jannedy, Stefanie; Kammerzell, Frank; Knoeferle, Pia; Krause, Thomas; Krifka, Manfred; Kutscher, Silvia; Lütke, Beate; McFadden, Thomas; Meyer, Roland; Mooshammer, Christine; Müller, Stefan; Maquate, Katja; Norde, Muriel; Sauerland, Uli; Szucsich, Luka; Verhoeven, Elisabeth; Waltereit, Richard; Wolfsgruber, Anne; Zeige, Lars Erik (2022) Register: Language Users’ Knowledge of Situational-Functional Variation In: REALIS: Register Aspects of Language in Situation [DOI] [ViVo]
The Collaborative Research Center 1412 “Register: Language Users’ Knowledge of Situational-Functional Variation” (CRC 1412) investigates the role of register in language, focusing in particular on what constitutes a language user’s register knowledge and which situational-functional factors determine a user’s choices. The following paper is an extract from the frame text of the proposal for the CRC 1412, which was submitted to the Deutsche Forschungsgemeinschaft in 2019, followed by a successful onsite evaluation that took place in 2019. The CRC 1412 then started its work on January 1, 2020. The theoretical part of the frame text gives an extensive overview of the theoretical and empirical perspectives on register knowledge from the viewpoint of 2019. Due to the high collaborative effort of all PIs involved, the frame text is unique in its scope on register research, encompassing register-relevant aspects from variationist approaches, psycholinguistics, grammatical theory, acquisition theory, historical linguistics, phonology, phonetics, typology, corpus linguistics, and computational linguistics, as well as qualitative and quantitative modeling. Although our positions and hypotheses since its submission have developed further, the frame text is still a vital resource as a compilation of state-of-the-art register research and a documentation of the start of the CRC 1412. The theoretical part without administrative components therefore presents an ideal starter publication to kick off the CRC’s publication series REALIS. For an overview of the projects and more information on the CRC, see https://sfb1412.hu-berlin.de/.
Adli, Aria (2022) Coherence and implicational hierarchies in the speech of the very old In: The Coherence of Linguistic Communities: Orderly Heterogeneity and Social Meaning [ViVo]

2018

Verhoeven, Elisabeth; Lehmann, Nico (2018) Self-embedding and complexity in oral registers In: Glossa: a journal of general linguistics [DOI] [ViVo]
This article reports the results of a study on the self-embedding depth of nominal, verbal and clausal projections in spoken corpora of German. We compared two spoken registers featuring public and non-public (i.e. private) conversation by measuring the depth of self-embedding in C, V, and N projections. The findings confirm the hypothesis that the familiarity of the speech situation (public vs. non-public speech) has a significant impact on complexity in terms of self-embedding: speakers use more self-embedding in public speech production in different syntactic projections. In addition, we examined previous assumptions about the differences between right, left, and center embedding in C projections. The results confirm a preference against center embedding in non-public texts, which reflects the complexity of center embedding. Finally, we find evidence that the depth of self-embedding in V and C projections is correlated. This finding suggests that self-embedding depth is part of a general strategy, i.e., speakers select more or less complex structures (of different types) depending on factors of the speech situation.

Presentations

2025

Machicao y Priemer, Antonio; Varaschin, Giuseppe; Verhoeven, Elisabeth; Fritz-Huechante, Paola (2025) Spanish Experiencer Object Verbs: Different Case Marking of the Experiencer and Different Levels of Politeness In: XXIV. Deutscher Hispanistiktag 2025: Dinámicas de transferencia e hibridación [ViVo]
Lehmann, Nico (2025) Mode variation in clausal complexity in German and Persian In: Workshop on the role of language modality in variation and change. [ViVo]

2024

Adli, Aria (2024) Code-switching into the dominant language in multilingual societies? Pronominal forms as markers of politeness and register In: DGfS 2024 (Ruhr-Universität Bochum) [ViVo]
Lehmann, Nico (2024) Clausal complexity across register in German and Persian In: Workshop on Complexity in Language [ViVo]
Lehmann, Nico (2024) Classifying communicative situations and assessing formality In: Kolloquium SFB1412 (2024) [ViVo]

2023

Lehmann, Nico; Mortezapour, Vahid; Sameri, Motahareh; Verhoeven, Elisabeth; Adli, Aria (2023) The formal and functional distribution of right-peripheral arguments in German and Persian across registers In: 56th Annual Meeting of the Societas Linguistica Europaea [ViVo]
Haig, Geoffrey (2023) Which domains of morphosyntax are sensitive to register variation? Thoughts from Iranian languages. In: Humboldt-Universität zu Berlin: Kolloquium Syntax und Semantik (2023) [ViVo]

2022

Adli, Aria (2022) Local person referents and the role of politeness: Comparing variable subject pronouns in Spanish and Persian In: 11th International Conference on Language Variation in Europe - ICLaVE|11 [ViVo]
Liu, Mingya; Adli, Aria (2022) External Factors In: CRC 1412 – Spring Retreat 2022 [ViVo]
Liu, Mingya; Adli, Aria (2022) External factors under investigation In: CRC 1412 - Fall Retreat 2022 [ViVo]
Adli, Aria; Engel, Eric (2022) Complexity and fluency at the end of the life span In: Kolloquium SFB1412 (2022) [ViVo]
Farokhnejad, Zahra (2022) A general outlook of Kurdish register data: focusing on Code-switching and post-predicate constituents In: Humboldt-Universität zu Berlin: Kolloquium Syntax und Semantik (2022) [ViVo]
Abishek, Stephen; Farokhnejad, Zahra (2022) CRC fellows present their research projects In: Kolloquium SFB1412 (2022) [ViVo]
Lehmann, Nico (2022) Register and the function puzzle: Why register competence is not the whole story In: Kolloquium SFB1412 (2022) [ViVo]
Serova, Dina; Döring, Sophia; Lehmann, Nico (2022) Narration In: CRC 1412 - Fall Retreat 2022 [ViVo]
Lehmann, Nico; Marklová, Anna; Schnelle, Gohar (2022) Reading circle terminology In: CRC 1412 – Spring Retreat 2022 [ViVo]
Lehmann, Nico; Serova, Dina (2022) Register Aspects of Language in Situation (REALIS) In: CRC 1412 - Fall Retreat 2022 [ViVo]
Lehmann, Nico; Döring, Sophia; Lukassek, Julia; Serova, Dina (2022) On Narration In: CRC 1412 – Spring Retreat 2022 [ViVo]
Adli, Aria; Liu, Mingya; Szucsich, Luka (2022) Area A on modeling In: CRC 1412 – Spring Retreat 2022 [ViVo]
Lütke, Beate; Alexiadou, Artemis; Sauermann, Antje; Meyer, Roland; Verhoeven, Elisabeth (2022) Multilingualism In: CRC 1412 - Fall Retreat 2022 [ViVo]

2021

Adli, Aria; Lehmann, Nico; Verhoeven, Elisabeth; Müller, David (2021) Cross-linguistic aspects of register variation: Right-peripheral constituents in German In: Humboldt-Universität zu Berlin: Kolloquium Syntax und Semantik (2021) [ViVo]

2020

Lehmann, Nico; Adli, Aria; Verhoeven, Elisabeth; Mortezapour, Vahid (2020) Cross-linguistic aspects of register variation: Creating a Lang*Reg Corpus In: Kolloquium SFB1412 (2020) [ViVo]

Undated

Adli, Aria () Languages as registers: Language choice and language mixing as register markers In: 12th International Conference on Language Variation in Europe – ICLaVE|12 [ViVo]
Adli, Aria () Politeness expressions in Southern Kurdish and Javanese: Comparing the role of gender and situation In: New Ways of Analyzing Variation (NWAV) 53, University of Michigan [ViVo]
Lehmann, Nico; Verhoeven, Elisabeth; Adli, Aria () Linguistic and situational context in multiple and flexible plural marking: some data and directions for further research In: Workshop Flexible and Multiple Plural Marking in Language Contact and Creolization: Social and Situational Correlates [ViVo]
Verhoeven, Elisabeth; Bahmani, Maryam; Lehmann, Nico; Farokhnejad, Zahra; Adli, Aria () Lang*Reg: Collecting and processing comparable multi-lingual data for a variety of communicative interactions In: TALKING DATA. Methodological and theoretical challenges raised by spoken interaction data [ViVo]
Verhoeven, Elisabeth; Vander Klok, Jozina; Adli, Aria; Lehmann, Nico; Bahmani, Maryam () The interaction of register and null object instantiations across languages In: 58th Annual Meeting of the Societas Linguistica Europaea (SLE 2025) [ViVo]
Lehmann, Nico () The impersonal use of second person markers in Yucatec Maya In: Grammar in the Field: Languages of the Americas and Beyond [ViVo]
Lehmann, Nico () Communicative situations across cultures: classifying interactions beyond formality and mode. In: Inter-CRC networking meeting [ViVo]
Lehmann, Nico; Vander Klok, Jozina () How People-referring Expressions in Javanese Differ (or not) Across Registers In: International Symposium on the Languages of Java [ViVo]
Adli, Aria; Alexiadou, Artemis; Bunk, Oliver; Farokhnejad, Zahra; Krifka, Manfred; Lehmann, Nico; Sauermann, Antje; Vander Klok, Jozina; Veenstra, Tonjes; Verhoeven, Elisabeth; Wiese, Heike () Languages as registers: Language choice and language mixing as register markers In: International Conference on Language Variation in Europe (ICLaVE), Wien [ViVo]