A04 aims at uncovering and modeling aspects of register variation in German morphosyntax. In the first part, the project will use existing formal and empirical methods such as Biber’s Multidimensional Analysis and develop them further by (i) taking a probabilistic approach to registers, (ii) using a more sophisticated method for automatically inferring registers from distributions of features in texts, namely Latent Dirichlet Allocation, and (iii) validating the cognitive reality of the automatically identified registers using psycholinguistic methods. In the second part, it will extend the PI’s CoreGram implementation in HPSG to model speakers’ probabilistic knowledge of register-related effects in morphosyntax. This is innovative inasmuch as no similar formal implementation of register effects has been attempted before.
Publications & Presentations
Müller, Stefan (2023) Germanic Syntax: A Constraint-Based View [DOI] [ViVo]
This book is an introduction to the syntactic structures that can be found in the Germanic languages. The analyses are couched in the framework of HPSG light, which is a simplified version of HPSG that uses trees to depict analyses rather than complicated attribute value matrices. The book is written for students with basic knowledge about case, constituent tests, and simple phrase structure grammars (advanced BA or MA level) and for researchers with an interest in the Germanic languages and/or an interest in Head-Driven Phrase Structure Grammar/Sign-Based Construction Grammar without having the time to deal with all the details of these theories.
Müller, Stefan (2023) Grammatical theory: From transformational grammar to constraint-based approaches. Fifth revised edition [DOI] [ViVo]
This book introduces formal grammar theories that play a role in current linguistic theorizing (Phrase Structure Grammar, Transformational Grammar/Government & Binding, Generalized Phrase Structure Grammar, Lexical Functional Grammar, Categorial Grammar, Head-Driven Phrase Structure Grammar, Construction Grammar, Tree Adjoining Grammar). The key assumptions are explained and it is shown how the respective theory treats arguments and adjuncts, the active/passive alternation, local reorderings, verb placement, and fronting of constituents over long distances. The analyses are explained with German as the object language. The second part of the book compares these approaches with respect to their predictions regarding language acquisition and psycholinguistic plausibility. The nativism hypothesis, which assumes that humans possess genetically determined innate language-specific knowledge, is critically examined and alternative models of language acquisition are discussed. The second part then addresses controversial issues of current theory building such as the question of flat or binary branching structures being more appropriate, the question whether constructions should be treated on the phrasal or the lexical level, and the question whether abstract, non-visible entities should play a role in syntactic analyses. It is shown that the analyses suggested in the respective frameworks are often translatable into each other. The book closes with a chapter showing how properties common to all languages or to certain classes of languages can be captured. The book is a translation of the German book Grammatiktheorie, which was published by Stauffenburg in 2010. The following quotes are taken from reviews:
“With this critical yet fair reflection on various grammatical theories, Müller fills what was a major gap in the literature.” Karen Lehmann, Zeitschrift für Rezensionen zur germanistischen Sprachwissenschaft, 2012
“Stefan Müller’s recent introductory textbook, Grammatiktheorie, is an astonishingly comprehensive and insightful survey for beginning students of the present state of syntactic theory.” Wolfgang Sternefeld and Frank Richter, Zeitschrift für Sprachwissenschaft, 2012
“This is the kind of work that has been sought after for a while [...] The impartial and objective discussion offered by the author is particularly refreshing.” Werner Abraham, Germanistik, 2012
Pescuma, Valentina Nicole; Serova, Dina; Lukassek, Julia; Sauermann, Antje; Schäfer, Roland; Adli, Aria; Bildhauer, Felix; Egg, Markus; Hülk, Kristina; Ito, Aine; Jannedy, Stefanie; Kordoni, Valia; Kühnast, Milena; Kutscher, Silvia; Lange, Robert; Lehmann, Nico; Liu, Mingya; Lütke, Beate; Maquate, Katja; Mooshammer, Christine; Mortezapour, Vahid; Müller, Stefan; Norde, Muriel; Pankratz, Elizabeth; Patarroyo, Angela Giovanna; Plesca, Ana-Maria; Ronderos, Camilo R.; Rotter, Stephanie; Sauerland, Uli; Schulte, Britta; Schüppenhauer, Gediminas; Sell, Bianca Maria; Solt, Stephanie; Terada, Megumi; Tsiapou, Dimitra; Verhoeven, Elisabeth; Weirich, Melanie; Wiese, Heike; Zaruba, Kathy; Zeige, Lars Erik; Lüdeling, Anke; Knoeferle, Pia; Schnelle, Gohar (2023) Situating language register across the ages, languages, modalities, and cultural aspects: Evidence from complementary methods In: Frontiers in Psychology [DOI] [ViVo]
In the present review paper by members of the collaborative research center ‘Register: Language Users’ Knowledge of Situational-Functional Variation’ (CRC 1412), we assess the pervasiveness of register phenomena across different time periods, languages, modalities, and cultures. We define ‘register’ as recurring variation in language use depending on the function of language and on the social situation. Informed by rich data, we aim to better understand and model the knowledge involved in situation- and function-based use of language register. In order to achieve this goal, we are using complementary methods and measures. In the review, we start by clarifying the concept of ‘register’, by reviewing the state of the art, and by setting out our methods and modeling goals. Against this background, we discuss three key challenges, two at the methodological level and one at the theoretical level:
1. To better uncover registers in text and spoken corpora, we propose changes to established analytical approaches.
2. To tease apart between-subject variability from the linguistic variability at issue (intra-individual situation-based register variability), we use within-subject designs and the modeling of individuals’ social, language, and educational background.
3. We highlight a gap in cognitive modeling, viz. modeling the mental representations of register (processing), and present our first attempts at filling this gap.
We argue that the targeted use of multiple complementary methods and measures supports investigating the pervasiveness of register phenomena and yields comprehensive insights into the cross-methodological robustness of register-related language variability. These comprehensive insights in turn provide a solid foundation for associated cognitive modeling.
Weber, Thilo; Bildhauer, Felix; Münzberg, Franziska (2023) Finite vs. infinite Attributsätze: zu/dass-Alternation bei Substantiven In: Fugenelemente, Präfix-und Partikelverben, Attributsätze [ViVo]
Varaschin, Giuseppe; Culicover, Peter W.; Winkler, Susanne (2023) In pursuit of Condition C: (Non-)coreference in grammar, discourse and processing In: Information Structure and Discourse in Generative Grammar [ViVo]
Varaschin, Giuseppe (2023) LFG and Simpler Syntax In: Handbook of Lexical Functional Grammar [ViVo]
Machicao y Priemer, Antonio; Müller, Stefan; Schäfer, Roland; Bildhauer, Felix (2022) Towards a treatment of register phenomena in HPSG In: Proceedings of the 29th International Conference on Head-Driven Phrase Structure Grammar, Nagoya University & Institute for Japanese Language and Linguistics [ViVo]
Müller, Stefan; Machicao y Priemer, Antonio (2022) Modelling Register Variation in HPSG In: CRC 1412 – Spring Retreat 2022 [ViVo]
Machicao y Priemer, Antonio; Müller, Stefan (2021) NPs in German: Locality, theta roles, possessives, and genitive arguments In: Glossa: a journal of general linguistics [DOI] [ViVo]
Since Abney (1987), the DP-analysis has been the standard analysis for nominal complexes, but in the last decade, the NP analysis has experienced a revival. In this spirit, we provide an NP analysis for German nominal complexes in HPSG. Our analysis deals with the fact that relational nouns assign case and theta role to their arguments. We develop an analysis in line with selectional localism (Sag 2012: 149), accounting for the asymmetry between prenominal and postnominal genitives, as well as for the complementarity between higher arguments and possessives, providing a syntactic and semantic analysis.
Alexiadou, Artemis; Lüdeling, Anke; Adli, Aria; Donhauser, Karin; Dreyer, Malte; Egg, Markus; Feulner, Anna Helene; Gagarina, Natalia; Hock, Wolfgang; Jannedy, Stefanie; Kammerzell, Frank; Knoeferle, Pia; Krause, Thomas; Krifka, Manfred; Kutscher, Silvia; Lütke, Beate; McFadden, Thomas; Meyer, Roland; Mooshammer, Christine; Müller, Stefan; Maquate, Katja; Norde, Muriel; Sauerland, Uli; Szucsich, Luka; Verhoeven, Elisabeth; Waltereit, Richard; Wolfsgruber, Anne; Zeige, Lars Erik (2020) Register: Language Users’ Knowledge of Situational-Functional Variation In: REALIS: Register Aspects of Language in Situation [DOI] [ViVo]
The Collaborative Research Center 1412 “Register: Language Users’ Knowledge of Situational-Functional Variation” (CRC 1412) investigates the role of register in language, focusing in particular on what constitutes a language user’s register knowledge and which situational-functional factors determine a user’s choices. The following paper is an extract from the frame text of the proposal for the CRC 1412, which was submitted to the Deutsche Forschungsgemeinschaft in 2019, followed by a successful onsite evaluation that took place in 2019. The CRC 1412 then started its work on January 1, 2020. The theoretical part of the frame text gives an extensive overview of the theoretical and empirical perspectives on register knowledge from the viewpoint of 2019. Due to the high collaborative effort of all PIs involved, the frame text is unique in its scope on register research, encompassing register-relevant aspects from variationist approaches, psycholinguistics, grammatical theory, acquisition theory, historical linguistics, phonology, phonetics, typology, corpus linguistics, and computational linguistics, as well as qualitative and quantitative modeling. Although our positions and hypotheses have developed further since its submission, the frame text is still a vital resource as a compilation of state-of-the-art register research and a documentation of the start of the CRC 1412. The theoretical part without administrative components therefore presents an ideal starter publication to kick off the CRC’s publication series REALIS. For an overview of the projects and more information on the CRC, see https://sfb1412.hu-berlin.de/.
Machicao y Priemer, Antonio; Fritz-Huechante, Paola (2020) Boundaries at play In: Interfaces in Romance [DOI] [ViVo]
In this paper, we model the left-bounded state reading and the true reflexive reading of the se clitic in the Spanish psychological domain. We argue that a lexical analysis of se provides us with a more accurate description of the different classes of psychological verbs that occur with the clitic. We provide a unified analysis where the two readings of se are modeled by means of lexical rules. We take the morphologically simple but semantically more complex basic items (e.g. asustar ‘frighten’) as input of the lexical rules, getting as the output a morphologically more complex but semantically simpler verb (e.g. asustarse ‘get frightened’). The analysis for psych verbs correctly allows only those verbs assigning accusative to the experiencer or the stimulus to combine with se, hence preventing dative verbs from entering the lexical rules. The analysis also demonstrates how to account for punctual and non-punctual readings of psych verbs with se, incorporating ‘boundaries’ into the type hierarchy of eventualities.
Sailer, Manfred (2023) Explicit or redundant: The social meaning of multiple exponence In: Humboldt-Universität zu Berlin: Kolloquium Syntax und Semantik (2023) [ViVo]
Sailer, Manfred (2023) Explicit or redundant: The social meaning of multiple exponence In: Kolloquium SFB1412 (2023) [ViVo]
Machicao y Priemer, Antonio; Varaschin, Giuseppe (2023) Agreement in Brazilian Portuguese: A case of Register Variation In: Humboldt-Universität zu Berlin: Kolloquium Syntax und Semantik (2023) [ViVo]
Varaschin, Giuseppe; Machicao y Priemer, Antonio (2022) Agreement mismatches and register-driven variation in Brazilian Portuguese In: Oberseminar Syntax and Semantics, Institut für England- und Amerikastudien, Goethe-Universität Frankfurt am Main [ViVo]
Machicao y Priemer, Antonio; Schäfer, Roland; Bildhauer, Felix; Müller, Stefan (2022) Towards a treatment of register phenomena in HPSG In: The 29th International Conference on Head-Driven Phrase Structure Grammar, Nagoya University & the National Institute for Japanese Language and Linguistics [ViVo]
Machicao y Priemer, Antonio (2020) L4L – LaTeX for Linguists Workshop In: MGK Integrated Graduate School – CRC 1412, Humboldt-Universität zu Berlin [ViVo]
Schäfer, Roland (2020) Grammatische Variation zwischen Individuen und Situationen: Perspektiven für Linguistik und Bildungsspracherwerb In: Humboldt-Universität zu Berlin: Kolloquium Syntax und Semantik (2020) [ViVo]
Schäfer, Roland; Bildhauer, Felix (2020) Beyond Multidimensional Analysis: Probabilistic Register Induction for Large Corpora In: Humboldt-Universität zu Berlin: Kolloquium Syntax und Semantik (2020) [ViVo]
The analysis of the register in which a corpus document is written is prominently associated with Biber’s (1988; 1995) Multidimensional Analysis (MDA). We present an approach that is superficially similar to MDA but solves three major conceptual problems of MDA by using Bayesian inference to uncover registers or, rather, potential registers. First, in Biber’s MDA, registers are associated discretely with documents, and each document can only instantiate one specific register, whereas we allow registers to be associated probabilistically with documents, and we allow mixtures of registers in single documents. Given that many linguistic phenomena are now understood as being probabilistic in nature (cf. Schäfer 2018), we suggest that this is a much more realistic assumption. Second, we assume the surface features to be associated with registers in a probabilistic manner for similar reasons. Third, we do not use a catalogue of registers assumed to exist a priori, but instead we merely infer potential registers (pregisters) via clusters of surface features. The question of which pregisters actually correspond to registers with an identifiable situational communicative setting will be dealt with in a future stage of the project using theory-driven evaluation and experimental validation.
Given our assumptions about the nature of the mapping between features and pregisters and between pregisters and documents, an obvious algorithm to use is Bayesian inference in the form of Latent Dirichlet Allocation (LDA; Blei et al. 2003; Blei 2012) as used in Topic Modelling. In our approach, we deal with pregisters instead of topics and with distributions of lexico-grammatical surface features instead of lexical words. The LDA algorithm otherwise performs an exactly parallel inference task. We first show how we extended the COReX feature extraction framework (Bildhauer & Schäfer in prep.) developed at FU Berlin and the IDS Mannheim in order to provide a large enough number of features for the LDA algorithm to work. We then present first results and discuss how we tuned the LDA algorithm and the feature set to lead to interpretable results. In order to be able to interpret the pregisters found by LDA, we extract the documents which most strongly instantiate the inferred pregisters. We introduce the PreCOX20 sub-corpus of the DECOW German web corpus, in which those prototypical documents are collected for further analysis with respect to their situational communicative setting.
References:
Biber, D. (1988). Variation across Speech and Writing. CUP.
Biber, D. (1995). Dimensions of Register Variation: A Cross-Linguistic Comparison. CUP.
Bildhauer, F. & R. Schäfer (in prep.). Automatic register annotation and alternation modelling.
Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM 55(4), 77-84.
Blei, D. M., A. Y. Ng & M. I. Jordan (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research 3, 993-1022.
Schäfer, R. (2018). Probabilistic German Morphosyntax. Habilitation thesis. HU Berlin.
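As a rough illustration of the inference task described in the Schäfer & Bildhauer (2020) abstract, the following minimal sketch fits LDA to documents represented as bags of surface features and then ranks documents by how strongly they instantiate each inferred pregister. It is not the project's implementation: the abstract does not say which inference procedure or software was used, so collapsed Gibbs sampling is swapped in here as one common way to fit LDA. The function names, the feature labels, the hyperparameter values, and the toy documents are all invented for illustration.

```python
import random


def fit_lda_gibbs(docs, n_pregisters, n_features,
                  alpha=0.1, beta=0.1, n_iter=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative, not the project's code).

    docs: list of documents, each a list of feature ids (one per occurrence).
    Returns (theta, phi): per-document pregister mixtures and
    per-pregister feature distributions.
    """
    rng = random.Random(seed)
    K, V = n_pregisters, n_features
    doc_k = [[0] * K for _ in docs]        # pregister counts per document
    k_feat = [[0] * V for _ in range(K)]   # feature counts per pregister
    k_total = [0] * K                      # total occurrences per pregister
    assign = []                            # pregister of each feature occurrence

    # Random initial assignment of every feature occurrence to a pregister.
    for d, doc in enumerate(docs):
        zs = []
        for v in doc:
            k = rng.randrange(K)
            zs.append(k)
            doc_k[d][k] += 1
            k_feat[k][v] += 1
            k_total[k] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, v in enumerate(doc):
                # Remove the current assignment from the counts.
                k = assign[d][i]
                doc_k[d][k] -= 1
                k_feat[k][v] -= 1
                k_total[k] -= 1
                # Resample from the conditional distribution over pregisters.
                weights = [
                    (doc_k[d][j] + alpha)
                    * (k_feat[j][v] + beta) / (k_total[j] + V * beta)
                    for j in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                assign[d][i] = k
                doc_k[d][k] += 1
                k_feat[k][v] += 1
                k_total[k] += 1

    # Smoothed point estimates from the final counts.
    theta = [[(c + alpha) / (len(doc) + K * alpha) for c in doc_k[d]]
             for d, doc in enumerate(docs)]
    phi = [[(c + beta) / (k_total[k] + V * beta) for c in k_feat[k]]
           for k in range(K)]
    return theta, phi


def prototypical_documents(theta, k, top_n=2):
    """Indices of the documents that most strongly instantiate pregister k."""
    ranked = sorted(range(len(theta)), key=lambda d: theta[d][k], reverse=True)
    return ranked[:top_n]


# Hypothetical feature inventory and toy documents (not project data).
FEATURES = ["passive", "nominalization", "second_person_pronoun", "emoticon"]
toy_docs = [
    [0] * 6 + [1] * 6,
    [0] * 5 + [1] * 7,
    [2] * 6 + [3] * 6,
    [2] * 7 + [3] * 5,
    [0] * 3 + [1] * 3 + [2] * 3 + [3] * 3,  # a mixed document
]

theta, phi = fit_lda_gibbs(toy_docs, n_pregisters=2, n_features=len(FEATURES))
for k in range(2):
    print("pregister", k, "prototypical documents:",
          prototypical_documents(theta, k))
```

Each row of theta is a probability distribution over pregisters for one document, which is what allows a single document to mix several pregisters. Each row of phi is a distribution over surface features for one pregister. Ranking documents by a column of theta mirrors the idea of collecting prototypical documents for later inspection.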
Resources (re-)used by A04
Details: [ViVo] [URL]
Access: webcorpora.org (free access for academic use)
Engines: NoSkE and RStudio Server
Used by: A04, B01
PreCOXX25: Register-annotated German web corpus
Type: Corpus
Access: webcorpora.org (free access for academic use)
Engines: NoSkE and RStudio Server
Used by: A04
RStudio Server
Type: Software publication
Details: [ViVo] [URL]
RStudio is an integrated development environment (IDE) for R. It includes a console, a syntax-highlighting editor that supports direct code execution, and tools for plotting, history, debugging, and workspace management.
Used by: A04, A07, C03, INF
Featured Bachelor & Master's Theses
Boeke, S. (2021). Funktionen des Vorgangspassivs im Deutschen. BA thesis. Humboldt-Universität zu Berlin.