Journal News | SSCI Journal Corpus Linguistics and Linguistic Theory, 2022, Issues 1-2
2022-11-02
Corpus Linguistics and Linguistic Theory
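Corpus Linguistics and Linguistic Theory (SSCI Q2, 2021 IF: 2.143) published 14 articles in Issues 1-2 of 2022. The research articles cover tag questions, pronouns in the transitive construction, Chinese light verbs, the diachronic development of scientific English, Chinese locative constructions and space particles, language production experiments, Chinese causative constructions, the acquisition of the English progressive construction, neural networks, lexical borrowing, automated text detection, and text-to-speech technology.
Contents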
Issue 1
■ A corpus-based analysis of meaning variations in German tag questions: Evidence from spoken and written conversational corpora, by Yulia Clausen, Tatjana Scheffler, Pages 1-31.
■ Effects of task and corpus-derived association scores on the online processing of collocations, by Kyla McConnell, Alice Blumenthal-Dramé, Pages 33-76.
■ Conservation in ongoing analogical change: The measurement and effect(s) of token frequency, by Anne Krause-Lerche, Pages 77-114.
■ Syntactico-semantic realizations of pronouns in the English transitive construction: A corpus-based analysis, by Haerim Hwang, Pages 115-143.
■ Light verb variations and varieties of Mandarin Chinese: Comparable corpus driven approaches to grammatical variations, by Hongzhi Xu, Menghan Jiang, Jingxia Lin, Chu-Ren Huang, Pages 145-173.
■ Toward an optimal code for communication: The case of scientific English, by Stefania Degaetano-Ortlieb, Elke Teich, Pages 175-207.
Issue 2
■ Words, constructions and corpora: Network representations of constructional semantics for Mandarin space particles, by Alvin Cheng-Hsien Chen, Pages 209-235.
■ Language production experiments as tools for corpus construction: A contrastive study of complementizer agreement, by Matthias Fingerhuth, Ludwig Maximilian Breuer, Pages 237-262.
■ Profiling the Chinese causative construction with rang (讓), shi (使) and ling (令) using frame semantic features, by Andreas Liesenfeld, Meichun Liu, Chu-Ren Huang, Pages 263-306.
■ Development of the progressive construction in Chinese EFL learners’ written production: From prototypes to marginal members, by Tianqi Wu, Min Wang, Pages 307-335.
■ A connectionist approach to analogy. On the modal meaning of periphrastic do in Early Modern English, by Sara Budts, Pages 337-364.
■ Phraseology in a cross-linguistic perspective: A diachronic and corpus-based account, by Gisle Andersen, Pages 365-389.
■ Using automated methods to explore the social stratification of anglicisms in Spanish, by Jacqueline Serigos, Pages 391-418.
■ The Information Structure–prosody interface in text-to-speech technologies. An empirical perspective, by Mónica Domínguez, Mireia Farrús, Leo Wanner, Pages 419-445.
Abstracts
A corpus-based analysis of meaning variations in German tag questions: Evidence from spoken and written conversational corpora
Yulia Clausen, Tatjana Scheffler
Abstract This paper addresses semantic/pragmatic variability of tag questions in German and makes three main contributions. First, we document the prevalence and variety of question tags in German across three different types of conversational corpora. Second, by annotating question tags according to their syntactic and semantic context, discourse function, and pragmatic effect, we demonstrate the existing overlap and differences between the individual tag variants. Finally, we distinguish several groups of question tags by identifying the factors that influence the speakers’ choices of tags in the conversational context, such as clause type, function, speaker/hearer knowledge, as well as conversation type and medium. These factors provide the limits of variability by constraining certain question tags in German against occurring in specific contexts or with individual functions.
Keywords German; tag questions; discourse functions; pragmatic variability; corpus annotation
Effects of task and corpus-derived association scores on the online processing of collocations
Kyla McConnell, Alice Blumenthal-Dramé
Abstract In the following self-paced reading study, we assess the cognitive realism of six widely used corpus-derived measures of association strength between words (collocated modifier–noun combinations like vast majority): MI, MI3, Dice coefficient, T-score, Z-score, and log-likelihood. The ability of these collocation metrics to predict reading times is tested against predictors of lexical processing cost that are widely established in the psycholinguistic and usage-based literature, respectively: forward/backward transition probability and bigram frequency. In addition, the experiment includes the treatment variable of task: it is split into two blocks which only differ in the format of interleaved comprehension questions (multiple choice vs. typed free response). Results show that the traditional corpus-linguistic metrics are outperformed by both backward transition probability and bigram frequency. Moreover, the multiple-choice condition elicits faster overall reading times than the typed condition, and the two winning metrics show stronger facilitation on the critical word (i.e. the noun in the bigrams) in the multiple-choice condition. In the typed condition, we find an effect that is weaker and, in the case of bigram frequency, longer lasting, continuing into the first spillover word. We argue that insufficient attention to task effects might have obscured the cognitive correlates of association scores in earlier research.
Keywords collocations; cognitive realism; association scores; task effects; self-paced reading
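For readers unfamiliar with the measures named in this abstract, the following is a minimal sketch of how the six association scores and the two transition probabilities are commonly computed from bigram counts. It uses standard textbook formulas, not necessarily the exact variants used by the authors; the function name `association_scores` and the toy counts are assumptions made for illustration only.

```python
import math


def association_scores(o11, f1, f2, n):
    """Common association measures for a bigram (w1, w2).

    o11: observed frequency of the bigram
    f1:  frequency of w1 (e.g. the modifier)
    f2:  frequency of w2 (e.g. the noun)
    n:   total number of bigrams in the corpus
    """
    e11 = f1 * f2 / n  # expected bigram frequency under independence

    # Full 2x2 contingency table, needed for log-likelihood (G2).
    observed = [o11, f1 - o11, f2 - o11, n - f1 - f2 + o11]
    expected = [
        f1 * f2 / n,
        f1 * (n - f2) / n,
        (n - f1) * f2 / n,
        (n - f1) * (n - f2) / n,
    ]
    g2 = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

    return {
        "MI": math.log2(o11 / e11),
        "MI3": math.log2(o11 ** 3 / e11),
        "Dice": 2 * o11 / (f1 + f2),
        "T-score": (o11 - e11) / math.sqrt(o11),
        "Z-score": (o11 - e11) / math.sqrt(e11),
        "log-likelihood": g2,
        "forward TP": o11 / f1,   # P(w2 | w1)
        "backward TP": o11 / f2,  # P(w1 | w2)
    }


# Invented toy counts, not taken from the study.
scores = association_scores(o11=30, f1=100, f2=60, n=10_000)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

All eight values derive from the same four counts, so they differ only in how they weight observed against expected frequency.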
Conservation in ongoing analogical change: The measurement and effect(s) of token frequency
Anne Krause-Lerche
Abstract In a number of studies of analogical levelling, it has been found that the conservation of irregular formation patterns is typically correlated with the token frequency of the members of a changing class. Interestingly, although it was suggested decades ago that this “conserving effect” of high token frequency may also affect ongoing analogical change, only one case of a change-in-progress in morphology has been investigated so far. Moreover, instead of scrutinizing the concept of frequency, previous research has largely taken the importance of lemma token frequency for granted. The present contribution analyses a case of ongoing analogical levelling in the formation of the imperative singular of German strong verbs with e/i-gradation. A corpus-based study is used to test whether the phenomenon is rightly classified as ongoing change and whether and which frequency variables can explain the trajectory of this change. Evidence is presented that justifies the assumption of a conserving effect of token frequency in ongoing morphological change; however, the study stresses the importance of reconsidering the concept of frequency for different languages and different phenomena of change because even measures like lemma token frequency are not as indisputable as they seem.
Keywords change-in-progress; frequency; persistence; web corpus; analogical levelling
Syntactico-semantic realizations of pronouns in the English transitive construction: A corpus-based analysis
Haerim Hwang
Abstract Pronouns serve as early linguistic cues for the acquisition of the English transitive construction (TC), but previous research has been limited to first language (L1) settings. This study focuses on TC input in the English as a foreign language (EFL) context, investigating syntactico-semantic differences in realizations of TC arguments, particularly pronouns, between L1 parental input and Korean EFL input. To this end, four corpora were created by collecting spoken data from L1-English parents talking to their children, L1-Korean EFL teachers, L1-English EFL teachers, and auditory EFL textbooks. From these corpora, transitive clauses were extracted so that their arguments could be categorized. Mixed-effects negative binomial regression analyses and hierarchical cluster analyses (preceded by principal component analyses) showed that in the realization of TC arguments, Korean EFL input differs syntactically and semantically from L1-English parental input, both for the subjects and objects of TCs. The syntactic difference was particularly pronounced for objects, where fewer pronouns were observed in the EFL input than in the L1-English parental input. Semantically, co-occurrence regularities between transitive verbs and arguments were identified only in the L1-English input and not in the EFL input. Pedagogical implications of the findings are also discussed.
Keywords pronoun; transitive construction; input; usage-based approach; Korean EFL setting
Light verb variations and varieties of Mandarin Chinese: Comparable corpus driven approaches to grammatical variations
Hongzhi Xu, Menghan Jiang, Jingxia Lin, Chu-Ren Huang
Abstract This article presents a classification and clustering based study to account for the differences among five Chinese light verbs (congshi, gao, jiayi, jinxing, and zuo) as well as their variations in Mainland Mandarin (ML) and Taiwan Mandarin (TW). Based on 13 linguistic features, both competition and co-development of these light verbs are studied in terms of their distinct and shared collocates. The proposed method discovers significant new grammatical differences in addition to confirming previously reported ones. Most significant discoveries include selectional restrictions differentiating deverbal nominals and event nouns, and degrees of transitivity of VO compounds. We also find that most variations between Mainland Mandarin and Taiwan Mandarin are in fact differences in tendencies or preferences in contexts of usage of shared grammatical rules.
Keywords Chinese light verbs; language variations; clustering; classification; corpus approach
Toward an optimal code for communication: The case of scientific English
Stefania Degaetano-Ortlieb, Elke Teich
Abstract We present a model of the linguistic development of scientific English from the mid-seventeenth to the late-nineteenth century, a period that witnessed significant political and social changes, including the evolution of modern science. There is a wealth of descriptive accounts of scientific English, both from a synchronic and a diachronic perspective, but only few attempts at a unified explanation of its evolution. The explanation we offer here is a communicative one: while external pressures (specialization, diversification) push for an increase in expressivity, communicative concerns pull toward convergence on particular options (conventionalization). What emerges over time is a code which is optimized for written, specialist communication, relying on specific linguistic means to modulate information content. As we show, this is achieved by the systematic interplay between lexis and grammar. The corpora we employ are the Royal Society Corpus (RSC) and for comparative purposes, the Corpus of Late Modern English (CLMET). We build various diachronic, computational n-gram language models of these corpora and then apply formal measures of information content (here: relative entropy and surprisal) to detect the linguistic features significantly contributing to diachronic change, estimate the (changing) level of information of features and capture the time course of change.
Keywords diachronic change; scientific English; Kullback–Leibler Divergence; surprisal
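As a rough illustration of the two information measures named in this abstract, the sketch below computes relative entropy (Kullback–Leibler divergence) between two word distributions, together with per-word contributions and surprisal. It is a simplification under stated assumptions: it uses add-one smoothed unigram distributions rather than the n-gram language models described by the authors, and the two token lists are invented toy data, not material from the RSC or CLMET. All function names are hypothetical.

```python
import math
from collections import Counter


def smoothed_distribution(tokens, vocab, alpha=1.0):
    """Unigram probabilities over a shared vocabulary with additive smoothing."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_contributions(p, q):
    """Per-word terms p(w) * log2(p(w) / q(w)); their sum is D_KL(P || Q) in bits."""
    return {w: p[w] * math.log2(p[w] / q[w]) for w in p}


def surprisal(prob):
    """Surprisal in bits of an event with probability prob."""
    return -math.log2(prob)


# Invented toy "periods"; real use would require large time-sliced corpora.
early = "the experiment was made by me and i saw the effect".split()
late = "the reaction rate of the solution was measured".split()

vocab = sorted(set(early) | set(late))
p_early = smoothed_distribution(early, vocab)
p_late = smoothed_distribution(late, vocab)

contrib = kl_contributions(p_late, p_early)
kl_late_vs_early = sum(contrib.values())

# Words that contribute most to the divergence of the later period.
top_words = sorted(contrib, key=contrib.get, reverse=True)[:3]
print(kl_late_vs_early, top_words)
print(surprisal(p_late["the"]))
```

Ranking words by their contribution to the divergence is one simple way to see which features distinguish one period from another. Smoothing keeps the divergence finite when a word is absent from one period.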
Words, constructions and corpora: Network representations of constructional semantics for Mandarin space particles
Alvin Cheng-Hsien Chen
Abstract In this study, we aim to demonstrate the effectiveness of network science in exploring the emergence of constructional semantics from the connectedness and relationships between linguistic units. With Mandarin locative constructions (MLCs) as a case study, we extracted constructional tokens from a representative corpus, including their respective space particles (SPs) and the head nouns of the landmarks (LMs), which constitute the nodes of the network. We computed edges based on the lexical similarities of word embeddings learned from large text corpora and the SP-LM contingency from collostructional analysis. We address three issues: (1) For each LM, how prototypical is it of the meaning of the SP? (2) For each SP, how semantically cohesive are its LM exemplars? (3) What are the emerging semantic fields from the constructional network of MLCs? We address these questions by examining the quantitative properties of the network at three levels: microscopic (i.e., node centrality and local clustering coefficient), mesoscopic (i.e., community) and macroscopic properties (i.e., small-worldness and scale-free). Our network analyses bring to the foreground the importance of repeated language experiences in the shaping and entrenchment of linguistic knowledge.
Keywords usage-based grammar; collocation; collostruction analysis; network analysis; space particles; construction grammar
Language production experiments as tools for corpus construction: A contrastive study of complementizer agreement
Matthias Fingerhuth, Ludwig Maximilian Breuer
Abstract The investigation of linguistic phenomena in corpora of spontaneous speech is sometimes hindered by corpus size or by the complexity of the factors influencing their occurrence. Language Production Experiments (LPEs) can specifically elicit such phenomena and can therefore be used to build corpora that allow for their investigation. Yet experiments are a wide category that covers very different tasks, and there is little empirical research that compares speakers’ response behavior to different task types. In this paper, we compare the responses of a group of 22 speakers to a translation task and a completion task, both of which target the syntactic phenomenon complementizer agreement (CA). The results indicate that both experimental methods offer legitimate ways to investigate the phenomenon with specific advantages and disadvantages. However, a comparison of results from both tasks allows for insights that a single task could not have provided.
Keywords syntax; language production experiments; complementizer agreement; corpus construction
Profiling the Chinese causative construction with rang (讓), shi (使) and ling (令) using frame semantic features
Andreas Liesenfeld, Meichun Liu, Chu-Ren Huang
Abstract This behavioural profiling (BP) study examines the use of the near-synonyms rang (讓), shi (使) and ling (令), three ways to express cause-effect relationships in Chinese. Instead of using an out-of-the-box BP design, we present a modified approach to profiling that includes a range of frame semantic features that aim to capture variation of slot fillers of this construction. The study investigates the intricate semantic variation of rang, shi and ling through a comprehensive analysis of 38 contextual features (ID tags) that characterize the collocational, lexical semantic and frame semantic environment of the near-synonyms. Our dataset consists of around 100,000 data points based on the annotation of 1002 sentences of Mandarin Chinese of three varieties. The BPs of each near-synonym are compared using multidimensional scaling and hierarchical cluster analysis. The results show that rang, shi and ling are each characterized by a combination of distinctive features and how different feature types contribute to setting the near-synonyms apart based on their usage patterns. Methodologically, this study illustrates how behavioural profiling can be modified to include frame semantic features in accordance with the method’s emphasis on producing empirically verifiable results and how these features can aid a comparative analysis of near-synonyms.
Keywords behavioural profiling; construction grammar; frame semantics; near-synonymy; causative constructions; Mandarin Chinese
Development of the progressive construction in Chinese EFL learners’ written production: From prototypes to marginal members
Tianqi Wu, Min Wang
Abstract This study investigates the developmental trajectory of L2 English progressive construction with a focus on frequency, verb-construction contingency and semantic prototypicality. Comparisons were made on the use of the progressive construction in argumentative essays written by Chinese learners at three different proficiency levels and English native speakers. Data of frequency and verb type distribution indicate that L2 learners’ progressive repertoire showed an increase in productivity and variability and a spread from a fixed type to a wider range of verbs. Contingency data demonstrate that, when associating verbs with the progressive, learners’ preference shifted from prototypical progressive verbs which denote specific and dynamic meanings to more marginal members represented by generic verbs. In addition, semantic prototypicality overweighs generality in driving the development of the progressive, which presents an interesting contrast with findings in the verb-argument construction learning literature where semantically general verbs were first predominantly used in the construction.
Keywords the progressive; construction learning; learner corpora; second language acquisition; Chinese EFL learners
A connectionist approach to analogy. On the modal meaning of periphrastic do in Early Modern English
Sara Budts
Abstract This paper innovatively charts the analogical influence of the modal auxiliaries on the regulation of periphrastic do in Early Modern English by means of Convolutional Neural Networks (CNNs), a flavour of connectionist models known for their applications in computer vision. CNNs can be harnessed to model the choice between competitors in a linguistic alternation by extracting not only the contexts a construction occurs in, but also the contexts it could have occurred in, but did not. Bearing on the idea that two forms are perceived as similar if they occur in similar contexts, the models provide us with pointers towards potential loci of analogical attraction that would be hard to retrieve otherwise. Our analysis reveals clear functional overlap between do and all modals, indicating not only that analogical pressure was highly likely, but even that affirmative declarative do functioned as a modal auxiliary itself throughout the late 16th century.
Keywords neural networks; connectionism; analogy; modality; periphrastic do
Phraseology in a cross-linguistic perspective: A diachronic and corpus-based account
Gisle Andersen
Abstract English exerts great influence on other languages at the lexical level, as seen from extensive borrowing of terminology and everyday words into many languages (i.e. Anglicisms such as swap, blog, etc.). Although much less studied, it is also clear that the “phrasicon” (Granger, Sylviane. 2009. Comment on: learner corpora: A window onto the L2 phrasicon. In Andy Barfield & Henrik Gyllstad (eds.), Researching collocations in another language. Multiple interpretations, 60–65. Houndmills: Palgrave Macmillan) of a language can similarly be affected by such external influence. This paper investigates “the largely unexplored area of phraseological borrowing” (Fiedler, Sabine (2017) Phraseological borrowing from English into German: Cultural and pragmatic implications. Journal of Pragmatics 113: 89–102, 90) by introducing the diachronic-contrastive corpus method and exemplifying it with reference to a set of expressions that have been considered the products of language contact between English and Norwegian. I argue that the proposed corpus method can be used efficiently for investigating phraseology across time, for shedding light on the question of whether cross-linguistically parallel structures are the result of borrowing or parallel developments, and – importantly – as a vehicle for rejecting preconceived ideas about a form’s alleged origin in English.
Keywords phraseology; corpus pragmatics; Norwegian; contrastive corpus analysis; diachrony
Using automated methods to explore the social stratification of anglicisms in Spanish
Jacqueline Serigos
Abstract Traditionally, automated methods for loanword detection have not received an abundance of attention within the field of language contact. However, as research on loanwords has begun utilizing corpora with word counts in the millions, these generous quantities of data pose challenges for traditional methods of linguistic annotation. This paper presents a method for automatically detecting anglicisms within Spanish text and presents a case study, applying this method to explore the social stratification of anglicisms in Argentine media. The findings of the case study suggest that anglicisms may function as prestige markers in Argentina, which may be a logical consequence of the mode of contact: those of upper socio-economic status have greater access to outlets where loanwords seem to emerge, such as the media, Internet, and second language education.
Keywords loanwords; language contact; sociolinguistics; automated methods; generalized linear model
The Information Structure–prosody interface in text-to-speech technologies. An empirical perspective
Mónica Domínguez, Mireia Farrús, Leo Wanner
Abstract The correspondence between the communicative intention of a speaker in terms of Information Structure and the way this speaker reflects communicative aspects by means of prosody has been a fruitful field of study in Linguistics. However, text-to-speech applications still lack the variability and richness found in human speech in terms of how humans display their communication skills. Some attempts were made in the past to model one aspect of Information Structure, namely thematicity for its application to intonation generation in text-to-speech technologies. Yet, these applications suffer from two limitations: (i) they draw upon a small number of made-up simple question-answer pairs rather than on real (spoken or written) corpus material; and (ii) they do not explore whether any other interpretation would better suit a wider range of textual genres beyond dialogs. In this paper, two different interpretations of thematicity in the field of speech technologies are examined: the state-of-the-art binary (and flat) theme-rheme, and the hierarchical thematicity defined by Igor Mel’čuk within the Meaning-Text Theory. The outcome of the experiments on a corpus of native speakers of US English suggests that the latter interpretation of thematicity has a versatile implementation potential for text-to-speech applications of the Information Structure–prosody interface.
Keywords communicative structure; information structure; intonation; prosody; rheme; specifier; thematicity; theme; ToBI
About the Journal
Corpus Linguistics and Linguistic Theory (CLLT) is a peer-reviewed journal publishing high-quality original corpus-based research focusing on theoretically relevant issues in all core areas of linguistic research, or other recognized topic areas. It provides a forum for researchers from different theoretical backgrounds and different areas of interest that share a commitment to the systematic and exhaustive analysis of naturally occurring language. Contributions from all theoretical frameworks are welcome but they should be addressed at a general audience and thus be explicit about their assumptions and discovery procedures and provide sufficient theoretical background to be accessible to researchers from different frameworks.
Topics
Corpus Linguistics; Quantitative Linguistics; Phonology; Morphology; Semantics; Syntax; Pragmatics
Official website: https://www.degruyter.com/journal/key/cllt/html
Source: CLLT official website
Editor: young
Reviewer: 心得小蔓