Journal News | SSCI Journal: Journal of Phonetics, 2022, Volumes 91-92
JOURNAL OF PHONETICS
Volume 91-92, March 2022
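Journal of Phonetics (SSCI Q1; 2021 IF: 2.44) published 6 articles in Volume 91 (2022), on topics including stress and sociophonetics. Volume 92 (2022) contains 5 articles, on topics including retroflex fricatives, stress, and phonetic variation.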
Contents
Research Articles, Volume 91
■ The online effect of clash is durational lengthening, not prominence shift: Evidence from Italian, by Francesco Burroni, Sam Tilsen.
■ Reactive feedback control and adaptation to perturbed speech timing in stressed and unstressed syllables, by Miriam Oschkinat, Philip Hoole.
■ The phonetics of sociophonetics: Validating acoustic approaches to Spanish /s/, by Michael S. Gradoville, Earl Kjar Brown, Richard J. File-Muriel.
■ Analysing the relationship between L2 production and different stages of L2 processing: Eye-tracking and acoustic evidence for a novel contrast, by James Turner.
■ Theorizing positive transfer in cross-linguistic speech perception: The Acoustic-Attentional-Contextual hypothesis, by William Choi.
■ Voicing in Qaqet: Prenasalization and language contact, by Marija Tabain, Marc Garellek, Birgit Hellwig, Adele Gregory, Richard Beare.
Research Articles, Volume 92
■ Neural representations for modeling variation in speech, by Martijn Bartelds, Wietse de Vries, Faraz Sanal, Caitlin Richter, Mark Liberman, Martijn Wieling.
■ Phonetic imitation of the acoustic realization of stress in Spanish: Production and perception, by Bethany MacLeod, Sabrina M. Di Lonardo Burr.
■ Voicing and frication at the phonetics-phonology interface: An acoustic study of Greek, Serbian, Russian, and English, by Christina Bjorndahl.
■ Extreme stop allophony in Mixtec spontaneous speech: Data, word prosody, and modelling, by Christian DiCanio, Wei-Rong Chen, Joshua Benn, Jonathan D. Amith, Rey Castillo García.
■ Noise-based acoustic features of Polish retroflex fricatives in children with normal pronunciation and speech disorder, by Zuzanna Miodonska, Pawel Badura, Natalia Mocko.
Abstracts
The online effect of clash is durational lengthening, not prominence shift: Evidence from Italian
Francesco Burroni, Sam Tilsen, Cornell University, Department of Linguistics, Morrill Hall 203, 14850 Ithaca, NY, USA
Abstract A fundamental question about speech is whether it is governed by rhythmic constraints. One phenomenon that may support the existence of such constraints is the rhythm rule, a phonological pattern hypothesized to resolve prominence clashes and enforce alternations of prominent and non-prominent syllables via shift/deletion of stress and/or pitch accents. We evaluated evidence for the rhythm rule by studying the acoustic correlates of clash in two experiments with speakers of Italian. We found that the first prominent syllable in a clash displays a durational increase and more extreme formant values, when compared to no clash. Thus, a clash is manifested as a localized decrease in speech rate, not as a change to the prominence profile of a word. Since durational increases have been reported for other languages, we argue that they are an online acoustic correlate of clash. We compare two dynamical models of the durational effects, rooted in the framework of Articulatory Phonology: a π-gesture model and a feedback modulation model. Based on our findings, we argue that the rhythm rule is best conceptualized as the result of contextual biases on lexical selection of prominence patterns.
Key words Speech rhythm, Prominence, Stress, Clash, Speech production, Speech planning, Italian
Reactive feedback control and adaptation to perturbed speech timing in stressed and unstressed syllables
Miriam Oschkinat, Philip Hoole, Institute of Phonetics and Speech Processing, Ludwig Maximilian University Munich, Schellingstrasse 3, 80799 Munich, Germany
Abstract This study examines speakers’ reaction to focally applied temporal real-time auditory feedback perturbation in a word-initial unstressed syllable (Unstressed condition) and a similar word-medial stressed syllable (Stressed condition) in a three-syllabic word. Speakers compensate locally in both conditions for the perturbed syllable’s nucleus (V; compressed by the perturbation) but not for the complex onsets (CC; stretched by the perturbation). The perturbation of the first, unstressed syllable causes a global slowing down of all segments following the perturbation (syllable two and three), while the perturbation in the Stressed condition elicits local adjustments only in the perturbed (second) syllable. When viewed in a larger prosodic context, the timing strategy in the Unstressed condition indicates that speakers aim to keep relative durations within the word constant when the word-initial onset is auditorily stretched, leading to a compensatory pattern for both CC and V in word-proportional durations. In the Stressed condition, increasing the stressed vowel’s duration seems to be of the highest priority, causing all other segments to take up a shorter portion within the word. Adaptation effects of the stressed vowel indicate a durational representation on the segment level. Further adaptation effects additionally suggest a representation of timing/coordination in larger prosodic units. Complementary investigation of aperiodicity, spectral skewness, and intensity (RMS) indicates that spectral properties can change along with compensatorily increased duration.
The phonetics of sociophonetics: Validating acoustic approaches to Spanish /s/
Michael S. Gradoville, Arizona State University, United States
Earl Kjar Brown, Brigham Young University, United States
Richard J. File-Muriel, University of New Mexico, United States
Abstract In many varieties of Spanish, syllable- and word-final /s/ is subject to a process of reduction from sibilance [s] through aspiration [h] to deletion. Sociolinguistic studies have traditionally used a three-way classification scheme on the basis of impressionistic coding; however, in the last decade, instrumental acoustic measurements have been favored. The present study examines several potential acoustic correlates of Spanish /s/ variants, including the often-used center of gravity, in their ability to faithfully represent the original perceptually-based observation of variation. The results indicate that many measurements can capture the contrast between sibilance [s] and aspiration [h]; however, fewer measurements (center of gravity after high-pass filter, skewness after high-pass filter, intensity without high-pass filter, zero crossing rate after high-pass filter, mel-frequency cepstrum coefficient 1 without high-pass filter) are also capable of detecting the contrast between aspiration [h] and deletion without making a priori assumptions about the appearance of non-zero variants. We propose that a combination of these measurements using Principal Component Analysis, which extracts the commonalities in the measurements, better represents the [s] > [h] > 0 cline than any one measurement by itself. We discuss the need for stricter evaluations of acoustic correlates of sociophonetic categories, especially regarding consonantal variation.
Key words Spanish /s/ realization, Reduction, Impressionistic coding, Instrumental acoustic measurements, Validity, Sociophonetics
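Editor's sketch: for readers unfamiliar with two of the measures named in the abstract above, the Python below shows one textbook way to compute a spectral center of gravity and a zero-crossing rate from raw samples. It is an illustration only, not the authors' pipeline. The function names, the naive DFT, the power weighting, and the `cutoff_hz` parameter (which simply discards low-frequency bins as a crude stand-in for a real high-pass filter) are all assumptions made here.

```python
import cmath
import math


def power_spectrum(samples, sample_rate):
    """Return (frequencies, powers) for the non-negative half of a naive DFT."""
    n = len(samples)
    freqs, powers = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)
        )
        freqs.append(k * sample_rate / n)
        powers.append(abs(coeff) ** 2)
    return freqs, powers


def center_of_gravity(samples, sample_rate, cutoff_hz=0.0):
    """Power-weighted mean frequency, ignoring bins below cutoff_hz.

    Assumption: dropping bins is a crude substitute for high-pass filtering.
    """
    freqs, powers = power_spectrum(samples, sample_rate)
    kept = [(f, p) for f, p in zip(freqs, powers) if f >= cutoff_hz]
    total = sum(p for _, p in kept)
    if total == 0:
        return 0.0
    return sum(f * p for f, p in kept) / total


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / (len(samples) - 1)
```

For a signal mixing equal-amplitude 500 Hz and 3000 Hz sinusoids, this center of gravity falls midway at about 1750 Hz, and rises to about 3000 Hz once bins below 1000 Hz are ignored.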
Analysing the relationship between L2 production and different stages of L2 processing: Eye-tracking and acoustic evidence for a novel contrast
James Turner, University of Southampton, Southampton, Hampshire SO17 1BF, United Kingdom
Abstract This study analyses the relationship between native English speakers’ perception and production of the novel French /y/–/u/ contrast. Acoustic data were extracted from the learners’ production of French minimal pairs contrasting these French vowels and compared with their processing of the same items in a Visual World eye-tracking task. Results reveal that the vowel most acoustically similar to the learners’ native English /u/ vowel, French /y/, is both easier to identify at early processing stages and more acoustically similar to a native French control group in production, indicating a perception-production relationship. Furthermore, analyses of individual variation reveal that the learners who process both /y/ and /u/ more successfully at later processing stages are also more likely to mark a greater distinction between these phonemes in production. Together, these results indicate a relationship between L2 processing and L2 production at multiple levels. Implications for current L2 speech models are discussed.
Key words L2 speech perception, L2 speech production, Online processing, Acoustic phonetics, L2 phonology, Eye-tracking, Visual World Paradigm
Theorizing positive transfer in cross-linguistic speech perception: The Acoustic-Attentional-Contextual hypothesis
William Choi, Academic Unit of Human Communication, Development, and Information Sciences, The University of Hong Kong, Hong Kong
Abstract Can non-natives outperform natives on speech discrimination? Surprisingly, Cantonese listeners discriminated English stress more accurately than did English listeners. To ascertain its generalizability, I further ask whether this Cantonese advantage in English stress discrimination is equally potent across pitch accent and vowel reduction contexts. Sixty Cantonese and English listeners completed four blocks of English stress discrimination task with varying pitch accent and vowel reduction contexts. In the absence of rising pitch accent pattern and vowel reduction, the Cantonese listeners outperformed the English listeners on English stress discrimination. However, the Cantonese advantage disappeared when either rising pitch accent pattern or vowel reduction was present. When both rising pitch accent pattern and vowel reduction were present, the Cantonese listeners even performed poorer than the English listeners. The findings underscore two constraints of the Cantonese advantage in English stress discrimination—rising pitch accent pattern and vowel reduction. Based on collective research on non-native advantage in speech perception, the Acoustic-Attentional-Contextual hypothesis is proposed.
Key words Stress, Tone, Positive transfer, Vowel reduction, Pitch accent, Perception
Voicing in Qaqet: Prenasalization and language contact
Marija Tabain, La Trobe University, Melbourne, Australia
Marc Garellek, University of California, San Diego, San Diego, United States
Birgit Hellwig, University of Cologne, Cologne, Germany
Adele Gregory, La Trobe University, Melbourne, Australia
Richard Beare, Monash University, and Murdoch Children's Research Institute, Melbourne, Australia
Abstract Qaqet is a non-Austronesian Baining language of Papua New Guinea, with a very small phoneme inventory of 16 consonants and four vowels, including the voiced stops /b d ɡ/. These stops are often phonetically realized as prenasalized [mb nd ŋɡ], and this feature is assumed to be a result of language contact with surrounding Oceanic languages. Our data consist of isolated word recordings from six female speakers of the language. Using a range of acoustic measures, we compare these phonetically prenasalized stops with non-prenasalized tokens of the same /b d ɡ/ phonemes, and with the nasal phonemes proper /m n ŋ/, as well as with the unaspirated voiceless stops /p t k/.
In general we find that the nasal murmur of the phonetically prenasalized stop does not fully resemble the nasal murmur of the nasal consonant proper – instead the phonetically prenasalized stop patterns between the nasal consonant and the phonetically non-prenasalized voiced stop. However, there is a clear place effect, whereby the phonetically prenasalized velar stop patterns more closely with the nasal consonants, and the phonetically prenasalized bilabial stop patterns more closely with the phonetically non-prenasalized voiced stops – with phonetically prenasalized alveolar in between the bilabial and the velar. This is particularly reflected in the distribution of energy below and above about 350 Hz. However, measures of voicing strength suggest that voicing for the velar is weaker across all manners of articulation, in line with the general difficulty of maintaining voicing at this place of articulation. We conclude that prenasalization of the voiced stops largely serves to maintain voicing for the velar place of articulation; and that if the feature of prenasalization was borrowed from neighbouring languages, it was to maintain voicing for long closure durations in a true voicing language, particularly at places of articulation where maintaining voicing is difficult.
Key words Voicing, Prenasalization, Language contact, Papuan languages, Acoustics of nasal consonants
Neural representations for modeling variation in speech
Martijn Bartelds, Wietse de Vries, Center for Language and Cognition, Faculty of Arts, University of Groningen, Groningen, The Netherlands
Faraz Sanal, Caitlin Richter, Mark Liberman, Department of Linguistics, University of Pennsylvania, Philadelphia, PA, USA
Martijn Wieling, Haskins Laboratories, New Haven, CT, USA
Abstract Variation in speech is often quantified by comparing phonetic transcriptions of the same utterance. However, manually transcribing speech is time-consuming and error prone. As an alternative, therefore, we investigate the extraction of acoustic embeddings from several self-supervised neural models. We use these representations to compute word-based pronunciation differences between non-native and native speakers of English, and between Norwegian dialect speakers. For comparison with several earlier studies, we evaluate how well these differences match human perception by comparing them with available human judgements of similarity. We show that speech representations extracted from a specific type of neural model (i.e. Transformers) lead to a better match with human perception than two earlier approaches on the basis of phonetic transcriptions and MFCC-based acoustic features. We furthermore find that features from the neural models can generally best be extracted from one of the middle hidden layers than from the final layer. We also demonstrate that neural speech representations not only capture segmental differences, but also intonational and durational differences that cannot adequately be represented by a set of discrete symbols used in phonetic transcriptions.
Key words Acoustic distance, Acoustic embeddings, Neural networks, Pronunciation variation, Speech, Transformers, Unsupervised representation learning
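Editor's sketch: the abstract above does not say how two recordings' frame-level embeddings are turned into a single pronunciation difference. One common choice for comparing sequences of unequal length is dynamic time warping (DTW), shown below purely as an assumption on our part. The function names, the Euclidean frame distance, and the normalization by the summed sequence lengths are illustrative choices, not details taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(seq_a, seq_b):
    """Length-normalized DTW cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] holds the cheapest alignment of the first i and first j frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    # Assumption: divide by the summed lengths so long words are not penalized.
    return cost[n][m] / (n + m)
```

In this sketch each inner list stands in for one frame of an embedding; a real model would produce vectors with hundreds of dimensions per frame.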
Phonetic imitation of the acoustic realization of stress in Spanish: Production and perception
Bethany MacLeod, School of Linguistics and Language Studies, Carleton University, Ottawa, ON, Canada
Sabrina M. Di Lonardo Burr, Department of Cognitive Science, Carleton University, Ottawa, ON, Canada
Abstract This study explores imitation of the acoustic realization of Spanish stress in disyllabic words produced in isolation, which is cued by three correlates: F0, duration, and intensity. Forty-eight native speakers of Mexican Spanish shadowed one of four model talkers of the same dialect. Differentials for each acoustic correlate of stress were generated by calculating the difference between the values of the first and second vowels for each of F0, duration, and intensity, for all recordings. Next, 87 Spanish speakers participated as listeners in a holistic perceptual assessment (4IAX task) of the shadowers’ productions. Bayesian mixed-effects modelling was performed for both the acoustic and perceptual data. The results showed that the shadowers imitated the model talkers on all three differentials, but made the greatest shifts on the F0 differential, followed by duration, shifting the least on intensity. Analysis of the perceptual pattern showed that the listeners perceived imitation and that the shadowers’ imitation on all three differentials contributed to the perceptual pattern. Lastly, the extent to which the listeners relied on imitation of the differentials aligned roughly, but not exactly, with how much the shadowers had converged on each differential, with listeners using imitation on duration the most, followed by F0, followed by intensity.
Key words Phonetic imitation, Phonetic convergence, Stress, Shadowing, Acoustic analysis, Perception, Spanish
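Editor's sketch: the differentials described in the abstract above amount to simple subtraction, which the Python below makes concrete. The `Token` class, the sign convention (first vowel minus second vowel), and the `shift_toward_model` helper, which assumes a pre-shadowing baseline recording and a difference-in-distance measure, are hypothetical illustrations and not the authors' procedure.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Acoustic measurements for the two vowels of one disyllabic word."""
    f0_hz: tuple          # (first vowel, second vowel)
    duration_ms: tuple    # (first vowel, second vowel)
    intensity_db: tuple   # (first vowel, second vowel)


def differentials(token):
    """First-vowel value minus second-vowel value for each stress correlate."""
    return {
        "f0": token.f0_hz[0] - token.f0_hz[1],
        "duration": token.duration_ms[0] - token.duration_ms[1],
        "intensity": token.intensity_db[0] - token.intensity_db[1],
    }


def shift_toward_model(baseline, shadowed, model):
    """Positive values mean the shadowed token moved closer to the model.

    Assumption: imitation is scored as the reduction in absolute distance
    between the speaker's differential and the model talker's differential.
    """
    b = differentials(baseline)
    s = differentials(shadowed)
    m = differentials(model)
    return {key: abs(b[key] - m[key]) - abs(s[key] - m[key]) for key in b}
```

A positive score on a correlate indicates convergence toward the model talker, zero indicates no change, and a negative score indicates divergence.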
Voicing and frication at the phonetics-phonology interface: An acoustic study of Greek, Serbian, Russian, and English
Christina Bjorndahl, Carnegie Mellon University, United States
Abstract This paper presents the results of an acoustic investigation of /f, v, s, z/ in Greek, Serbian, Russian, and English. The study is motivated by phonological considerations, specifically the cross-linguistic phonological identity of /v/ as either an obstruent (Greek, English), a sonorant (Serbian), or as a segment that patterns with both obstruents and sonorants (Russian). The investigation is framed in two ways, reflecting different interpretations of what it means for /v/ to be classified as part of the obstruent or sonorant system of a language: (1) a cross-linguistic comparison of /v/ tokens tackles the question of whether /v/ tokens are realized with frication indicative of an obstruent or sonorant realization; (2) within-language investigations into the relationship between voicing and frication type probe whether /f, v/ are a voicing pair analogous to /s, z/. Four acoustic measures are considered: duration, harmonicity, spectral centroid, and spectral energy difference. Furthermore, devoicing rates of /v, z/ are examined, contributing to our understanding of how aerodynamic tensions in voiced fricatives are resolved on a language-specific basis. Results show that in all languages, /z/ devoices more than /v/, but that otherwise fidelity to underlying voicing differs across languages. The cross-linguistic comparison of /v/ tokens and the within-language investigations suggests a partial correlation between phonological identity and phonetic realization, but that there is not a one-to-one relationship between phonological patterning and phonetic realization.
Key words Voiced fricatives, Obstruent devoicing, Spectral centroid, Spectral slope, Harmonicity, Russian /v/, Word-internal prosodic effects
Extreme stop allophony in Mixtec spontaneous speech: Data, word prosody, and modelling
Christian DiCanio, Department of Linguistics, University at Buffalo, Buffalo NY 14260, USA
Wei-Rong Chen, Haskins Laboratories, 300 George Street, New Haven CT 06511, USA
Joshua Benn, Department of Linguistics, University at Buffalo, Buffalo NY 14260, USA
Jonathan D. Amith, Department of Anthropology, Gettysburg College, 300 N Washington St, Gettysburg, PA 17325, USA
Rey Castillo García, Secretaría de educación pública (SEP), Guerrero, Mexico
Abstract Word-level prosody plays an important role in processes of consonant lenition. Typically, consonants in word-initial position are strengthened while those in word-medial position are lenited (Keating, Cho, Fougeron, & Hsu, 2003). In this paper we examine the relationship between word-prosodic position and obstruent lenition in a spontaneous speech corpus of Yoloxóchitl Mixtec, an endangered Mixtecan language spoken in Mexico. The language exhibits a surprising amount of lenition in the realization of otherwise voiceless unaspirated stops and voiceless fricatives in careful speech. In Experiment 1, we examine the relationships between word position, consonant duration, and passive voicing and find that word-medial pre-tonic position is the locus of both consonant lengthening and less passive voicing. Non-pre-tonic consonants are produced with more voicing and shorter duration. We also find that the functional status of the morpheme plays a role in voicing lenition. In Experiment 2, we examine manner lenition and find a similar pattern – word-medial pre-tonic stops are more often realized with complete closure relative to non-pre-tonic stops, which are more often realized with incomplete closure. In Experiment 3, we model these lenition patterns using a series of deep neural networks and find that, even with limited training data, we can achieve reasonably high accuracy in the automatic categorization of lenition patterns. The results of this research both complement recent work on the phonetics of lenition in the world’s languages (Katz and Fricke, 2018; White et al., 2020) and provide computational tools for modeling and predicting patterns of extreme lenition.
Key words Corpus phonetics, Speech reduction, Endangered languages, Mixtecan, Prosody
Noise-based acoustic features of Polish retroflex fricatives in children with normal pronunciation and speech disorder
Zuzanna Miodonska, Pawel Badura, Faculty of Biomedical Engineering, Silesian University of Technology, Roosevelta 40, 41-800 Zabrze, Poland
Natalia Mocko, Faculty of Humanities, Institute of Linguistics, University of Silesia, Sejmu Śląskiego 1, 40-001 Katowice, Poland
Abstract This study addresses Polish retroflex sibilants /ʂ, ʐ/ produced by preschool children. The aims of our research were (1) to explore acoustic characteristics of normal and distorted (dental and interdental) articulation patterns of retroflex fricatives and (2) to define and verify new acoustic features of frication noise. We extracted and analyzed a set of 80 acoustic features, including full-spectrum-based metrics (linear cepstral coefficients, mel-frequency cepstral coefficients, spectral moments) and noise-based metrics (noise energies, fricative formants, and original features: noise cepstral coefficients and fricative formant relations). The analysis involved linear mixed-effects models and Spearman’s rank correlation over speech samples from 42 Polish children (21 with normal and 21 with distorted pronunciation). Normal articulation of Polish retroflex sibilants proved to be acoustically distinguishable from non-normative interdental and dental articulation. Significant acoustic differences between the classes considered were found both in the full-spectrum-based and noise-based features, including our proposed measures (p < 0.05). Thirty-six of 80 analyzed features proved significantly correlated (Spearman’s |ρ| > 0.5, p < 0.05) with tongue position and front-cavity size. More evident cues for articulation pathologies were found in the voiceless sibilant /ʂ/ than in /ʐ/. Our study confirms that metrics describing the structure of frication noise bring information distinctive in particular articulatory oppositions for a more comprehensive acoustic description of sibilants.
Key words Sibilants, Speech disorders, Child speech, Frication noise, Cepstral coefficients, Polish
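Editor's sketch: Spearman's rank correlation, used in the abstract above to relate acoustic features to articulation, is the Pearson correlation of rank-transformed data. The Python below is a minimal standard-library illustration with average ranks for ties; the function names are ours, and the significance test reported in the abstract is not reproduced here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed samples."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

A feature would pass the magnitude threshold quoted in the abstract when `abs(spearman_rho(feature, articulation_score)) > 0.5`, where both argument names are placeholders for equal-length lists of per-child values.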
About the Journal
The Journal of Phonetics publishes papers of an experimental or theoretical nature that deal with phonetic aspects of language and linguistic communication processes. Papers dealing with technological and/or pathological topics, or papers of an interdisciplinary nature are also suitable, provided that linguistic-phonetic principles underlie the work reported. Regular articles, review articles, and letters to the editor are published. Themed issues are also published, devoted entirely to a specific subject of interest within the field of phonetics.
Official website: https://www.sciencedirect.com/journal/journal-of-phonetics
Source: Journal of Phonetics official website
Today's editor: Zhao Jun
Reviewed by: Xinde Xiaoman