Journal News | SSCI journal Journal of Phonetics, Volumes 93-95 (2022)

Followed by 60,000 scholars → 语言学心得, 2023-05-17

Journal of Phonetics

Volumes 93-95, 2022

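Journal of Phonetics (SSCI Q1; 2021 IF: 2.44) published 27 articles in Volumes 93-95 (2022). Please feel free to share! (This completes our coverage of 2022.) Volume 93 contains 6 articles, on topics including speech rate, onset consonant sequences, context, and Mandarin tones.
Volume 94 contains 12 articles, on topics including American English pitch accents, L2 contrastive studies, processing models of conversational turn-taking, and the phonetics of voice quality. Volume 95 contains 9 articles, on topics including homophones, the prosody of rhetorical questions in Standard Chinese, speech tempo, the influence of native-language tones on second language acquisition, phonology, and vocal accommodation in communication.
Previously featured: Journal News | SSCI journal Journal of Phonetics, Volumes 91-92 (2022)
Journal News | SSCI journal Journal of Phonetics, Volume 90 (2022)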

Contents


Volume 93

■ An investigation of functional relations between speech rate and phonetic variables, by Seung-Eun Kim, Sam Tilsen

■ The impact of phonotactic features on novel tone discrimination, by Jonathan Wright, Melissa Baese-Berk

■ Responses to time pressure on phrase-final melodies in varieties of Dutch and West Frisian, by Judith Hanssen, Jörg Peters, Carlos Gussenhoven

■ Language and cluster-specific effects in the timing of onset consonant sequences in seven languages, by Marianne Pouplier, Manfred Pastätter, Philip Hoole, Stefania Marin, Ioana Chitoran, Tomas O. Lentz, Alexei Kochetov

■ The influence of preceding speech and nonspeech contexts on Mandarin tone identification, by Hui Zhang, Hongwei Ding, Wai-Sum Lee

■ Effects of native language and habituation in phonetic accommodation, by Stephen J. Tobin


Volume 94

■ Prosodic phrasing mediates listeners' perception of temporal cues: Evidence from the Korean Accentual Phrase, by Jeremy Steffman, Sahyang Kim, Taehong Cho, Sun-Ah Jun

■ American English pitch accents in variation: Pushing the boundaries of mainstream American English-ToBI conventions, by Rachel Steindel Burdin, Nicole R. Holliday, Paul E. Reed

■ Constituent durations in English NNN compounds: A case of strategic speaker behavior? by Annika Schebesta, Gero Kunter

■ Who is Fu? – Perception of L2 sounds that are partially neutralized in L1, by Makiko Aoyagi, Yue Wang

■ The influence of expectations on tonal cues to prominence, by Christine T. Röhr, Stefan Baumann, Martine Grice

■ Spelling provides a precise (but sometimes misplaced) phonological target. Orthography and acoustic variability in second language word learning, by Pauline Welby, Elsa Spinelli, Audrey Bürki

■ Diachronic phonological asymmetries and the variable stability of synchronic contrast, by Sam Kirkham, Claire Nance

■ Sub-band cepstral distance as an alternative to formants: Quantitative evidence from a forensic comparison experiment, by Yuko Kinoshita, Takashi Osanai, Frantz Clermont

■ Vocal reaction times to speech offsets: Implications for processing models of conversational turn-taking, by Francisco Torreira, Sara Bögels

■ Phonetic and phonological cues to prediction: Neurophysiology of Danish stød, by Anna Hjortdal, Johan Frid, Mikael Roll

■ Theoretical achievements of phonetics in the 21st century: Phonetics of voice quality, by Marc Garellek

■ Classifying conversational entrainment of speech behavior: An expanded framework and review, by Camille J. Wynn, Stephanie A. Borrie


Volume 95

■ Homophone discrimination based on prior exposure, by Chelsea Sanker

■ Exposure to speech via foreign film and its effects on non-native vowel production and perception, by Amy E. Hutchinson

■ Analyzing time-varying spectral characteristics of speech with function-on-scalar regression, by Rasmus Puggaard-Rode

■ The prosodic marking of rhetorical questions in Standard Chinese, by Katharina Zahner-Ritter, Yiya Chen, Nicole Dehé, Bettina Braun

■ Measured and perceived speech tempo: Comparing canonical and surface articulation rates, by Leendert Plug, Robert Lennon, Rachel Smith

■ The production of English syllable-level timing patterns by bilingual English- and Spanish-speaking children with cochlear implants and their peers with normal hearing, by Mark Gibson, Ferenc Bunta, Charles Johnson, Miriam Huárriz

■ Native language experience with tones influences both phonetic and lexical processes when acquiring a second tonal language, by Eric Pelzl, Jiang Liu, Chunhong Qi

■ Advancements of phonetics in the 21st century: A critical appraisal of time and space in Articulatory Phonology, by Khalil Iskarous, Marianne Pouplier

■ Special issue: Vocal accommodation in speech communication, by Jennifer S. Pardo, Elisa Pellegrino, Volker Dellwo, Bernd Möbius

Abstracts

An investigation of functional relations between speech rate and phonetic variables

Seung-Eun Kim, Sam Tilsen, Department of Linguistics, Cornell University, 203 Morrill Hall, Ithaca, NY 14853, USA

Abstract  It is well known that speech rate is correlated with many phonetic variables. The current study aims to obtain a more precise characterization of how phonetic measures covary with speech rate. Specifically, we assess whether there is evidence for linear and/or non-linear relations with rate, and how those relations may differ between phrase boundaries. Productions of English non-restrictive (NRRCs) and restrictive relative clauses (RRCs) were collected using a method in which variation in speech rate is cued by the speed of motion of a visual stimulus. Articulatory and acoustic variables associated with phrase boundaries were analyzed; for each variable, Bayesian regression was used to obtain posterior parameter distributions for a set of generalized linear models. Analyses of posterior predictions showed that phonetic variables associated with a phrase boundary that follows the relative clause (post-RC boundary) were more susceptible to rate variation than those at a boundary that precedes the relative clause (pre-RC boundary). Phonetic variables at the post-RC boundary also showed evidence for non-linear relations with rate, which suggest floor or ceiling attenuation effects at extreme rates. On the other hand, substantial differences between syntactic contexts were found primarily at the pre-RC boundary. A high degree of participant-specificity was observed in F0-related variables.


Key words Speech rate; Prosody; Functional relations; Non-linearity; Attenuation effects; Bayesian analysis; Syntactic context


The impact of phonotactic features on novel tone discrimination

Jonathan Wright, Melissa Baese-Berk, University of Oregon, Department of Linguistics, 1290 University of Oregon, Eugene, OR 97403-1290, USA

Abstract Many studies have examined novel tone perception, but few have investigated the interaction between novel tone perception ability and phonotactic structure. We examined the discrimination of Thai low and mid tones by native Mandarin and native English participants across four syllable types (CCVV, CVV, VV, hums) to test the interaction between first language suprasegmental experience and phonotactic complexity on novel tone discrimination. We also tested the impact of unfamiliar consonant clusters and unfamiliar segments in the onset. Across syllable types, native Mandarin participants discriminated tones better than native English speakers. Further, discrimination ability was not impacted by phonotactic complexity. However, unfamiliar syllable structure impacted discrimination ability for native Mandarin participants in an unintuitive way. They discriminated tones significantly better in CCVV syllables. Unfamiliar segments in the onset, however, had a negative impact on tone discrimination. The presence of /ŋ/ onsets, which are illegal in English and allophonically permitted in Mandarin, significantly reduced tone discrimination accuracy. For native English participants, /ŋ/ onsets resulted in no discrimination between tones. These results suggest that the phonotactic structure of carrier words for tones interacts with L1 phonotactic experience in modulating novel tone perception ability.


Key words Novel tone perception; Non-native lexical tones; Phonotactic structure; Thai; Mandarin


Responses to time pressure on phrase-final melodies in varieties of Dutch and West Frisian

Judith Hanssen, Centre for Language Studies, Radboud Universiteit, Faculteit der Letteren, Postbus 9103, 6500 HD Nijmegen, Netherlands; Avans Hogeschool, 's Hertogenbosch, Netherlands

Jörg Peters, Institute for German Studies, Carl von Ossietzky University Oldenburg, Germany

Carlos Gussenhoven, Centre for Language Studies, Radboud Universiteit, Faculteit der Letteren, Postbus 9103, 6500 HD Nijmegen, Netherlands


Abstract In order to investigate the effect of time pressure on the execution of falling and falling-rising pitch movements on phrase-final syllables, we ran a production experiment with 119 speakers distributed over five regional varieties of West-Germanic spoken in the Netherlands and Standard Dutch. They realized nuclear Falls and Fall-Rises on four IP-final monosyllabic target words in which the duration of the sonorant portion of the syllable rhyme was varied by combining a short and a long vowel with a sonorant and obstruent coda consonant. The different contours were elicited with the help of a ‘Statement’ dialogue and a ‘Rhetorical question’ dialogue, respectively. Phonetic adjustments fell into three categories, target undershoot, increased f0 slopes, and durational easing, the latter either by f0 target retraction or sonorant rhyme lengthening. Talkers produced monosyllabic Fall-Rises quite generally, although avoidance in favour of a plain rising contour was evident in one-fifth of the cases of the shortest sonorant rhyme in the more prestigious varieties. Broadly, southwestern Zeelandic Dutch and northeastern Low Saxon differed most noticeably from each other in their treatment of the valley of the Fall-Rise, which Zeelandic maximally undershot and Low Saxon faithfully preserved. Urban Hollandic and Standard Dutch were similar, while West Frisian shares some features with Hollandic and some with Low Saxon.

Apart from these effects of geographic contiguity, the data argue for an interpretation of the various measures alleviating time pressure within the more general conception of speech as a compromise between the interests of the speaker and those of the hearer in the process of speech communication.


Key words Truncation; Compression; Durational easing; Target undershoot; Dutch; Frisian; Geographical cline


Language and cluster-specific effects in the timing of onset consonant sequences in seven languages

Marianne Pouplier, Institute of Phonetics and Speech Processing, LMU Munich, Germany

Manfred Pastätter, Institute of Phonetics and Speech Processing, LMU Munich, Germany; Department of Linguistics, University of Potsdam, Germany

Philip Hoole, Institute of Phonetics and Speech Processing, LMU Munich, Germany

Stefania Marin, Institute of Phonetics and Speech Processing, LMU Munich, Germany

Ioana Chitoran, CLILLAC-ARP, University of Paris, France

Tomas O. Lentz, Institute of Phonetics and Speech Processing, LMU Munich, Germany; Department of Communication and Cognition, Tilburg University, The Netherlands

Alexei Kochetov, Department of Linguistics, University of Toronto, Canada

Abstract In this paper, we draw on available data from previous experiments to explore cross-linguistic variation in articulatory overlap in CC onset clusters, taking into account the role of cluster composition. Our sample includes articulography recordings of eleven clusters for seven languages. We find that cross-linguistic variability is conditional on cluster composition. Previous suggestions that languages may have individual global articulatory timing profiles for consonant clusters in terms of an overall relatively lower or higher degree of overlap are not confirmed for our sample. All included languages converge on a relatively higher degree of overlap for some of the clusters, whereas only some of the languages additionally extend into the lower overlap range, particularly for stop-sonorant sequences. Manner and voicing are further identified as factors conditioning variation in consonantal overlap. Overall languages differ in their degree of overlap in multi-faceted ways, but the relative effects of cluster composition work in the same direction across languages.


Key words Articulatory timing; Overlap; Consonant clusters; Phonotactics; Sibilants


The influence of preceding speech and nonspeech contexts on Mandarin tone identification

Hui Zhang, School of Foreign Studies, Shandong University of Finance and Economics, Shungeng Road 42#, Shizhong District, Jinan, Shandong Province 250014, China; Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, 800 Dongchuan Rd, Minhang District, Shanghai 200240, China; Department of Linguistics and Translation, City University of Hong Kong, 83 Tat Chee Ave, Kowloon Tong, Hong Kong, China

Hongwei Ding, Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, 800 Dongchuan Rd, Minhang District, Shanghai 200240, China

Wai-Sum Lee, Department of Linguistics and Translation, City University of Hong Kong, 83 Tat Chee Ave, Kowloon Tong, Hong Kong, China

Abstract This study examines the effect of preceding tones on tone perception within Mandarin disyllabic utterances and the underlying mechanism that causes such an effect. Listeners were presented with a series of tone targets varying perceptually from Mandarin Tone 3 to Tone 4 following Tone 1, Tone 2, or Tone 4. The results showed that the targets were more likely to be categorized as Tone 3 following the context tones with high offset f0 (Tone 1 and Tone 2) than following those with low offset f0 (Tone 4). The effect of preceding tones compensated for the acoustic consequence of coarticulation because the context tone with high offset f0 produced Tone 4-like Tone 3 variant in production, but tended to elicit Tone 3 identification of this variant in perception. Moreover, we also observed an effect of nonspeech contexts that preserved only the f0 contours of speech contexts. However, the effect of nonspeech contexts was significantly smaller than that of speech contexts, and the difference was not caused by focal attention. Our findings lend evidence to a general auditory mechanism and point to future work clarifying the factors that modulate the magnitude of perceptual context effect.


Key words Mandarin tone perception; Perceptual compensation for the acoustic consequence of coarticulation; General auditory mechanism; Speech-specific mechanism; Focal attention


Effects of native language and habituation in phonetic accommodation

Stephen J. Tobin, Department of Linguistics, University of Michigan, Lorch Hall 440, 611 Tappan Street, Ann Arbor, MI 48109-1220, USA

Abstract Effects of vocal accommodation have been reported in a wide range of contexts, but they have typically been small. The absence of effects in some cases has proven perplexing. In the present investigation I present innovative methods for the representation of phonetic distance between phonetic tokens and the analysis of phonetic accommodation. I take a broad crosslinguistic perspective and report effects of linguistic background (L1) on patterns of phonetic convergence toward typical monolingual English voiceless stop voice-onset-times (VOTs).

I propose that patterns of accommodation in laryngeal-oral coordination, as instantiated by voiceless stop VOTs, will reflect general principles of motor coordination (preferences for stable/in-phase coordination, cf. Browman & Goldstein, 1988; Haken, Kelso, & Bunz, 1985). Thus, stable, near-zero VOTs (cf. Spanish) will be less likely to show convergence toward intermediate English VOT, whereas less stable, long VOTs (cf. Korean) will be more likely to converge. Monolingual English and bilingual (Korean-English, Spanish-English) participants completed word shadowing and reading tasks. Their vowel-normalized voiceless stop VOTs were submitted to two analyses which confirm the articulatory stability hypothesis and reveal group-specific changes in vocal accommodation over time. The first involves a general baseline-to-test comparison, while in the second, a trial-specific difference from baseline is used as a dependent measure.

The results offer new insights into the effects of language background on vocal accommodation, and the analytical approach offers a means to more cleanly isolate subtle effects of accommodation in speech among a multitude of competing factors.


Key words Vocal accommodation; Phonetic accommodation; Convergence; VOT; Articulatory coordination; Bilingualism; Timecourse


Prosodic phrasing mediates listeners' perception of temporal cues: Evidence from the Korean Accentual Phrase

Jeremy Steffman, Northwestern University, United States

Sahyang Kim, Hongik University, Republic of Korea

Taehong Cho, Hanyang University, Republic of Korea

Sun-Ah Jun, University of California, Los Angeles, United States

Abstract In two experiments we examine how listeners make reference to prosodic phrasing in their perception of temporally cued segmental contrasts. We test how the prosodic-structurally conditioned modulation of segmental cues (in domain-initial strengthening) translates into speech perception. We adopt the test case of stop contrasts in Seoul Korean (aspirated versus fortis), which are cued by vowel duration and voice onset time (VOT). The phrasing manipulation was carried out at the level of the Accentual Phrase (AP), a small phrase that is marked by intonational features. The AP was chosen because it was possible to create two prosodic phrasing contexts (AP-initial versus AP-medial) by manipulating only f0 before the target segment with the duration of contextual segments unchanged, controlling for temporal context effects. In Experiment 1, listeners shift their perception of a VOT continuum based on phrasing, in line with the domain-initial strengthening pattern of post-stop vowel lengthening, where AP-initial post-fortis vowels are lengthened. Experiment 2 shows that vowel duration is used as a cue to the contrast and that perceptual categorization of vowel duration itself is also mediated by contextual phrasing information. Results thus suggest that prosodic phrasing, signaled by intonation only, mediates perception of the segmental contrast, with temporal context controlled. We discuss these findings in terms of their implications for the role of phrasing in segmental perception and in processing.


Key words Speech perception; Prosody; Prosodic phrasing; Korean; Accentual phrase


American English pitch accents in variation: Pushing the boundaries of mainstream American English-ToBI conventions

Rachel Steindel Burdin, University of New Hampshire, Durham 03824, NH, USA

Nicole R. Holliday, Pomona College, Claremont, 91711, CA, USA

Paul E. Reed, University of Alabama, Tuscaloosa 35478, AL, USA

Abstract Linguists interested in intonation have long struggled to establish a maximally broad set of annotation conventions that function equally well across varieties of American English. The current study tests the advantages and limitations of the widely-used MAE-ToBI conventions, focusing on the H* and L+H* distinction, for three varieties of American English: African American English, Appalachian English, and Jewish English. Results of quantitative analysis of production data from 30 speakers of the three varieties finds major differences in rate of use of the H* and L+H* pitch accent as well as the phonetic realizations of these pitch accents, which may not be captured solely using the MAE-ToBI conventions. These differences appear not only between MAE-ToBI and the other three varieties, but also between the varieties themselves in unique ways that may shed light on the nature of sociolinguistic variation at the level of intonation, as well as the debated status of the distinction of H* vs. L+H* as a phonological or phonetic distinction. These findings provide further motivation for the development and use of annotation systems that explicitly consider sociolinguistic variation as well as phonetic parameters. Such systems will become even more essential as both sociolinguists and phoneticians expand intonational analysis beyond so-called “standard varieties” in order to arrive at a richer and more accurate picture of the intonational system of American English.


Key words Intonational variation; ToBI; Variationist sociolinguistics




Constituent durations in English NNN compounds: A case of strategic speaker behavior?

Annika Schebesta, Gero Kunter, Universität Siegen, Seminar für Anglistik, Adolf-Reichwein-Str. 2, 57068 Siegen, Germany

Abstract This paper investigates the effect of morphological embeddedness and lexical frequency on the duration of constituents in left- and right-branching NNN compounds from a corpus of spoken English (Boston University Radio Speech Corpus, Ostendorf et al., 1997). Theories assuming that the phonetic signal is not affected by the internal structure of multimorphemic words are opposed by empirical studies on the morpho-phonetic interface which provide evidence that the phonetic signal is sensitive to different morphological boundaries. The analysis of 465 NNN compounds reveals that morphological embeddedness alone does not have the expected effect on constituent durations, however, we detected a complex interplay of the morphological structure of NNN compounds and the two involved bigram frequencies. For instance, the duration of N2 in left-branching compounds is affected by the frequency of N2N3 even though these two constituents do not form a morphological unit in this type of NNN compound. This interplay may be interpreted as a listener-oriented strategy employed by the speaker in order to resolve potential conflicts between the frequency of adjacent constituents and the morphological structure: In such an instance, speakers appear to use acoustic duration to signal the branching direction of the triconstituent compound.


Key words English compounds; Phonetic reduction; Morphological embeddedness; Lexical frequency


Who is Fu? – Perception of L2 sounds that are partially neutralized in L1

Makiko Aoyagi, Dokkyo University, 1-1 Gakuencho, Soka, Saitama Pref., 340-0042, Japan

Yue Wang, Simon Fraser University, RCB, 8888 University Drive, Burnaby, BC V5A 1S6, Canada

Abstract This study investigates the interference of L1 phonology with L2 speech perception, focusing on the identification of the English fricatives /f/ and /h/ by Japanese listeners. In Japanese, the contrast of /f/ vs /h/ is lost only in the /_u/ environment by neutralization (/fu/ and /hu/ > [ɸu]). This makes food vs who’d extremely difficult for Japanese to discern while fee vs he poses no problem. Perceptual experiments demonstrated that while /f/ and /h/ were accurately identifiable when isolated from non-/u/ syllables [e.g. f(a) vs h(a)], those isolated from /_u/ were rather hard [f(u) vs h(u)] and became even harder when presented with the vowel [fu vs hu]. Such Cs alone [f(u)/h(u)] were easier than the whole CVs [fu/hu] from which those Cs were excised, contrary to our general expectations. A cross-splicing experiment further revealed it was not the acoustics of coarticulated fricatives but the presentation of /u/ that made the identification difficult. The basic phonetic process of f/h was debilitated in fu/hu, where the L1 neutralization applied, cancelling the contrast and the need for attunement, which was then transferred to L2. It was argued that the L1-L2 sound correspondence can be affected by the knowledge of L1 phonological processes.


Key words L2 speech perception; L1 phonology influence; Neutralization; English fricatives; Japanese listeners


The influence of expectations on tonal cues to prominence

Christine T. Röhr, Stefan Baumann, Martine Grice, University of Cologne, IfL Phonetik, Herbert-Lewin-Str. 6, 50931 Köln, Germany

Abstract Contexts such as “Guess what happened yesterday” lead to expectations as to how unusual and exciting the content of a following utterance will be. This paper investigates how speakers encode this pragmatic meaning in their productions and further evaluates the findings from the listeners’ perspective. Contexts in which the speaker is required to make information exciting for the listener lead to target words being made more prominent, with more frequent use of rising accents and larger rising tonal onglides than information placed in a neutral or ordinary context. Conversely, ordinary information is made less prominent by means of fewer and smaller rising onglides as well as more and larger falling onglides. Individual speakers convey this information in different but systematically compatible ways, supporting a view of intonational phonology that integrates qualitative pitch accent categories and quantitative phonetic parameters. Listeners’ ratings of the contexts and the production stimuli confirm the interpretation of the intended meanings and the role of the tonal onglide: A large rising onglide (as in L+H*) on the target word clearly leads to the interpretation of unusual/exciting information, whereas a small rising onglide (as in H*) or falling onglides (as in H+!H* and H+L*) do not.


Key words Prosodic prominence; Pitch accent; Tonal onglide; Expectation; Context; Production; Perception


Spelling provides a precise (but sometimes misplaced) phonological target. Orthography and acoustic variability in second language word learning

Pauline Welby, Aix Marseille Univ, CNRS, LPL, Aix-en-Provence, France; University of New Caledonia, Nouméa, New Caledonia

Elsa Spinelli, Univ. Grenoble Alpes, CNRS, LPNC, 38000 Grenoble, France

Audrey Bürki, Department of Linguistics, University of Potsdam, Potsdam, Germany

Abstract L1 French participants learned novel L2 English words over two days of learning sessions, with half of the words presented with their orthographic forms (Audio-Ortho) and half without (Audio only). One group heard the words pronounced by a single talker, while another group heard them pronounced by multiple talkers. On the third day, they completed a variety of tasks to evaluate their learning. Our results show a robust influence of orthography, with faster response times in both production (Picture naming) and recognition (Picture mapping) tasks for words learned in the Audio-Ortho condition. Moreover, formant analyses of the Picture naming responses show that orthographic input pulls pronunciations of English novel words towards a non-native (French) phonological target. Words learned with their orthographic forms were pronounced more precisely (with smaller Dispersion Scores), but were misplaced in the vowel space (as reflected by smaller Euclidian distances with respect to French vowels). Finally, we found only limited evidence of an effect of talker-based acoustic variability: novel words learned with multiple talkers showed faster responses times in the Picture naming task, but only in the Audio-only condition, which suggests that orthographic information may have overwhelmed any advantage of talker-based acoustic variability.


Key words Orthography; Talker variability; Second language learning; Word learning; Phonological representations; Speech production; Speech perception


Diachronic phonological asymmetries and the variable stability of synchronic contrast

Sam Kirkham, Claire Nance, Department of Linguistics and English Language, County South, Lancaster University, Lancaster LA1 4YL, United Kingdom

Abstract This article aims to understand the development of diachronic asymmetries in phonological systems by evaluating the variable stability of synchronic contrasts. We focus on sonorant systems involving secondary palatalisation, grounded in the claim that palatalised laterals are more common than palatalised rhotics cross-linguistically. Our analysis reports acoustic and articulatory data on Scottish Gaelic, a Celtic language with a large sonorant inventory contrasting palatalised, plain and velarised phonemes across laterals, nasals and rhotics. We summarise high-dimensional dynamic characteristics of the acoustic spectrum and midsagittal tongue shape using a two-stage data reduction process and use these coefficients as inputs for training a Support Vector Machine. This trained model classifies unseen data in terms of its phonemic identity, which reveals that rhotics are classified best word-initially and worst word-finally, with nasals always classified better than laterals. We find that dynamic information substantially improves acoustic classification, but only improves articulatory classification for some sonorants. We propose that the variable synchronic stability of palatalisation contrasts complicates potential trajectories of diachronic change in Gaelic.


Key words Sound change; Palatalisation; Scottish Gaelic; Sonorants; Synchronic variation; Diachronic change; Ultrasound


Sub-band cepstral distance as an alternative to formants: Quantitative evidence from a forensic comparison experiment

Yuko Kinoshita, Australian National University, Canberra, ACT 2601, Australia

Takashi Osanai, National Research Institute of Police Science, 6-3-1, Kashiwanoha, Kashiwa-shi, Chiba 277-0882, Japan

Frantz Clermont, Australian National University, Canberra, ACT 2601, Australia; J.P. French Associates Forensic Speech and Acoustic Laboratory, 86 The Mount, York YO24 1AR, United Kingdom

Abstract This paper demonstrates the potential of the sub-band parametric cepstral distance (PCD) formulated by Clermont and Mokhtari (1994), as an alternative to formants in acoustic phonetic research. As a cepstrum-based measure, the PCD is automatically and reliably extracted from the speech signal. By contrast, formants are time-consuming and often difficult to estimate, a well-known bottleneck for studies based on large-scale datasets. The PCD measure gives flexibility in selecting the frequency limits of any sub-band of interest within the available full band. We suggest that, if sub-band selection were guided by the acoustic–phonetic theory of speech production, PCD analysis could facilitate phonetically meaningful cepstral comparisons without relying directly on formants. We evaluate this idea by exploiting the PCD properties in the context of forensic voice comparison as an application example. The cepstral data were obtained from the vowels uttered by 306 male Japanese speakers. Similar patterns of results were observed using formants and sub-band PCDs, the latter yielding better performance. This suggests that sub-band PCDs are able to capture the spectral characteristics that we normally quantify through formants, but with better reliability and efficiency. The PCD results reported here are encouraging for other types of acoustic phonetic studies in which comparisons of spectral characteristics are required.


Key words Forensic voice comparison; Sub-band; Parametric cepstral distance; Formants; LPCC; Likelihood ratio; Vowel


Vocal reaction times to speech offsets: Implications for processing models of conversational turn-taking

Francisco Torreira, Department of Linguistics, McGill University, 1085 Dr. Penfield Ave, H3A 1A7 Montreal, Quebec, Canada

Sara Bögels, Donders Institute for Brain, Cognition, and Behaviour, Radboud University Nijmegen, The Netherlands; Department of Communication and Cognition, Tilburg University, The Netherlands

Abstract Everyday conversation is characterized by a rapid alternation of turns among successive speakers. We investigate vocal reaction times to speech offsets in order to shed light on the limits of reactive behavior in conversational turn-taking. Twenty-three speakers of Dutch produced a prepared response ([ja], ‘yes’) as fast as possible in response to (a) the onset of a pure tone preceded by a variable amount of silence, and (b) the offset of several types of speech-like auditory stimuli varying in duration, prosodic characteristics, and speech rate. Reactions to the offset of stimuli lacking final prosodic cues were significantly longer than reactions to stimulus onsets (283 vs. 215 ms on average), and were rare below 200 ms (3%). Speaking latencies decreased as prosodic cues appeared further away from the stimulus end. Slowing down the speech rate produced an entrainment effect (i.e. slower responses) for stimuli lacking prosodic cues vs. a facilitatory effect (i.e. faster responses) when prosodic cues were present. These findings suggest that smooth turn transitions taking less than 200 ms are unlikely to involve reactions to silence at the end of a turn, but that they can be achieved when turn-final prosodic cues are available.


Key words Reaction times; Speech production; Conversational turn-taking; Prosody; Entrainment; Psycholinguistics


Phonetic and phonological cues to prediction: Neurophysiology of Danish stød

Anna Hjortdal, Centre for Languages and Literature, Lund University, Lund, Sweden

Johan Frid, Lund University Humanities Lab, Lund University, Lund, Sweden

Mikael Roll, Centre for Languages and Literature, Lund University, Lund, Sweden

Abstract A corpus study and a combined behavioural and neurophysiological study tested how phonetic and phonological features of the Danish creaky voice feature ‘stød’ influence predictive processing. Being associated with certain word endings, stød and its modal voice counterpart non-stød can cue upcoming speech. Stød has two phases. The first shows phonetic differences in pitch while the second, characterised by creaky voice, has been interpreted as the phonological stød proper. Participants listened to nouns cross-spliced between the two stød phases and between stem and a following singular or plural suffix. Suffixes invalidly cued by phonological stød or non-stød showed longer response times and N400 and P600 effects, the former suggesting that stød/non-stød are becoming grammaticalized as singular and plural morphemes. Even subtle phonetic differences preceding stød proper increased response times, but N400 and P600 amplitudes were not significantly increased. Results suggest predictive use of both phonetic and phonological features, but that phonological stød cues override phonetic cues. The corpus study indicated that word beginnings with stød are less frequent and have fewer possible continuations than non-stød. Stød yielded increased negativity 280–430 ms after stød proper onset, which might be interpreted as a pre-activation negativity for the more predictively useful cue.


Key words Prosody; Stød; Danish; Prediction; Brain; ERP


Theoretical achievements of phonetics in the 21st century: Phonetics of voice quality

Marc Garellek, Department of Linguistics, University of California San Diego, 9500 Gilman Drive #0108, La Jolla, CA 92093-0108, USA

Abstract Twenty years after the publication of a special issue in this journal on non-modal phonation (JPhon 2001: 29(4)), the phonetic study of voice quality has shown impressive progress. Here I focus on what we have learnt over these years about the linguistic sources of voice quality modulation. I stress how voice quality has a role to play in all of speech: among its many functions, the voice is involved in the articulation of all sounds, and voice quality is worth investigating as much for “modal” consonants and vowels as for contrastive phonation type. The voice also encodes structure at many prosodic levels: sub-lexical, lexical, and post-lexical. I further highlight some of the important technological developments and refinement of various voice quality models that have led to progress in the phonetic study of voice quality. Reviewing all of the above, one can only conclude that human voice has a central role to play in the phonetician’s pursuit towards understanding spoken language.


Key words Voice; Voice quality; Phonation; Laryngeal; Glottal


Classifying conversational entrainment of speech behavior: An expanded framework and review

Camille J. Wynn, Stephanie A. Borrie, Department of Communicative Disorders and Deaf Education, Utah State University, Logan, Utah, USA

Abstract Conversational entrainment, also known as alignment, accommodation, convergence, and coordination, is broadly defined as similarity of communicative behavior between interlocutors. Within current literature, specific terminology, definitions, and measurement approaches are wide-ranging and highly variable. As new ways of measuring and quantifying entrainment are developed and research in this area continues to expand, consistent terminology and a means of organizing entrainment research is critical, affording cohesion and assimilation of knowledge. While systems for categorizing entrainment do exist, these efforts are not entirely comprehensive in that specific measurement approaches often used within entrainment literature cannot be categorized under existing frameworks. The purpose of this review article is twofold: First, we propose an expanded version of an earlier framework which allows for the categorization of all measures of entrainment of speech behaviors and includes refinements, additions, and explanations aimed at improving its clarity and accessibility. Second, we present an extensive literature review, demonstrating how current literature fits into the given framework. We conclude with a discussion of how the proposed entrainment framework presented herein can be used to unify efforts in entrainment research.


Key words Entrainment; Convergence; Alignment; Accommodation; Acoustic-prosodic

Homophone discrimination based on prior exposure

Chelsea Sanker, Stanford University, Margaret Jacks Hall, Building 460, Stanford, CA 94305, USA

Abstract This article presents three studies testing the potential role of word-specific acoustic details in perception, based on how several factors impact listeners’ accuracy in identifying homophones. Experiment 1 tests how prior exposure to particular homophones said by the same talker impacts identifications; listeners could discriminate between homophone mates with above chance accuracy after exposure to disambiguated tokens of these words produced by the talker, but not when prior exposure did not include the test words. Experiment 2 tests whether having the same talker in exposure and testing is crucial; accuracy is above chance even when the prior exposure to the homophone mates was from a different talker. Experiment 3 tests whether accuracy in homophone identification might be driven by broad associations between meaning and acoustic form rather than the details of particular words; there is no difference between exposure to the particular homophone mates and exposure to semantically similar words. Just having strong positive or negative emotional valence seems to result in higher accuracy for how homophone mates are identified. These results suggest that listeners can make use of semantically-driven acoustic differences between homophone mates when recent exposure makes these details salient or when the form-meaning associations are already strong. This link to acoustic details can be explained via associations with broad aspects of meaning, rather than depending on word-specific phonetic representations.


Key words Homophones, Perception, Word-specific phonetics, Talker-specific learning, Emotional valence


Exposure to speech via foreign film and its effects on non-native vowel production and perception

Amy E. Hutchinson, Olga Dmitrieva, Purdue University, West Lafayette, IN 47907, USA

Abstract This research presents two experiments that examine the effect of exposure to second language speech via foreign film on non-native speech production and perception. Experiment 1a investigated whether exposure to French film aided in the ability of naïve monolingual American English speakers to shadow French words containing high rounded vowels /y/ and /u/ as tested via acoustic analyses and native French listener perceptual judgements. Experiment 1b was a crosslinguistic perceptual assimilation task completed by the same participants, designed to explore the perception of rounded vowels /y/ and /u/, before and after film exposure. The results of Experiment 1a indicated that a single session of exposure to French film had a small, but significant, effect on shadowing of French /y/, which was also perceptible to native French listeners. Shadowing of /u/, however, was not significantly affected by exposure. Experiment 1b showed that participants did not alter the patterns of perceptual assimilation between the two French vowels and native English vowels following film exposure. We conclude that exposure to non-native speech via foreign film can affect some aspects of non-native speech development and hypothesize that further sessions may compound these initial benefits, especially in those who are already learning a second language.


Key words Non-native speech development, Speech shadowing, Foreign film, Vowels, French, Production, Perception


Analyzing time-varying spectral characteristics of speech with function-on-scalar regression

Rasmus Puggaard-Rode, Leiden University Centre for Linguistics, Netherlands

Abstract The acoustic characteristics of noise from fricatives and stop releases are difficult to analyze. The spectral characteristics of such noise are multi-dimensional, and popular methods for analyzing them typically rely on reducing this complex information to one or a few discrete numbers, such as spectral moments or coefficients of discrete cosine transformations. In this paper, I propose using function-on-scalar regression models as a method for analyzing and mass-comparing spectra with minimal reduction of the complexity in the signal. The method is further useful for analyzing how spectra change as a function of time. The usefulness of this method is demonstrated with a corpus analysis of Danish aspirated stop releases, using the DanPASS corpus. The analysis finds that /t/ releases are invariably affricated; /k/ releases are highly affected by coarticulatory context; and /p/ releases are almost always dominated by aspiration in the latter half of the release, but are affricated in the first half in certain contexts.


Key words Functional data analysis, Function-on-scalar regression, Stop releases, Affrication, Danish, Corpus phonetics


The prosodic marking of rhetorical questions in Standard Chinese

Katharina Zahner-Ritter, University of Trier, Phonetics, Universitätsring 15, 54296 Trier, Germany, University of Konstanz, Department of Linguistics, P.O. Box 186, 78457 Konstanz, Germany

Yiya Chen, Leiden University Center for Linguistics, Postbus 9515, 2300 RA Leiden, the Netherlands

Nicole Dehé, University of Konstanz, Department of Linguistics, P.O. Box 186, 78457 Konstanz, Germany

Bettina Braun, University of Konstanz, Department of Linguistics, P.O. Box 186, 78457 Konstanz, Germany

Abstract The present study investigates the prosody of information-seeking (ISQs) and rhetorical questions (RQs) in Standard Chinese, in polar and wh-questions. Like in other languages, ISQs and RQs in Standard Chinese can have the same surface structure, allowing for a direct prosodic comparison between illocution types (ISQ vs RQ). Since Standard Chinese has lexical tone, the use of f0 as a cue to illocution type may be restricted. We investigate the prosodic differences between ISQs and RQs as well as the interplay of prosodic cues to RQs. In terms of f0, results showed that RQs were lower in f0, with the f0 range on the first word being expanded followed by f0 compression. RQs were further longer in duration and more often realized with non-modal voice quality (glottalized voice) as compared to ISQs. These prosodic cues were largely manipulated in tandem (illocutionary pairs with larger durational differences also showed larger differences in mean f0; voice quality, in turn, seemed to be an additional cue). We suggest three possible explanations (assertive force, focus, speaker attitude) that unite the present findings on RQs in Standard Chinese with the findings on RQs in other, non-tonal languages.


Key words Lexical tone, Prosody, Intonation, Rhetorical questions, Standard Chinese


Measured and perceived speech tempo: Comparing canonical and surface articulation rates

Leendert Plug, University of Leeds, United Kingdom

Robert Lennon, University of Glasgow, United Kingdom

Rachel Smith, Lancaster University, United Kingdom

Abstract Studies that quantify speech tempo tend to use one of various available rate measures. The relationship between these measures and perceived tempo as elicited through listening experiments remains poorly understood. This study furthers our understanding of the relationship between measured articulation rates and perceived speech tempo, and the impact of syllable and phone deletions on speech tempo perception. We follow previous work in using stimuli from a corpus of unscripted speech, and in sampling stimuli in distinct ‘global tempo’ ranges. Within our stimulus sets, the differences between canonical and surface rate measurements are directly due to syllable or phone deletions. Our results for syllable rates suggest that listeners use both canonical and surface rates to estimate speech tempo: that is, deletions do not have a consistent effect on perceived tempo. Our results for phone rates suggest that surface phone rate also influences judgements, but canonical phone rate does not. Our results also confirm previously-reported effects of f0 and intensity on speech tempo perception, plus an effect of stimulus duration, but no effect of listeners’ own tempo production tendencies.


Key words Speech tempo, Articulation rate, Deletion, Perception, English
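The distinction between canonical and surface rates in the abstract above can be illustrated with a minimal sketch. It assumes the common definition of articulation rate as units per second of speech, with the canonical count taken from the citation form and the surface count from what was actually produced. The function name and the example figures are hypothetical and are not taken from the paper.

```python
def articulation_rate(unit_count: int, duration_s: float) -> float:
    """Return units (syllables or phones) per second of articulation."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return unit_count / duration_s


# Hypothetical stimulus: 12 syllables in the citation form, of which
# 10 are actually realised (2 deleted), spoken in 2.0 seconds.
canonical_syllables = 12
surface_syllables = 10
duration_s = 2.0

canonical_rate = articulation_rate(canonical_syllables, duration_s)
surface_rate = articulation_rate(surface_syllables, duration_s)

# The gap between the two rates is due entirely to the deletions.
rate_gap = canonical_rate - surface_rate
```

With no deletions the two measures coincide, so any difference between them in a stimulus set of this kind reflects deleted syllables or phones.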


The production of English syllable-level timing patterns by bilingual English- and Spanish-speaking children with cochlear implants and their peers with normal hearing

Mark Gibson, Universidad de Navarra, Spain

Ferenc Bunta, University of Houston, United States

Charles Johnson, University of California Santa Barbara, United States

Miriam Huárriz, Universidad de Navarra, Spain

Abstract We examined the timing parameters in syllables of English containing word initial singleton sonorants (/l/ and /ɹ/) and stop+sonorant clusters by bilingual English- and Spanish-speaking children with cochlear implants (CImp group) and their cohorts with normal hearing (NH group). The timing parameters included: voice onset time (henceforth, VOT), vowel duration following word initial singleton consonants and vowels following word initial stop+sonorant clusters as well as lateral and rhotic duration in word initial position and in stop+sonorant clusters. Our motivation for the current study was to address whether hearing loss affects the production of the timing parameters of syllables and whether language interference in bilingual productions is modulated by hearing loss. The results of the English production tasks show effects of onset type (singleton, C, vs complex onsets, CC) on VOT, whereby VOT for word initial singleton stops was shorter than VOT in stop+sonorant clusters. To the contrary vowel duration tended to extend temporally as more consonants are added to the onset. With regard to lateral and rhotic duration in word initial singleton and complex onsets, both groups (CImp and NH) showed the typical compression effect of /l/ in complex onsets characteristic of the timing pattern for sonorants in English complex syllables, though no shortening of the rhotic was found for the NH group. Combined results suggest little effect of hearing loss on the production of the timing parameters of English syllables, and virtually no effects across groups that can be attributed to language interference.


Key words Syllable timing, Bilingual phonological acquisition, English/Spanish, Cochlear implant users


Native language experience with tones influences both phonetic and lexical processes when acquiring a second tonal language

Eric Pelzl, The Pennsylvania State University, University Park, PA, USA

Jiang Liu, University of South Carolina, Columbia, SC, USA

Chunhong Qi, Yunnan Normal University, Yunnan, China

Abstract Second language acquisition of lexical tones requires that a learner form appropriate tone categories and bind those categories to lexical representations for fluent word recognition. Research has shown that second language (L2) learners with no previous tone language experience can become highly accurate at identification of tones in isolation, but, even at advanced levels, have difficulty using tones to differentiate real words from nonwords. The present research considers the same skills in L2 learners who do have previous tone experience. Using largely the same tasks and stimuli previously used with English speakers in Pelzl, Lau, Guo, & DeKeyser (2021a) (“PLGD21”), we examined the tone identification and (tone) word recognition abilities of thirty-three Vietnamese speakers who had achieved advanced L2 proficiency in Mandarin. Results indicate that Vietnamese speakers experience different tone identification difficulties than English speakers, presumably due to interference from their native language tone categories. However, unlike English speakers in previous studies, Vietnamese speakers did not display differences in lexical decision accuracy for vowel and tone nonwords. These results provide evidence of the complexities of cross-linguistic influence, illustrating that the influence of native language tones can be illuminated by considering perception and acquisition at multiple levels.


Key words Vietnamese, Mandarin, Tones, Cross-linguistic influence, Speech learning, Second language acquisition


Advancements of phonetics in the 21st century: A critical appraisal of time and space in Articulatory Phonology

Khalil Iskarous, Department of Linguistics, University of Southern California, Los Angeles, USA

Marianne Pouplier, Institut für Phonetik und Sprachverarbeitung, LMU, Munich, Germany

Abstract Articulatory Phonology and Task Dynamics model spoken language mathematically based on dynamical systems, expressing the view that speaking is similar in nature to many other biological phenomena that have been described in this way. In this paper, we present a critical appraisal of developments in Articulatory Phonology and Task Dynamics in the 21st century, illustrating how this point of view addresses some fundamental questions in phonetics. Our paper identifies some of the key areas in which progress has been made, and others in which more progress is warranted. We thereby touch on recent work contributing to the empirical underpinning of some assumptions of the Task Dynamic model, then consider recent proposals of how Articulatory Phonology can deal with linguistically structured macro- and microscopic variation in constriction gestures induced by syllabic and phrasal prosodic structure. Part and parcel of these developments is the integration of the dynamical expression of phonological contrast into a model of utterance planning, and the structuring of the timeflow of speech by prosody. We finish our overview with a discussion on how a stronger link between articulation and acoustics could further enhance the dynamical approach to spoken language.


Key words Articulatory Phonology, Task Dynamics, Dynamical systems, π-gesture, Syllable, Prosody, Planning


Special issue: Vocal accommodation in speech communication

Jennifer S. Pardo, Montclair State University, United States

Elisa Pellegrino, University of Zurich, Switzerland

Volker Dellwo, University of Zurich, Switzerland

Bernd Möbius, Saarland University, Germany

Abstract This introductory article for the Special Issue on Vocal Accommodation in Speech Communication provides an overview of prevailing theories of vocal accommodation and summarizes the ten papers in the collection. Communication Accommodation Theory focusses on social factors evoking accent convergence or divergence, while the Interactive Alignment Model proposes cognitive integration of perception and production as an automatic priming mechanism driving convergence in language production. Recent research including most of the papers in this Special Issue indicates that a hybrid or interactive synergy model provides a more comprehensive account of observed patterns of phonetic convergence than purely automatic mechanisms. Some of the fundamental questions that this special collection aimed to cover concerned (1) the nature of vocal accommodation in terms of underlying mechanisms and social functions in human–human and human–computer interaction; (2) the effect of task-specific and talker-specific characteristics (gender, age, personality, linguistic and cultural background, role in interaction) on degree and direction of convergence towards human and computer interlocutors; (3) integration of articulatory, perceptual, neurocognitive, and/or multimodal data to the analysis of acoustic accommodation in interactive and non-interactive speech tasks; and (4) the contribution of short/long-term accommodation in human–human and human–computer interactions to the diffusion of linguistic innovation and ultimately language variation and change.


Key words Vocal accommodation, Phonetic convergence, Speech production


About the journal

The Journal of Phonetics publishes papers of an experimental or theoretical nature that deal with phonetic aspects of language and linguistic communication processes. Papers dealing with technological and/or pathological topics, or papers of an interdisciplinary nature are also suitable, provided that linguistic-phonetic principles underlie the work reported. Regular articles, review articles, and letters to the editor are published. Themed issues are also published, devoted entirely to a specific subject of interest within the field of phonetics.


Official website: https://www.sciencedirect.com/journal/journal-of-phonetics/about/aims-and-scope

Source: Journal of Phonetics official website


Editor: 钊君

Reviewer: 心得小蔓
