Duration, vowel quality, and the rhythmic pattern of English

Anya Lunden

doi:10.5334/labphon.37

1. Introduction

English is well known for correlating duration and vowel quality with stress: Unstressed vowels are both notably shorter and phonologically reduced compared with stressed vowels (e.g., Fry, 1958; Beckman & Edwards, 1994). Word-final vowels in English, however, show greater duration (Oller, 1973; Klatt, 1976; Wightman et al., 1992) and less vowel reduction (Hammond, 1999; Flemming & Johnson, 2007) than non-final unstressed vowels. Hammond (1999) proposes a rule of final vowel tensing in English, to account for the fact that lax vowels, apart from [ə], cannot occur in this position. Flemming and Johnson (2007) found that word-final [ə] shows a significantly different F1 and F2 from non-final reduced vowels (they argue that [ə] should be used only for word-final reduced vowels and that [ɨ] should be used non-finally).

Phonetic differences, such as length, inherent to particular positions have been linked to phonological patterns. Final lengthening occurs to varying degrees at different prosodic boundaries (e.g., word, phrase, utterance), and is known to be stronger at higher prosodic levels (Wightman et al., 1992). Nakai (2013) connects utterance-final lengthening to final shortening. Barnes (2006) discusses phrase-final lengthening as the source of phrase-final syllable resistance to assimilation/reduction in many languages, and Lunden (2011, 2013) links word-level final lengthening to the phonological weight of word-final syllables. This work expands the evidence for the proposal in Lunden (to appear) that final lengthening can also be linked to word-final stress lapse in binary stress languages.

A well-known asymmetry among binary stress languages is that it is not uncommon for a stress lapse (i.e., two adjacent unstressed syllables) to be tolerated word-finally but it almost never occurs word-initially (Winnebago [Hale & White Eagle, 1980] is the one known case). Final stress lapse occurs in English in words like América, aspáragus, cf. the alternating rhythm without stress lapse in Màssachúsetts, àvocádo, where secondary stress to prevent stress lapse is mandatory. Lunden (to appear) proposes that stress lapse word-finally may still involve a perceptual alternation of prominence due to the inherent phonetic duration of a word-final syllable. She presents evidence from perceptual studies of alternating patterns involving stressed and unstressed syllables which found that a final stress lapse was much more likely to be mistaken for an alternating rhythm when the final vowel had undergone final lengthening, despite still having the intensity and pitch of other unstressed syllables. Further supporting the hypothesis that the phonetic duration and vowel quality inherent to a final syllable facilitates final stress lapse, Lunden shows that there is a significant correlation between languages that permit final stress lapse and those that use duration as a stress correlate (using Lunden and Kalivoda’s [2016] stress correlate database). Lunden (to appear) largely leaves aside the issue of vowel quality of an unstressed final syllable, while showing that it clearly plays an important role in the perceived strength of word-final unstressed syllables. The present work explores the quality of reduced vowels under final stress lapse compared to those that are not involved in a stress lapse.

Phonetic variation of [ə] has been noted in multiple languages and has been shown in English to be due to (i) influence from surrounding consonants (Flemming, 2007), (ii) vowel-to-vowel co-articulation (e.g., Grosvald, 2009), and (iii) final vs. non-final position (Flemming & Johnson, 2007). There appears to be no previous work that connects the degree of vowel reduction to the presence or absence of stress lapse.

The experiments presented here explore the differences in vowel duration and quality of unstressed syllables that are involved in final lapse as compared to unstressed non-final and word-final vowels that are not. The first study, in Section 2, reports on the elicitation of four-syllable nonce words where pronunciations with primary stress on the antepenult are compared to those with primary stress on the penult. In particular, the F1 and F2 of the pronunciation of /ɑ/ in word-medial unstressed syllables are compared, as are those of the word-final unstressed syllables. Data from one of the subjects in the production study was subsequently used to make stimuli for the perception study presented in Section 3, which explores whether the differences found between unstressed syllables in the production study influence the perception of rhythm.

2. Production experiment

2.1 Methods

2.1.1 Participants

The participants were 24 native English speakers, all undergraduates at the College of William & Mary in Virginia (male = 5, aged 18–21, average age = 19.0). Participants received participation pool credit.

2.1.2 Stimuli

Four-syllable nonce words were constructed with CV syllables consisting of onsets [b], [ɡ], [f], [s] (each unique within the word) and the vowel /ɑ/. The 24 permutations were duplicated, with one set marked for antepenultimate stress (e.g., baFAgasa) and the other marked for penultimate (e.g., bafaGAsa).
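
The construction of the two word sets can be sketched in a few lines of Python. This is only an illustration of the design described above, not the procedure actually used to prepare the stimuli; the function name `nonce_words` and the orthographic stand-ins (`g` for [ɡ], `a` for /ɑ/, capitals for the stressed syllable) are assumptions made for the example.

```python
from itertools import permutations

ONSETS = ["b", "g", "f", "s"]  # orthographic stand-ins for [b], [ɡ], [f], [s]
VOWEL = "a"                    # orthographic stand-in for /ɑ/


def nonce_words(stressed_index):
    """Return every onset ordering as a four-syllable CV word.

    The syllable at `stressed_index` (0-based) is capitalized to mark
    the intended primary stress, as in the slides shown to subjects.
    """
    words = []
    for order in permutations(ONSETS):
        syllables = [onset + VOWEL for onset in order]
        syllables[stressed_index] = syllables[stressed_index].upper()
        words.append("".join(syllables))
    return words


antepenult_set = nonce_words(1)  # stress on the second of four syllables
penult_set = nonce_words(2)      # stress on the third of four syllables
```

Because each onset is used exactly once per word, the four onsets yield 4! = 24 orderings, and marking each ordering for both stress patterns gives the 48 test words.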

Sentence frames were 24 question/answer pairs in a standard form of “Which nonce-N did PossDet N V?” and answer “Det N past-V the nonce-N that past-non-passive-relative-clause.” The question/answer form makes it less likely for the nonce word to be pronounced with special emphasis. The relative clause prevents the nonce word from being phrase final in order to limit final lengthening to the word-level. Each question/answer pair was used once within each of the two word sets. Nonce words were randomized into the question/answer frames.

Example question/answer pair:

Which baGAsafa did his sister polish?

His sister polished the baGAsafa that was particularly dirty.

2.1.3 Procedure

Recordings were done in a sound-attenuated booth. Subjects were fitted with a Shure WH30 head-mounted microphone connected to a Tascam-DR100 recorder. Stimuli were presented via pdf slides. The task was self-paced, with subjects instructed to read each question/answer pair as fluently as possible and then proceed to the next slide.

There were two recording sessions per subject, one for each set of 24 stimuli. Prior to entering the booth, subjects were trained to pronounce four-syllable nonce words with the target stress pattern, first in citation form and then in question/answer pairs similar to those used in the study. When being trained to produce antepenultimate stress, the words America and asparagus were modeled, including the fact that they are not naturally pronounced with a secondary stress on the final syllable. When being trained to produce penultimate stress words, the words Massachusetts and California were modeled. Secondary stress was not mentioned in reference to the latter set, since it will naturally fall on the first syllable for native English speakers. When subjects could fluently read the question/answer pairs without a pause before or after the nonce word and with the correct stress pattern on the nonce word they then proceeded to the sound booth for one-half of the task. When completed, they were re-trained in the second stress pattern, and practiced as they had with the first. One half of the subjects were trained and recorded with the antepenultimate pattern first, the other half with the penultimate pattern first.

2.1.4 Measurement

The vowels of the nonce words in the resulting sound files were delineated in Praat (Boersma & Weenink, 2016). The nonce word in the answer (of the question/answer) was always used. If the subject read the same question/answer pair more than one time, the final reading was used. No partial words were delineated; if there was an issue that made a syllable unusable the entire word was not delineated.

The data from 9 subjects had to be completely discarded. Seven of these either completely or overwhelmingly (at least 20 out of 24 words) pronounced the nonce words marked for antepenultimate stress with penultimate stress. One subject pronounced the stressed syllable with a schwa-quality vowel, and another regularly paused after every nonce word. For the 15 remaining subjects, the test words were delineated only if they were pronounced fluently with the target stress pattern, with an [ɑ] in the stressed syllable. Of the 720 words, 214 were not measured, overwhelmingly due to pauses before, during, or after the test word. An additional 11 words were pronounced correctly but were not measured, either because an unstressed vowel was so reduced as to not include measurable formants, or because an intervocalic [ɡ] was lenited so severely that the vowels on either side could not be reliably delineated. At this stage the dataset included 495 words, 196 of which had antepenultimate stress.

Researchers delineating the vowels noticed that several speakers seemed to pronounce the nonce words having antepenultimate stress with secondary stress on the final syllable. In order to make a numerical comparison, all correctly/fluently pronounced words were delineated and, subsequently, the average intensity of each vowel was measured via a Praat script (Hirst, 2009). The intensity of each word-final vowel in the antepenultimate stress pattern was taken as a percentage of the intensity of the antepenultimate vowel in the same word. The average percentage for most subjects formed a group ranging between 89% and 96%, but 4 subjects’ averages were notably higher than the main group, falling between 100% and 103%. The data from these 4 subjects was therefore discarded, as it seems likely that they put some degree of secondary stress on the final syllable. The data from the remaining 11 subjects consisted of 351 words (=1404 vowels); 131 of which had antepenultimate stress (with a minimum of 6 antepenultimate stress words per subject).
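
The intensity comparison amounts to a per-word ratio averaged within each subject. A minimal sketch follows, assuming each antepenultimate-stress word is represented by a hypothetical pair of mean intensities in dB (final vowel, stressed antepenultimate vowel); the function names and the sample values in the usage note are illustrative only and are not data from the study.

```python
from statistics import mean


def final_to_stressed_percent(final_db, stressed_db):
    """Final-vowel intensity as a percentage of the stressed vowel's intensity."""
    return 100.0 * final_db / stressed_db


def subject_average(words):
    """Average the per-word percentages for one subject.

    `words` is a list of (final_db, stressed_db) pairs, one pair per
    antepenultimate-stress word produced by that subject.
    """
    return mean(final_to_stressed_percent(f, s) for f, s in words)
```

For example, `subject_average([(60.0, 75.0), (63.0, 70.0)])` averages the two made-up per-word percentages 80 and 90. A subject whose average reaches or exceeds 100% is producing final vowels at least as intense as the primary-stressed vowel, which is the pattern that motivated the exclusion described above.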

The duration, F1, and F2 were measured via a Praat script (Lennes, 2003) and then normed by subject. Vowels for which |z| > 3.29 were deleted from the dataset, resulting in a final dataset of 1399 vowels.
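
By-subject norming and outlier removal can be sketched as below. This is an illustration under stated assumptions rather than the script used for the study: rows are hypothetical dictionaries with a `subject` field, the z-scores use the sample standard deviation (the text does not say which estimator was used), and a vowel is dropped if any of its listed z-scores exceeds the cutoff in absolute value.

```python
from collections import defaultdict
from statistics import mean, stdev


def z_by_subject(rows, key):
    """Return copies of `rows` with a within-subject z-score for `key` added."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["subject"]].append(row[key])
    # Assumption: sample standard deviation; each subject needs >= 2 values.
    stats = {subj: (mean(vals), stdev(vals)) for subj, vals in groups.items()}
    normed = []
    for row in rows:
        m, sd = stats[row["subject"]]
        normed.append({**row, "z_" + key: (row[key] - m) / sd})
    return normed


def drop_outliers(rows, z_keys, cutoff=3.29):
    """Keep only rows whose z-scores all satisfy |z| <= cutoff."""
    return [r for r in rows if all(abs(r[k]) <= cutoff for k in z_keys)]
```

A cutoff of 3.29 corresponds to a two-tailed probability of roughly 0.001 under a normal distribution, so the filter removes only very extreme measurements.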

2.2 Results

The average durations are reported in Table 1, and the average F1 and F2 are broken out by gender in Table 2, all for each position under each stress pattern.

Table 1

Duration (ms) averages.

Stress pattern	Position	Duration (ms)	SD	N

σσ́σσ	initial	45.6	14.2	131
	antepenult	122.4	14.7	131
	penult	43.7	12.3	129
	final	118.2	34.8	131
σ̀σσ́σ	initial	106.6	16.3	220
	antepenult	48.0	13.6	220
	penult	127.6	17.9	220
	final	76.5	24.3	217

Table 2

Formant (Hz) averages (by gender).

Female speakers
Stress pattern	Position	F1 (Hz)	SD	F2 (Hz)	SD	N

σσ́σσ	initial	543.9	91.1	1634.0	221.5	101
	antepenult	890.3	66.7	1379.9	110.7	101
	penult	534.2	69.1	1616.4	240.6	100
	final	777.9	95.5	1555.7	160.0	101

σ̀σσ́σ	initial	874.1	64.6	1409.7	114.7	160
	antepenult	585.4	83.3	1587.7	218.0	160
	penult	883.4	64.6	1375.2	112.2	160
	final	671.6	102.9	1656.0	134.1	157

Male speakers
Stress pattern	Position	F1 (Hz)	SD	F2 (Hz)	SD	N

σσ́σσ	initial	453.0	45.5	1291.0	234.8	30
	antepenult	704.8	39.3	1142.2	59.3	30
	penult	453.7	32.5	1286.2	216.7	29
	final	508.5	32.7	1277.5	114.1	30

σ̀σσ́σ	initial	689.4	43.7	1157.7	79.5	60
	antepenult	483.2	45.0	1239.4	178.2	60
	penult	704.8	36.4	1133.5	58.3	60
	final	503.3	47.0	1291.7	136.7	60

Linear models were fit in SPSS with dependent variables z-duration, z-F1, and z-F2 and independent variables Position (4 levels: initial, antepenultimate, penultimate, final), Word-Stress (2 levels: antepenultimate, penultimate), and Consonant (4 levels), as well as their interaction terms, and treating Subject as a blocking factor.¹ The onset consonant that preceded each vowel is included because of its often significant effect, but consonant-related effects are not discussed further.² These models are given in Appendix A. Post-hoc pairwise comparisons using Fisher’s least significant difference (LSD) adjustment were run for Position * Word-Stress, and significance levels from these tests are the source of reported p-values.

2.2.1 Duration

Unsurprisingly, both the Word-Stress and the Position were found to have a significant effect on vowel duration, and there is a significant interaction of the two factors. The visualization of the duration associated with each position in each of the stress patterns is shown in Figure 1. Color coding introduced here carries through not only the results from this experiment but also through the stimuli in the perception experiment presented in Section 3: Red corresponds to words with antepenultimate stress, blue to words with penultimate stress.

Figure 1

Vowel Duration depending on Word-Stress and Position within word.

The chart of box plots in Figure 1 shows a similar pattern of alternating duration for words under both stress patterns. While the primary stressed vowels show the greatest duration and are fairly similar to each other, those in penultimate position are significantly longer than those in antepenultimate (p = 0.002). Both primary-stressed syllables are significantly longer than all others (p = 0.014 stressed antepenult compared to word-final vowel under antepenultimate stress; p < 0.001 all other positions’ comparison to the primary-stressed vowels). Vowels from word-initial syllables under secondary stress (in penultimate-stress words) have a length similar to unstressed word-final syllables in antepenultimate-stress words; we will see, however, in Section 2.2.2 that their F1, F2 are distinct.

The three unstressed vowels from non-final syllables show a significant difference in duration between the very shortest, those from an unstressed penultimate, and the least short of the three, those from an unstressed antepenult (p = 0.023). In between these two are vowels from an unstressed initial syllable; these do not differ from either of the other unstressed non-final vowels (p ≥ 0.225).

Word-final syllables clearly exhibit final lengthening, but we find a greater level of final lengthening under antepenultimate stress (p < 0.001).

2.2.2 Vowel quality

Both word-stress and position significantly affect F1 and F2, and the two factors show a significant interaction. This can be seen visually in Figure 2, which shows the average for each subject for each of the four positions in the two word-types in a vowel space scatterplot.

Figure 2

Vowel quality depending on Word-Stress and Position; averaged by subject.

We see that the vowels from penultimate-stress words have a binary distribution, essentially falling into [ɑ] (primary and secondary stress) and [ə] (unstressed). Within the [ɑ] group from penultimate-stress words, we see vowels bearing secondary stress (from the initial syllable) showing undershoot (in the sense of Lindblom [1963] and Stevens and House [1963]) of the primary stressed vowels, and indeed they differ in both F1 (p = 0.020) and F2 (p = 0.007). Within the reduced group from penultimate-stress words, it can be seen that the averages of the word-final reduced vowels fall below (i.e., have a higher F1 than) those of non-final reduced vowels (p < 0.001). The two reduced vowels also differ significantly in F2 (p = 0.001), but it is the non-final vowel that has a lower F2, and so in that respect is less centralized.

The vowels from the antepenultimate-stress words, on the other hand, show a ‘trail’ between the mid-central reduced vowel and the low back full vowel. The word-final unstressed vowels fall in this in-between range, showing a notable difference from both the reduced vowels and the stressed full vowel (F1: p < 0.001 comparison with all other positions; F2: p = 0.002 compared with unstressed vowels from penultimate positions, p < 0.001 for all other positions). A reviewer raised the concern that these final vowel averages may result from subjects producing a mix of unstressed and secondarily stressed final vowels in the antepenultimate stress words. The by-subject scatterplots are given in Appendix B. While there is a great deal of variability in the quality of these final vowels between subjects (as can also be seen in Figure 2), the vowels do regularly fall between those pronounced with secondary stress and word-final vowels under penultimate stress, with the exception of subjects 2 and 11. (Subjects 2 and 11 pronounce these final vowels under stress lapse like other unstressed vowels.) The subject averages seen in Figure 2 are not the result of bimodal distributions, and do reasonably reflect each subject’s typical final vowel under antepenultimate stress. An alternative interpretation raised by a reviewer was that the variation between subjects in the F1, F2 of word-final vowels from words with antepenultimate stress might reflect the final syllable being pronounced with secondary stress by some speakers and without stress by others. In order to test this (as per the reviewer’s suggestion), a hierarchical multiple linear regression was run on the set of word-final vowels with z-duration as the dependent variable and the independent variables Euclidean Distance and Final Lapse. 
Each vowel’s Euclidean Distance was calculated as F1 and F2 distance from that speaker’s average [ə], where that average was each speaker’s mean F1 and F2 of the three most centralized vowels (initial and penultimate syllables from antepenultimate-stress words, and antepenultimate syllables from penultimate-stress words). When Euclidean Distance was entered alone, it significantly predicted z-duration (R² = 0.229, F(1, 333) = 99.180, p < 0.001). Adding Final Lapse to the model improved it significantly (R² change = 0.222, F(1, 332) = 134.659, p < 0.001). This means that being part of a final stress lapse has a significant effect on vowel duration, independent from the vowel’s quality. Hypothesizing secondary stress on the less reduced final vowels would therefore not account for the longer durations also seen in the more reduced word-final vowels. The multiple regression model with both variables yielded R² = 0.451, F(2, 332) = 136.824, p < 0.001.
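
The quantities involved in this analysis can be sketched as follows. The helper names are hypothetical, and the code only illustrates the standard definitions (distance in the F1/F2 plane, and the textbook relations between R² and F in a two-step regression); it is not the SPSS procedure used for the reported models.

```python
import math


def euclidean_distance(f1, f2, schwa_f1, schwa_f2):
    """Distance of a vowel from a speaker's average [ə] in the F1/F2 plane."""
    return math.hypot(f1 - schwa_f1, f2 - schwa_f2)


def r2_from_f(f_stat, df_model, df_resid):
    """Recover R² from an overall-model F: R² = F*df1 / (F*df1 + df2)."""
    return f_stat * df_model / (f_stat * df_model + df_resid)


def f_change(r2_full, r2_reduced, df_added, df_resid):
    """F statistic for the R² gained by adding predictors in a later step."""
    return ((r2_full - r2_reduced) / df_added) / ((1.0 - r2_full) / df_resid)


# Cross-checking the reported statistics against each other:
r2_step1 = r2_from_f(99.180, 1, 333)    # Euclidean Distance alone
r2_step2 = r2_from_f(136.824, 2, 332)   # both predictors
f_added = f_change(r2_step2, r2_step1, 1, 332)
```

Used this way, the functions give a quick consistency check: the R² of the full model should equal the first-step R² plus the R² change, and the F for the change should be recoverable from the two R² values and the residual degrees of freedom.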

Turning to the two non-final unstressed vowels under antepenultimate stress (from the initial and penultimate syllables), we find that they do not differ from each other (F1: p = 0.207; F2: p = 0.253).

Comparing vowels between the two stress patterns, we find that those from the two primary-stressed syllables do not vary significantly in either F1 (p = 0.475) or F2 (p = 0.712). However, the two word-final vowels show a significant difference in F1 (p < 0.001) and F2 (p < 0.001). We also find a significant difference in F1 between the two unstressed vowels from non-edge syllables (F1: p < 0.001; F2: p = 0.051), where an unstressed penultimate vowel under antepenultimate stress is more centralized, having a lower F1 and a higher (although not significantly so) F2.

2.3 Production experiment summary and discussion

We find several differences of note between the pronunciation of vowels from non-initial syllables in the two different stress patterns. Both the duration and F1, F2 show four levels of unstressed vowels: The shortest/most centralized are those from unstressed penultimate syllables, the next shortest/most centralized are from unstressed antepenultimate syllables; notably longer/less centralized vowels are found in word-final unstressed syllables under penultimate stress, and the longest/least centralized are found in word-final unstressed under antepenultimate stress.³

While word-final vowels from words with antepenultimate stress are significantly shorter than stressed vowels (p = 0.014 compared to a stressed antepenult; p < 0.001 compared to a stressed penult), they are notably long, closer to the duration of stressed vowels than to other unstressed vowels. This might raise the concern that, despite discarding the data from speakers who seemed to place secondary stress on the final syllable in the antepenultimate-stress words, there was some degree of secondary stress on word-final vowels in these words. However, we see that the F1 and F2 of word-final vowels from words with antepenultimate stress were notably different from both those of stressed vowels and also from the case of actual secondary stress: Vowels from word-initial syllables from penultimate-stress words (F1, F2: p < 0.001 compared to initial vowels under secondary stress).

In both duration and vowel quality, we see a significant difference between word-final unstressed vowels coming from words with antepenultimate versus penultimate primary stress. Further, we again see a difference in both duration and vowel quality between vowels from non-edge unstressed syllables based on the stress pattern. The averages of the subjects’ averages are shown in Figure 3.

Figure 3

Vowel quality depending on Word-Stress and Position; averaged over subjects.

The two unstressed vowels from penultimate-stress words are closer in quality (as well as in duration) than the unstressed vowels from non-edge and word-final syllables from antepenultimate-stress words are. Thus, the difference between a non-final unstressed vowel and a word-final unstressed vowel is accentuated in the antepenultimate-stress words (the red circle and square in Figure 3); that is, the vowel of the final syllable is much less reduced while the vowel of the non-edge unstressed syllable is more centralized, compared with word-final and non-final unstressed vowels from penultimate-stress words (the blue oval and rectangle in Figure 3).

The hypothesis put forward here, that a phonetic rhythm continues through the two unstressed syllables in the antepenultimate-primary-stress pattern, explains the differences between the sets of vowels from words with antepenultimate and those with penultimate primary stress. Under both stress patterns, non-final unstressed vowels are naturally reduced and word-final unstressed vowels are naturally longer and less reduced. We see here that an unstressed vowel in a penult, coming after antepenultimate stress, is further reduced, and an unstressed vowel in a following word-final syllable is even longer and even less reduced than would otherwise be found in word-final position. Both the enhanced phonetic weakness of the penult and the enhanced phonetic strength of the final vowel contribute to the rhythm of the word, while actually occurring in a stress lapse. As noted by a reviewer of this paper, this brings the phonetics of the final vowel very close to something like phonological stress. I am assuming that there is a meaningful difference between metric, phonological stress (which, in Optimality Theory [Prince & Smolensky, 1993] incurs some cost, such as the alignment of feet [e.g., McCarthy & Prince, 1993]), and phonetic strength. The fact that word-final syllables are found to have phonetic strength under both stress conditions shows that at least the lower level of word-final augmentation exists outside of the metrical system. The difference in duration and vowel quality between word-final vowels under antepenultimate stress and those under penultimate stress is less than is found between initial-syllable vowels that are unstressed and those that bear secondary stress. It is therefore more plausible to take the phonetic strength of word-final vowels under antepenultimate stress as a variation of the phonetic effect of final lengthening found in all final syllables.

The findings of the differences among non-final vowels and among word-final vowels lead to the question of whether one or both of these differences is perceptually consequential.

3. Perception experiment

The production experiment described in Section 2 found four levels of duration and vowel reduction in unstressed non-initial syllables. The vowels that occurred as part of a stress lapse showed a greater difference in duration and F1, F2 than vowels from the equivalent (but non-adjacent) unstressed syllables that were part of an alternating stress pattern. The hypothesis is that the phonetic duration and less reduced vowel quality due to final lengthening can contribute to the perception of continued alternation in the case of final stress lapse, while not interrupting the rhythm when there is no final stress lapse. The production experiment found that the presence or absence of a final stress lapse correlates both with different average durations of vowels in word-final syllables, and also with different average durations of the word-internal unstressed vowel. While the vowel of a final syllable was longer under stress lapse, the vowel of the penultimate syllable was shorter under stress lapse (compared to an unstressed vowel in antepenultimate position in a word without final stress lapse). These same relative differences were found to extend to the degree of centralization of the vowels of non-edge unstressed syllables as well.

While the production study results support the hypothesis that final lapse still shows a phonetic-level alternation, we need a perception study to investigate whether this alternation is perceptible and, if so, to what degree the various factors contribute. Following Lunden (to appear), a stressed and an unstressed syllable were copied into five-syllable strings in either an alternating pattern or a non-alternating one with stress lapse (two adjacent unstressed syllables) or stress clash (two adjacent stressed syllables) either initially or finally. Five-syllable strings were chosen because this is the minimum number needed to set up a true rhythm (i.e., at least two stresses per string) in all versions of the stimuli. Test strings were made by putting syllables originally produced as unstressed word-initially or word-finally in a position to potentially work as part of the rhythmic pattern (perceptually repair stress lapse) or to interrupt the rhythmic pattern (cause the perception of stress clash). We want to test the effect of word-final syllables not only under stress lapse, where they might contribute to the rhythm, but also in strings with a penultimate stressed syllable, where the potential for these word-final syllables for interrupting rhythmic stress can also be assessed. While the focus is the effect of the syllables originally produced in word-final position, test syllables from initial position were also used in order to demonstrate that any initial strengthening effects (e.g., Fougeron & Keating, 1997) that might be present cannot contribute to a rhythmic pattern under stress lapse in the way that word-final syllables can. Having syllable strings with initial stress lapse, stress clash, and initial test syllables also forces subjects not to become hyper-focused on only the ending of the syllable strings.

3.1 Methods

3.1.1 Participants

The participants were 56 native English speakers with reported normal hearing (male = 25, aged 18–22, average age 19.0). All were undergraduates of the College of William & Mary and received participation pool credit. None had been subjects in the production study.

3.1.2 Stimuli

The subjects’ averages from the previously reported production study were inspected visually, and the subject whose averages most closely corresponded to the overall averages was selected. This subject’s averages are shown in Figure 4 in comparison to the average of the other subjects’ averages.

Figure 4

Average vowel quality depending on Word-Stress and Position for selected subject (circled) compared with averages of all other subjects.

The selected subject (female, age 21) shows somewhat more difference between the vowels from unstressed non-edge syllables and between the word-final vowels than the average, but neither difference was the most extreme among the production study subjects.

Particular instances of the six syllables (stressed, unstressed antepenult, unstressed penult, unstressed word-initial, word-final under antepenultimate stress, word-final under penultimate stress) needed for /bɑ/ and /ɡɑ/ syllable strings were selected by finding the most typical (relative F1, F2, duration). The acoustic characteristics of each of the selected syllables are given in Figure 5,⁴ which is organized so as to clearly show from which stress pattern and position each was taken. Word-edge syllables are underlined, and, in the case of final syllables, subscripted for how many syllables away from the primary stress they are. This notation is needed to clearly identify each syllable’s source in the created stimuli. Each syllable is given a letter identifier just below it which will be referenced when discussing results. As the source/position of the primary stressed syllable is not a factor in the study, the most typical were chosen, which happened to be from a word with antepenultimate stress in the case of [bɑ] and a word with penultimate stress in the case of [ɡɑ]. Both are identified as (b), and will be represented without color coding in the stimuli strings. When word-initial syllables are used in the stimuli, that syllable is always taken from a word pronounced with antepenultimate stress, and therefore it does not bear any secondary stress.

Figure 5

Stimuli syllables’ duration (ms), F1 (Hz), F2 (Hz), F0 (Hz), intensity (dB).

Concatenated strings of five syllables were created. The control stimuli are visually represented in Figure 6. Control strings either have (i) alternating stressed and unstressed syllables, (ii) two like syllables at the beginning, or (iii) two like syllables at the end. The like syllables may be either two unstressed syllables (part of the ‘lapse set’) or two stressed syllables (part of the ‘clash set’). The lapse set was based around alternating strings with a stressed antepenult (e.g., BAbaBAbaBA), as the antepenultimate remains a stressed syllable when the string begins or ends with two unstressed syllables (e.g., babaBAbaBA, BAbaBAbaba). The clash set was based around alternating strings with a stressed penult (e.g., baBAbaBAba), as the penult remains a stressed syllable when the string begins or ends with two stressed syllables (e.g., BABAbaBAba, baBAbaBABA). As can also be seen in Figure 6, all six of these types were made with each type of unstressed syllable: Those that were originally pronounced as unstressed antepenultimate syllables—(e) in Figure 5—and those that were originally produced as unstressed penultimate syllables (c). The unstressed syllables used in a particular string are indicated both with the introduced color coding and with a prefixed ‘UA’ (unstressed antepenult) or ‘UP’ (unstressed penult). All stimuli were made with both unstressed syllables as the production study found a significant difference between both the duration and the F1, F2 of such syllables, and it was noted that the relative weakness of the unstressed penultimate (c) contributed to the rhythmic difference between the final two unstressed syllables in a word with antepenultimate stress.

Figure 6

Control stimuli strings (created with ‘ba’s and ‘ga’s).

Test versions of initial lapse/clash and final lapse/clash strings were made, which are visually represented in Figure 7. Initial test lapse/clash strings were made using an unstressed syllable that was originally pronounced in word-initial position (a), and therefore presumably pronounced with initial strengthening effects.⁵ There were two final test strings: One used an unstressed final syllable that came from a penultimate-stressed word—…σ́σ₁ #; (f), and the other used an unstressed final syllable that came from an antepenultimate-stressed word—…σ́σσ₂ #; (d).

Figure 7

Test stimuli strings (created with ‘ba’s and ‘ga’s).

Alternating control strings and initial test strings were each included twice, which, in the case of the former, resulted in a balanced number of alternating and non-alternating control strings, and, in the case of the latter, a balanced number of initial and final test strings. Given these doublings, there were also a balanced number of control strings and test strings. Each of the four sets (i.e., the combinations of lapse/clash with UA/UP unstressed syllables) had eight strings (six unique). All these were created twice, once with the selected subject’s /bɑ/ syllables, and once with the /ɡɑ/ syllables. This resulted in 64 stimuli, which were repeated twice in a multiple forced choice experiment administered through Praat.
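
The stimulus counts above can be verified with a short enumeration. The sketch below only reproduces the design arithmetic described in this section; the labels and the function name `build_stimulus_list` are assumptions made for illustration, and the actual five-syllable sound strings (Figures 6 and 7) are not constructed here.

```python
from itertools import product

# (kind, string type, copies per set); the doubling of alternating control
# strings and initial test strings follows the description above.
STRING_TYPES = [
    ("control", "alternating", 2),
    ("control", "initial non-alternating", 1),
    ("control", "final non-alternating", 1),
    ("test", "initial, word-initial syllable (a)", 2),
    ("test", "final, word-final syllable (f)", 1),
    ("test", "final, word-final syllable (d)", 1),
]


def build_stimulus_list():
    """Enumerate one entry per stimulus across sets and consonants."""
    stimuli = []
    design = product(("lapse", "clash"), ("UA", "UP"), ("b", "g"))
    for set_name, unstressed, consonant in design:
        for kind, string_type, copies in STRING_TYPES:
            stimuli.extend(
                {
                    "set": set_name,
                    "unstressed": unstressed,
                    "consonant": consonant,
                    "kind": kind,
                    "type": string_type,
                }
                for _ in range(copies)
            )
    return stimuli


stimuli = build_stimulus_list()
```

Each combination of set (lapse/clash) and unstressed syllable (UA/UP) contributes eight strings, six of them unique, and crossing the four sets with the two consonants gives the 64 stimuli, evenly split between control and test strings.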

3.1.3 Procedure

Subjects were told they would hear five-syllable strings with both ‘strong’ and ‘weak’ syllables and that these syllables alternated except at certain times, and that their task was to decide whether each string was ‘alternating’ or ‘not alternating.’ They listened to six example control strings (all using ‘ba’ syllables): Stressed-initial alternating and unstressed-initial alternating, which they were told were ‘alternating’; lapse-initial and clash-initial, which they were told were ‘not alternating’ because the beginning failed to alternate; and lapse-final and clash-final strings, which they were told were ‘not alternating’ because the end failed to alternate. Because sample strings played to subjects to explain the task had to include one or the other unstressed syllables—i.e., UA (e) or UP (c) syllables—this was kept consistent for each subject, but half (N = 28) heard sample strings with unstressed antepenultimate syllables (e) as the weak syllables in the string, and the other half heard sample strings with unstressed penultimate syllables (c) as the weak syllables. All subjects received all versions of the stimuli in the experiment itself.

The data from subjects who failed to get over two-thirds of the strong-initial alternating sequences correct were discarded (N = 5), leaving 51 subjects.
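The exclusion criterion can be expressed as a simple filter. The sketch below is illustrative only: the subject identifiers and scores are invented, and the number of strong-initial alternating trials per subject (12 here) is a placeholder rather than a figure from the study.

```python
from fractions import Fraction

THRESHOLD = Fraction(2, 3)


def keep_subject(n_correct, n_total, threshold=THRESHOLD):
    """True if the subject got strictly more than `threshold` of trials correct."""
    return Fraction(n_correct, n_total) > threshold


def filter_subjects(scores):
    """Keep only subjects who pass the criterion.

    `scores` maps a subject id to (n_correct, n_total) on the
    strong-initial alternating control strings.
    """
    return {sid: sc for sid, sc in scores.items() if keep_subject(*sc)}


# Hypothetical scores, for illustration only.
example = {"s01": (12, 12), "s02": (8, 12), "s03": (9, 12)}
print(sorted(filter_subjects(example)))
```

Exact fractions are used so that a score of exactly two-thirds is treated as failing the 'over two-thirds' criterion, without floating-point ambiguity.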

3.2 Results

A full model run with the additional factor of Consonant (2 levels: [b], [ɡ]) found no significant effect of Consonant (p = 0.617), nor were any interaction terms with Consonant significant (p ≥ 0.144), and so this factor was excluded from the final model, given in Appendix C. The model was run with Subject as a blocking factor. The percentages of ‘alternating’ responses for each of the stimuli are given in Appendix D. Both the actual percentages and the estimated means from a logistic regression model are given, as the latter serve as the basis for the post-hoc pairwise comparisons (Fisher’s LSD). All reported p-values come from the pairwise comparisons of the levels of the four independent variables: Alternating (yes/initial-at-issue/final-at-issue), Set (lapse/clash), Unstressed (UA/UP), and Test-Syllable (none/correct-for-string/incorrect-for-string).
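For readers less familiar with estimated means from a logistic model, the sketch below shows how a mean on the logit scale maps to a percentage and how an unadjusted pairwise comparison of two such means might be computed. It is a generic illustration under stated assumptions: the input values are invented, a normal (Wald z) approximation is assumed, and the statistical software used for the reported analysis may have computed its tests differently.

```python
import math


def inv_logit(x):
    """Map a logit-scale estimate to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def pairwise_z(mean_a, mean_b, se_diff):
    """Unadjusted two-sided Wald test for the difference of two logit-scale means.

    `se_diff` is the standard error of the difference (assumed known here).
    Returns (z, p), with p from the standard normal distribution.
    """
    z = (mean_a - mean_b) / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Invented logit-scale means and standard error, for illustration only.
mean_d, mean_f, se = 0.40, -0.30, 0.25
print(round(100 * inv_logit(mean_d), 1), round(100 * inv_logit(mean_f), 1))
print(pairwise_z(mean_d, mean_f, se))
```

No correction for multiple comparisons is applied in the sketch, which is the defining property of Fisher’s LSD: each pairwise test is evaluated at the nominal level.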

The perception study was set up to establish whether the phonetic differences found in the production study, between vowels from word-internal unstressed syllables, (c) vs. (e) in Figure 5, and between vowels from word-final syllables, (d) vs. (f), had consequences for the perception of rhythm. We find significant differences with the final test strings, both in comparison to their final control lapse/clash counterparts and between (most) final test versions. All versions of the final test strings were significantly more likely to result in ‘alternating’ responses than any final control lapse/clash string (p ≤ 0.002). The comparison between final test strings can most clearly be shown through a ranking with respect to the percentage identified as alternating (estimated means). We find the opposite progression between the final-lapse-type strings and the final-clash-type strings, as shown in Figure 8. In the top row we see that the final test strings with unstressed antepenultimate syllables (e) and a final syllable that originally directly followed the main stress (f) were the least likely to elicit ‘alternating’ responses under final lapse but were the most likely to elicit them in the potential clash environment. In the bottom row we find that strings with unstressed penultimate syllables (c) and word-final syllables that were originally non-adjacent to the main stress syllable (d) had the reverse effect. They were the most likely to elicit an ‘alternating’ response despite final stress lapse but were (among) the least likely to elicit it in the potential clash environment, despite the fact that final test strings in the clash set actually do alternate between stressed and unstressed syllables.

Figure 8

Percent identified as alternating: Final test strings.

As previously established, word-final syllables produced directly after the main stress (f) were significantly shorter (see Section 2.2.1) and significantly more centralized (see Section 2.2.2) than word-final syllables produced as part of a final stress lapse (d). We see this difference reflected in the expected direction in the rankings above: Final test strings with a final syllable originally produced under final stress lapse—σ₂; (d)—were more often identified as ‘alternating’ when coming after an unstressed syllable and were less often identified as ‘alternating’ when coming after a stressed syllable—compared to final test strings with σ₁; (f). Likewise, the source of the unstressed syllables in the strings also has an effect, which goes in the expected direction. Vowels from unstressed antepenultimate syllables (e) were found to be the stronger of the two types of vowels from non-edge unstressed syllables, in the sense of having a longer duration and a significantly higher F1 than a vowel from a non-edge syllable pronounced under stress lapse (c). A final lapse string with an unstressed antepenult (e) and the weaker type of word-final vowel (f) was the string least likely to be identified as ‘alternating.’

The percentages in Figure 8 are shown separated into the ba-strings and the ga-strings in Figure 9.⁶

Figure 9

Percent identified as alternating: Final test strings by consonant.

We see notably consistent results through both sets of syllables, and so will return to discussing the combined results. The pairwise comparisons between the strings most identified as alternating from each set and the other final test strings in the same set are shown in Figure 10. The last comparison in each set is between the most relevant other string and the string least often identified as ‘alternating.’

Figure 10

Pairwise comparisons using Fisher’s LSD: Final test strings.

Among the lapse set we see that while the string most identified as alternating contains the weaker word-internal unstressed syllable—an unstressed penult; (c)—and the stronger word-final syllable—under antepenultimate stress; (d), it differs significantly only from strings in which both syllable types have been changed, to the stronger word-internal unstressed syllable—an unstressed antepenult; (e)—and the weaker word-final—under penultimate stress; (f). Changing only one of these two does not result in the string being significantly less likely to be identified as alternating. The differences between both the unstressed word-internal syllable types and the word-final syllable types have an important effect when combined, since strings with the combined effect of stronger word-internal unstressed syllables (e) and weaker word-final syllables (f) are much less likely to be identified as alternating. The comparison in the fourth line in Figure 10 shows a significant difference between strings with unstressed antepenultimate syllables (e) that differ only in the source of the final syllable—(d) vs. (f).

Among the clash set, we do not see the same pattern in reverse; rather, the source of the word-internal unstressed syllable does not have an effect, while the source of the final syllable does. Strings with a stronger final syllable—under antepenultimate stress; (d)—make an alternating string significantly less likely to be identified as alternating.

If we look at the initial test strings (see table of means in Appendix D), both in comparison to their initial control lapse/clash counterpart and between the initial test versions, we find that initial test strings do not show any evidence that an initial unstressed syllable (a) (which presumably shows some effects of initial strengthening in the consonant) can either contribute to the rhythm of the word (under initial lapse) or interrupt the rhythm (when the peninitial is stressed). Initial test strings with lapse were no more likely to be identified as alternating than initial control strings with lapse (p ≥ 0.237). Likewise, initial test strings with (potential) clash were no less likely to be identified as alternating than true alternating strings from the clash set (p ≥ 0.475 within pairs with the same type of unstressed syllable).

Finally, looking to the true alternating strings (see table of means in Appendix D), we find a significant difference between the identifiability of those that start with a stressed syllable compared with those that start with an unstressed syllable (p < 0.001), while there is no significant difference between forms with differently-sourced unstressed syllables (p ≥ 0.190).

3.3 Perception experiment summary and discussion

The perception study tested the degree to which vowels from words with different stress patterns (final stress lapse, no stress lapse) influenced the perception of an alternating rhythm. Vowels originally produced in word-final syllables under stress lapse (d) were significantly more likely to upset the perception of alternating rhythm when occurring after a stressed penult than vowels originally produced in word-final syllables that followed a stressed penult (f). This finding is consistent with the longer duration and less reduced vowel quality that was found for word-final vowels that are part of a stress lapse (d). While the word-final vowels under stress lapse (d) did not, on their own, result in significantly more identifications of ‘alternating’ in cases of final stress lapse, they did when combined with a non-final unstressed syllable originally produced under final lapse (c). Thus, the relative strength of the stress-lapse word-final vowels (d) and the relative weakness of the stress-lapse non-final vowels (c) found in the production study have been shown to have perceptual consequences.

Of course, the ‘wrong’ type of vowel would not be naturally produced: For example, the stronger type of word-final vowel (d) would not occur following a penultimate stress, and so the fact that it is seen to interrupt the perception of alternating rhythm when it does follow a stressed syllable does not have a real-world consequence. The fact that there are four levels of non-initial unstressed vowels, (c) versus (e) and (d) versus (f), however, offers an explanation for how a word-final syllable can be perceived as contributing to the rhythm of a word under final stress lapse but not create the perception of clash when following a stressed penult.

4. General discussion and conclusion

Whether or not vowels occurred as part of a stress lapse was found to influence both the vowel’s duration and F1, F2. Only /ɑ/ was investigated, both because as a low vowel it offers more room for degrees of reduction and because orthographic ‘a’ can fairly straightforwardly be used to elicit stressed [ɑ] and unstressed [ə]. Hammond’s (1999) account of word-final vowels in English takes /ɑ, æ, ɛ/ to be realized as [ə], whereas other lax vowels tend to become one of [i, u, o], which can surface word-finally. The degree to which the multiple categories of duration and F1, F2 found for /ɑ/ from non-initial unstressed syllables occur with other vowels remains to be investigated. Marry (2015) gives some support for a more general durational difference between word-final rhymes under and not under stress lapse. Her production study used English words paired for final rhymes where one member of the pair has penultimate stress and the other has antepenultimate stress (e.g., assássin/móccasin; umbrélla/góndola) and found that final rhymes under stress lapse were significantly longer than final rhymes in syllables that followed a stressed penult.

The experiments presented here are an in-depth investigation into whether syllables under word-final stress lapse in English can nevertheless be heard to have something of an alternating rhythm. Support for this hypothesis was found both in the differences in duration and in F1, F2 for vowels involved in a final stress lapse and in the differences these syllables made to the perception of alternating rhythm. Lunden’s (to appear) proposal that the inherent duration of the final syllable could contribute a phonetic ‘prominence alternation’ between the last two syllables left the paradox that even if such a syllable could be perceived as somewhat strong under final stress lapse, it would need to be perceived as weak when following a stressed penult to avoid interrupting the stress rhythm. This is a particular concern as the penultimate syllable is a very common place for primary stress across languages (Gordon, 2002), including some of the languages that have stress lapse in some words (such as English, Dutch [Trommelen & Zonneveld, 1999], and the Australian language Limilngan [Harvey, 2001]). The present studies offer a resolution to this paradox through the finding that the degree of final lengthening and vowel reduction varies as a consequence of the stress pattern of the word. Word-final vowels in a syllable following a stressed penult were found to show less final lengthening and greater reduction than those that occurred as part of a stress lapse, meaning they were significantly less likely to be perceived as interrupting the rhythm of the word.

The fact that duration and vowel quality were found to differ in syllables that were part of a word-final stress lapse supports the hypothesis that binary stress languages that tolerate final stress lapse nevertheless have a phonetic-level alternation over the final two syllables. It is an open question how the phonetic differences, in particular the relatively strong difference in degrees of final lengthening and vowel quality of word-final syllables, interact with the phonological stress system. While the proposal that there is a phonetic-level alternation which carries the rhythm through the end of the word is meant to broadly account for the typological fact that final lapse is not uncommon in binary-stress languages, it is unknown at this point whether other languages also show the phonetic differences between final vowels that have been found in the current English study. It may be that the standard level of final lengthening is enough to perceptually continue the rhythm in some languages, and that we do not generally find additional phonetic augmentations. If this is the case, it may be that English has non-crucially enhanced the phonetic difference. If, on the other hand, further investigation finds this phonetic augmentation is consistently part of the phonetic realization of final lapse, then the question of how it comes about is particularly important because the augmentation would appear to be crucial to the tolerance of final lapse. Conversely, it is also possible that it is final vowels under lapse that show ‘normal’ levels of final lengthening and vowel quality, and that these are phonetically dampened when occurring adjacent to a stressed syllable.⁷ In this case, the phonetic differences found here between word-final vowels would exist due to independent pressures on an unstressed syllable that is adjacent to a stressed one, and not be directly related to the condition of final lapse. Further work will hopefully give insight into the domain of the phonetic differences found here.

Additional Files

The additional files for this article can be found as follows:

Appendix A

Production study linear models of z-duration, z-F1, and z-F2. DOI: https://doi.org/10.5334/labphon.37.s1

Appendix B

Production study results of word-final and secondary stressed vowels by subject. DOI: https://doi.org/10.5334/labphon.37.s2

Appendix C

Perception study linear model: DV = Response. DOI: https://doi.org/10.5334/labphon.37.s3

Appendix D

Percent identified as alternating in perception study. DOI: https://doi.org/10.5334/labphon.37.s4

Notes

1. Because all effects are within-subjects, the variance due to differences in subjects can be accounted for by treating Subject as a fixed effect in the programming statements. This treats Subject as a blocking factor and is equivalent to treating it as a random effect in a linear mixed model with a compound symmetry covariance structure (assuming a normal distribution).
2. Effects of the following consonant could not be considered in the models as final syllables lack a following consonant, and so the factors Position and Following-Consonant are not fully crossed.
3. Unstressed initial vowels are put aside in this four-tier description, as they do not have a counterpart in the other stress pattern.
4. F0, like the formants, was measured at the midpoint (using a script from the Praat website); intensity is an average over the vowel (script by Hirst 2009).
5. While the duration and formant values of vowels in word-initial syllables in words with antepenultimate stress were not found to be significantly different from the (other) weakest position, the penult under antepenultimate stress (duration: p = 0.337; F1: p = 0.201; F2: p = 0.253), we expect initial strengthening to primarily affect the initial consonant, which was not measured.
6. Consistent with the source for Figure 8, the percentages in Figure 9 are estimated means, taken from linear models run separately for ba-strings and ga-strings. These models are not discussed further.
7. This could be looked at as a foot-based phonetic effect, alongside the phonological parallel that the non-head members of feet are often forced to be weaker than unstressed syllables that are unfooted (Bennett 2012).

Acknowledgements

Thanks are due to the editors and two very helpful reviewers: Jonah Katz and another, anonymous reviewer. I am very grateful to my research assistants for their help on multiple fronts. Abigail Delgado and Johnny Willing assisted with running the production experiment and with measuring the resulting sound files. Additional thanks are due to Johnny for serving as a research assistant on the project overall. Noella Handley assisted with running the perception study. Marissa Messner wrote the script used to randomize the stimuli into question/answer pairs. I thank the CELL research group at William & Mary, especially Jessica Campbell, Kate Harrigan, Mark Hutchens, Dan Parker, and Megan Rouch, who helped determine how to most clearly present the perception stimuli in Section 3. Thanks to Renee Kemp and Nathan Sanders for discussion of Euclidean distance calculations. Finally, I thank Nick Kalivoda for helpful discussion and Kim Love of K. R. Love Quantitative Consulting and Collaboration for consultation regarding aspects of the statistical tests reported here.

Competing Interests

The author has no competing interests to declare.

References

J. Barnes, (2006). Strength and weakness at the interface: Positional neutralization in phonetics and phonology. Berlin: Mouton de Gruyter.

M. E. Beckman, J. Edwards, (1994). Articulatory evidence for differentiating stress categories. Papers in laboratory phonology III: Phonological structure and phonetic form : 7. DOI: http://dx.doi.org/10.1017/CBO9780511659461.002

R. Bennett, (2012). Foot-conditioned phonotactics and prosodic constituency (Unpublished doctoral dissertation). Santa Cruz: University of California.

P. Boersma, D. Weenink, (2016). Praat: Doing phonetics by computer. Computer program, Retrieved from: http://www.fon.hum.uva.nl/praat/ (Version 5.4.22, retrieved 8 October 2015).

E. Flemming, (2007). The phonetics of schwa vowels. Cambridge, MA: MIT. Retrieved from: http://web.mit.edu/flemming/www/paper/schwaphonetics.pdf (manuscript).

E. Flemming, S. Johnson, (2007). Rosa’s roses: Reduced vowels in American English. Journal of the International Phonetic Association 37 (01) : 83. DOI: http://dx.doi.org/10.1017/S0025100306002817

C. Fougeron, P. Keating, (1997). Articulatory strengthening at edges of prosodic domains. Journal of the Acoustical Society of America 101 (6) : 3728. DOI: http://dx.doi.org/10.1121/1.418332

D. Fry, (1958). Experiments in the perception of stress. Language and Speech 1 : 126. DOI: http://dx.doi.org/10.1177/002383095800100207

M. Gordon, (2002). A factorial typology of quantity-insensitive stress. Natural Language and Linguistic Theory 20 : 491. DOI: http://dx.doi.org/10.1023/A:1015810531699

M. Grosvald, (2009). Interspeaker variation in the extent and perception of long-distance vowel-to-vowel coarticulation. Journal of Phonetics 37 (2) : 173. DOI: http://dx.doi.org/10.1016/j.wocn.2009.01.002

K. Hale, J. White Eagle, (1980). A preliminary metrical account of Winnebago accent. International Journal of American Linguistics 46 (2) : 117. DOI: http://dx.doi.org/10.1086/465641

M. Hammond, (1999). The phonology of English: A prosodic optimality-theoretic approach. New York: Oxford University Press.

M. Harvey, (2001). A grammar of Limilngan: A language of the Mary River region, Northern Territory, Australia. Pacific Linguistics, p. 516.

D. Hirst, (2009). Praat script analyse_tier, version 8 July 2009, (Retrieved 23 June 2011).

D. Klatt, (1976). Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. Journal of the Acoustical Society of America 59 (5) : 1208. DOI: http://dx.doi.org/10.1121/1.380986

M. Lennes, (2003). Praat script, (Modified by Dan McCloy, December 2011; Retrieved 14 January 2016).

B. Lindblom, (1963). Spectrographic study of vowel reduction. Journal of the Acoustical Society of America 35 (11) : 1773. DOI: http://dx.doi.org/10.1121/1.1918816

A. Lunden, (2011). The weight of final syllables in English. Proceedings of the 28th West Coast Conference on Formal Linguistics : 152. Retrieved from: http://www.lingref.com/cpp/wccfl/28/index.html.

A. Lunden, (2013). Reanalyzing final consonant extrametricality: A proportional theory of weight. Journal of Comparative Germanic Linguistics 16 (1) : 1. DOI: http://dx.doi.org/10.1007/s10828-013-9053-3

A. Lunden, (to appear). Explaining word-final stress lapse. In: R. Goedemans, J. Heinz, H. van der Hulst, The study of word stress and accent: Theories, methods and data. Cambridge: Cambridge University Press.

A. Lunden, N. Kalivoda, (2016). Stress correlate database, Retrieved from: http://wmpeople.wm.edu/sllund/stresscorrelatedatabase (Retrieved 28 December, 2016).

A. Marry, (2015). Effects of rhythmic stress on unstressed syllables, Retrieved from: http://publish.wm.edu/honorstheses/990 (College of William & Mary Undergraduate Honors Theses. Paper 990).

J. McCarthy, A. Prince, (1993). Generalized Alignment In: G. Booij, J. van Marle, Yearbook of Morphology. Dordrecht: Kluwer, pp. 79. DOI: http://dx.doi.org/10.1007/978-94-017-3712-8_4

S. Nakai, (2013). An explanation for phonological word-final vowel shortening: Evidence from Tokyo Japanese. Laboratory Phonology 4 (2) : 513. DOI: http://dx.doi.org/10.1515/lp-2013-0016

D. Oller, (1973). The effect of position in utterance on speech segment duration in English. Journal of the Acoustical Society of America 54 (5) : 1235. DOI: http://dx.doi.org/10.1121/1.1914393

A. Prince, P. Smolensky, (1993). Optimality theory: Constraint interaction in generative grammar. Boulder: Rutgers University, New Brunswick and University of Colorado. (manuscript).

K. Stevens, A. House, (1963). Perturbation of vowel articulations by consonantal context: An acoustical study. Journal of Speech and Hearing Research 6 (2) : 111. DOI: http://dx.doi.org/10.1044/jshr.0602.111

M. Trommelen, W. Zonneveld, (1999). Dutch In: H. van der Hulst, Word prosodic systems in the languages of Europe. Berlin: Mouton de Gruyter, pp. 492.

C. Wightman, S. Shattuck-Hufnagel, M. Ostendorf, P. Price, (1992). Segmental durations in the vicinity of prosodic phrase boundaries. Journal of the Acoustical Society of America 91 (3) : 1707. DOI: http://dx.doi.org/10.1121/1.402450

Article No.	27
Submitted on	2016-07-20
Accepted on	2017-06-03
Published on	2017-11-13
