1. Introduction

Final Devoicing is a phonological pattern whereby the [voice] contrast1 is neutralised in word- or syllable-final position, as in Catalan [ˈɡɾiz-ə] versus [ˈɡɾis] “gray (f/m)”. The general research question behind this study is the following: how does Final Devoicing develop as a sound change? We observe a mismatch between the contexts in which word-final [+voice] obstruents are expected to devoice according to the phonetic literature (for instance, within fricatives vs. stops), and the actual patterns of phonologised Final Devoicing in the typology. We argue that this mismatch may stem from an incomplete assessment of the contexts in which the change is expected to begin: what should be considered is not the partial lack of vocal fold vibration in [+voice] obstruents, but the magnitude of the [voice] contrast. Since contrast is signalled by various cues, this assessment should also include other cues besides phonation, such as durations. This study investigates the phonetic precursors of the change by quantifying the robustness of the [voice] contrast in terms of two acoustic cues: the voicing ratio and the V/VC duration ratio, in word-final obstruents in large corpora of French. The effects of three factors which are expected to influence the magnitude of the [voice] contrast are examined: position in the utterance (internal vs. final), manner (stops vs. fricatives), and place of articulation.

1.1. Identifying the phonetic precursors of Final Devoicing

Final Devoicing is a phonological neutralisation pattern: [+voice] and [ – voice] obstruents do not contrast in domain-final position. The consensus in the literature is that the sound change towards neutralisation stems from misperception, which is more likely to occur in contexts where the phonation of the [+voice] obstruents is more difficult to sustain (Blevins, 2006; Myers, 2012):

    (1) Contexts in which voicing is more difficult to sustain
        a. At the end of utterances
        b. In fricatives
        c. In posterior obstruents

The first context in (1a) is the utterance-final, prepausal context. In this position, the phonetic pressures on voicing production have multiple sources (Blevins, 2006). Subglottal pressure decreases at the end of utterances,2 and the pressure differential between the subglottal and supraglottal areas may cause vocal fold vibration to stop before the end of the obstruent (Westbury & Keating, 1986). Since obstruents are longer under phrase-final lengthening, voicing in [+voice] obstruents may decay before the occlusion is released (Ohala, 1997). Additionally, vocal folds may open early to allow breathing in anticipation of the following pause (Myers, 2012). Finally, several cues to the [voice] contrast lie in the CV transition, which is absent in final position (Steriade, 1997, 1999). The tendency to variably devoice [+voice] obstruents at the end of prosodic domains has been found in several languages, and represents a strong tendency in L1 and L2 acquisition (Broselow, 2018).

Second, fricatives are more likely to undergo variable devoicing than stops (1b), because maintaining frication requires a high oral pressure, which conflicts with the low oral pressure needed to maintain a differential with infra-laryngeal pressure to sustain vocal fold vibration (Ohala, 1983, 1997). These contradictory requirements are reflected in the typology, where many languages allow the [voice] contrast only for stops and not for fricatives.

Third, voicing is expected to be more difficult to maintain in posterior consonants compared to anterior consonants (1c). Under the Aerodynamic Voicing Constraint (AVC, Ohala, 1997, 2011), velar stops have a smaller oral cavity than labial and alveolar stops, and may accommodate less glottal airflow before the supra-laryngeal pressure becomes higher than the infra-laryngeal one. Moreover, voicing can be maintained by the passive expansion of the oral cavity during closure, and posterior stops cannot take advantage of the cheek and tongue surface compliance in this respect (Westbury, 1983). This asymmetry is reflected in the typology of phonemic inventories, as /ɡ/ is relatively rare compared to /b/ (Maddieson, 2013).

With multiple sources and many attested phonologised patterns in the typology (Keating et al., 1983), Final Devoicing is often considered the archetype of natural, phonetically-grounded changes in the sound change literature (Beguš, 2020). However, while the general mechanism seems plausible, there are still problems in the details of the change’s development. As noted by Broselow (2018), if Final Devoicing originates in contexts where it is more difficult to maintain phonation, the asymmetries of the contexts in (1) should be phonologised in at least some of the many patterns of Final Devoicing found in the world’s languages: there should be languages in which the neutralisation pattern categorically targets only prepausal obstruents, only fricatives, or only posterior obstruents. However, none of these three biases is reflected in the typology of phonologised patterns of Final Devoicing. To the best of our knowledge, only variable, gradient devoicing is reported in utterance-final position (1a) (Blevins, 2006); in phonologised processes, all word- (or syllable-) final obstruents are involved. Regarding the manner asymmetry in (1b), Myers (2012) reports that only two languages neutralise only fricatives: Gothic and Old English. On the other hand, three languages neutralise the [voice] contrast in final stops, but not in final fricatives: Turkish, Ferrarese Italian and Saranda Ekklisies Greek. Jansen (2004, p. 97) points out that the laryngeal neutralisation developed historically in fricatives before stops in German, but in stops before fricatives in Belorussian. In the large majority of languages, the process applies to the whole set of obstruents; when manner matters, there is no clear preference for stops or fricatives. Finally, concerning (1c), we know of only one example, Tonkawa, which is reported to devoice only the word-final velar /ɡ/ (Hoijer, 1933; the lenis stops are said to be pronounced halfway between the English lenis and fortis stops). 
The overwhelming majority of phonologised Final Devoicing processes target the class of obstruents as a whole, regardless of their place of articulation.

How can we explain these mismatches? The first asymmetry, concerning the generalisation of devoicing from the utterance-final position to the word-final position, has received various theoretical explanations in the literature. For Blevins (2006), this generalisation emerges from the frequent one-word sentences in child-directed speech; for Bermúdez-Otero (2015), from input restructuring in the Life Cycle of phonological processes; for Steriade (1997), from Paradigm Uniformity effects. However, if the other two mismatches identified above are correct, we still need to account for the generalisation from fricatives to stops, and from posterior places of articulation to anterior ones.

To better understand how Final Devoicing emerges, this paper develops a different hypothesis: neutralisation should not be expected because voicing itself is more challenging to maintain in certain contexts, but rather because the laryngeal contrast, with the various phonetic cues that are implicated therein, may be more challenging to maintain in those contexts. More specifically, the criteria for identifying the contexts in (1) are incomplete. There are two main issues: first, they focus on [+voice] obstruents alone, whereas Final Devoicing is a contrast neutralisation pattern; and second, they are based primarily on phonation, despite the fact that other cues contribute to the [voice] contrast. It may be that reassessing the question with different parameters will help shed light on the three mismatches reported above.

First, Final Devoicing is a neutralisation pattern, so the focus should be on the acoustic distance between [+voice] and [ – voice] phonemes. Importantly, the tendency for the [+voice] member of a given pair of phonemes to be partially devoiced does not necessarily indicate that the contrast between the two phonemes is reduced. For instance, Patience and Steele (2022) find that Quebec French [+voice] fricatives are nearly as voiced as [+voice] stops (87% vs. 90%) in word-final, prevocalic position, yet [+voice] and [ – voice] fricatives are better differentiated by the proportion of voicing than stops. In other terms, the [voice] contrast is smaller for stops than for fricatives, such that stops are actually closer to neutralisation than fricatives. Second, phonological features such as [voice] are cued by a range of acoustic parameters (Kingston & Diehl, 1994; Kirby & Ladd, 2016; Lisker, 1986). In particular, durations have been shown to be stable correlates of the [voice] contrast in many languages. Constriction durations of [+voice] obstruents are typically shorter than those of [ – voice] ones (Denes, 1955), and vowels in the VC# rhyme are typically longer before [+voice] obstruents than before [ – voice] obstruents, a phenomenon known as the voicing effect (Chen, 1970; see the extensive literature reviewed in Coretta, 2020). Identifying the phonetic precursors of Final Devoicing should therefore be based on contrast magnitude, and on a broader range of cues.

The present paper proposes to explore this research avenue by investigating the phonetic precursors of Final Devoicing in the following manner. We focus on contemporary French, using large corpora of natural speech (the choice of French is explained in Section 1.2). Two acoustic cues to the contrast are examined: the proportion of voicing during the obstruent (called voicing ratio or v-ratio) and the V/VC duration ratio, assessing their ability to discriminate between [+voice] and [ – voice] obstruents in word-final position. The V/VC duration ratio is the duration of the vowel divided by the duration of the vowel plus the duration of the consonant in the rhyme. This measure has the advantage of comparing the lengthening of the vowel and the lengthening of the consonant in the rhyme under the effect of domain-final lengthening, while being less susceptible to contextual variation such as speech rate (Barry, 1979; Kohler, 1979). After examining these two cues separately, we assess the overall robustness of the [voice] contrast in different contexts by comparing their normalised effect magnitude. The central question is whether the contexts most likely to trigger [voice] neutralisation in this approach align with the three contexts identified primarily through the phonation of [+voice] obstruents in (1). For instance, some studies have found that the durational correlates of the [voice] contrast are reinforced in utterance-final position: vowels lengthen more before [+voice] obstruents, and constrictions lengthen more in [ – voice] obstruents (Kohler et al., 1981; Luce & Charles-Luce, 1985). It could be that the V/VC duration ratio contrast is actually stronger in some of these contexts, so that the overall contrast is actually stable or enhanced. Our research questions are therefore as follows:

    (2) Research questions
        RQ 1. Is the voicing ratio contrast of word-final obstruents reduced in utterance-final position versus utterance-internal position, in fricatives versus stops, and in posterior obstruents versus anterior ones?
        RQ 2. Is the V/VC duration contrast of word-final obstruents reduced or enhanced in the same contexts?
        RQ 3. When both cues are combined, is the contrast overall more fragile in these contexts?

The following section provides background on the [voice] feature in French and in other true voicing languages (Section 1.2); a separate section is devoted to the results of two preliminary studies using comparable corpora and forced alignment techniques (Section 1.3).

1.2. Background: word-final [voice] in voicing languages

The present study focuses on Standard French, defined as the variety used in the French media and by educated Parisian speakers.3 Languages that allow both (i) word-final stops and (ii) the [voice] contrast in word-final position are relatively infrequent. French meets these conditions by allowing both stops and fricatives to contrast for [voice] at three places of articulation: labial, alveolar and velar for stops (/p-b/, /t-d/, /k-ɡ/), and labial, alveolar and post-alveolar for fricatives (/f-v/, /s-z/, /ʃ-ʒ/). The contrast is attested word-finally by several minimal pairs: râpe [ʁap] / rab [ʁab], rate [ʁat] / rade [ʁad], bac [bak] / bague [baɡ], baffe [baf] / bave [bav], casse [kas] / case [kaz], cache [kaʃ] / cage [kaʒ].

French is particularly interesting for the present study for two additional reasons. First, some varieties of French exhibit patterns of Final Devoicing, notably in Belgian French (Hambye, 2005, p. 89) and in varieties spoken in Northern France, close to Walloon and Flemish-speaking areas (two languages with Final Devoicing, Pooley, 1994; Temple, 2000). Other regional varieties of French have also been reported to show degrees of obstruent devoicing in domain-final position: in Alsace, in contact with German (Montreuil, 2010), in Brittany, in contact with Breton (Chauveau, 1991, p. 141), and in Bordeaux (Temple, 1999). Thus, the study of word-final obstruents in Standard French may facilitate comparisons between close varieties of the same language “before” and after the change.

Second, French is a true voicing language: the series of lenis obstruents involves active voicing and contrasts with short-lag Voice Onset Time (VOT) stops, as opposed to aspirating languages, where passively voiced stops contrast with fortis long-lag VOT stops (Beckman et al., 2013; Jansen, 2004; Keating, 1984). In French, phonation in [+voice] stops has been identified as the primary cue to the [voice] contrast in perception, including in utterance-final position (van Dommelen, 1983). Durational parameters, on the other hand, play a lesser role than obstruent phonation in the [voice] contrast (Kohler et al., 1981; van Dommelen, 1983). This differs from English (an aspirating language), where vowel duration has been argued to be a primary cue for the laryngeal contrast, especially under domain-final lengthening (Klatt, 1976; Laeufer, 1992; Mack, 1982). Given these differing profiles, one might expect voicing and aspirating languages to respond differently to pressures towards word-final neutralisation. In voicing languages, the failure or reduction of glottal pulsation during constriction could be limited, as it jeopardises a central cue of the contrast. Since the laryngeal contrast in word-final position has been less studied in voicing languages than aspirating ones, the present study aims to contribute to these open questions by examining the case of French in detail. Furthermore, recent corpus studies have reassessed the importance of durational cues in running speech: the voicing effect, which is found to be a primary cue to the contrast in English in laboratory experiments, is overall much smaller in spontaneous speech (Morley & Smith, 2023; Tanner, Sonderegger, Stuart-Smith, & Fruehwald, 2020). Investigating comparable corpora of French will help to better characterise the effect of running speech on the realisation of contrast in a typologically different language. For these reasons, the following review of the literature focuses on voicing languages.

1.2.1. Voicing

Word-final [voice] contrasts have received less attention in the literature than word-initial or word-internal ones, particularly in true voicing languages. Previous studies on this question have identified some degree of devoicing associated with the prepausal environment in languages such as Serbian (Sokolović-Perović, 2012), Hungarian (Gósy & Ringen, 2009), Romanian (Hutin et al., 2020), and other examples cited in Blevins (2006). This reduction in voicing for [+voice] obstruents also diminishes the magnitude of the [voice] contrast in Serbian stops: Sokolović-Perović (2012) reports that word-final intervocalic /b d ɡ/ are voiced during 90.5% of their closure, and /p t k/ during 8.13%, a difference of 82.37 percentage points (Cohen’s h = 1.944) in this position; in utterance-final position, /b d ɡ/ voicing drops to 61.84%, and /p t k/ to 6.45%, reducing the difference to 55.39 percentage points (Cohen’s h = 1.30). In French, however, Kohler et al. (1981) find a surprising result: the prenasal /d/ and /ɡ/ in dide noire ([did nwaʁ]5), bagues noires ([baɡ nwaʁ]) exhibit a lower proportion of voicing during closure than the same stops in utterance-final position. /d/ shows 86% voicing before a nasal and 95% before a pause; /ɡ/ shows 77% voicing before a nasal and 87% before a pause. The difference between the proportion of voicing in word-final, prenasal [+voice] and [ – voice] stops is 53 percentage points (Cohen’s h = 1.15), and it increases to 69–72 percentage points (Cohen’s h = 1.62–1.61) in utterance-final position.
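The effect sizes quoted for the Serbian and French proportions can be approximated with the standard definition of Cohen's h, the difference between the arcsine-transformed proportions. The sketch below assumes that this standard formula is the one behind the reported values; the function name `cohens_h` is illustrative.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h for two proportions, each expressed in [0, 1]."""
    phi1 = 2 * math.asin(math.sqrt(p1))
    phi2 = 2 * math.asin(math.sqrt(p2))
    return phi1 - phi2


# Serbian word-final stops (Sokolović-Perović, 2012), proportions as cited above.
h_intervocalic = cohens_h(0.905, 0.0813)      # /b d ɡ/ vs. /p t k/, intervocalic
h_utterance_final = cohens_h(0.6184, 0.0645)  # same classes, utterance-final

print(round(h_intervocalic, 2), round(h_utterance_final, 2))
```

Under this assumption the two values come out close to the reported 1.944 and 1.30; small differences in the last decimals may reflect rounding of the input percentages.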

According to Ohala (1983, 1997), [+voice] fricatives are expected to be more prone to devoicing than [+voice] stops. This is the case in Dutch, where the devoicing of word-initial /v/ is more advanced than word-initial /b/ (Pinget et al., 2019). However, this prediction is not supported in Italian, as well as European Portuguese and German word-initial and intervocalic stops (Pape & Jesus, 2015). In Quebec French, the proportion of voicing in word-final prevocalic [+voice] fricatives is nearly as high as that of [+voice] stops, and fricatives are better distinguished by the proportion of voicing and the V/C duration ratio than stops (Patience & Steele, 2022). Thus, fricatives do not appear to be more susceptible to [voice] neutralisation than stops. Nevertheless, Jacques (1990) finds an effect of the utterance-final position on /z/ and /ʒ/ (see details below). It is important to note that results for Canadian varieties of French do not necessarily extend to Standard French: Caramazza and Yeni-Komshian (1974) showed that speakers of Canadian French use a different set of cues to the [voice] contrast at least for stops.

The prediction of the AVC (Ohala, 2011) – that consonants at posterior places of articulation should exhibit less voicing than anterior ones – is not consistently supported in the literature. For stops, this prediction holds true in Hungarian (in word-internal intervocalic and in utterance-final position, Gósy and Ringen, 2009) and Russian (in intervocalic, but not word-initial position, Ringen and Kulikov, 2012). Conversely, Sokolović-Perović (2012) finds no effect of place of articulation in Serbian (in word-initial, word-final intervocalic, and utterance-final position). Hutin et al. (2020) report that /d/ devoices more than /b/ and /ɡ/ in Romanian before a pause. Additionally, Popescu et al. (2023) find that velar stops do not devoice more than labial and alveolar stops in Italian, Spanish and French, and that they actually devoice less than the other two places of articulation in Romanian and Portuguese, all contexts in the word combined. The results for French are not consistent either. Laeufer (1996) indicates that the velar /ɡ/ is slightly less often fully voiced (68% of the time) compared to /b/ (76% of the time, Cohen’s h for the difference = 0.18) in word-final, sentence-medial position. Kohler et al. (1981) also find that the velar stop is less voiced than /d/ in word-final prenasal and utterance-final positions, as expected from the AVC. On the other hand, Abdelli-Beruh (2009) does not observe any effect of place of articulation on voicing, except in word-initial position after /s/ (/b/ is more voiced than /d/ and /ɡ/); word-final /ɡ/ does not devoice more than /b/ and /d/ in Temple (1999)’s study.

Regarding fricatives, Dutch has been reported to undergo a devoicing process which affects onset intersonorant [+voice] fricatives in the order /ɣ/ > /v/ > /z/, with velar fricatives showing a greater sensitivity to devoicing, consistent with the AVC (Van de Velde & van Hout, 1996). In Quebec French, Jacques (1990) observes that /ʒ/ is devoiced more frequently than /v/, but as often as /z/ in word-final, utterance-internal position; /ʒ/ devoices slightly more than /z/ in utterance-final position. This small place of articulation effect is consistent with the AVC, as /ʒ/ is further from the glottis than /ɡ/, allowing for more passive extension of the oral cavity’s soft tissues, and the distance between /z/ and /ʒ/ is smaller than the distance between /d/ and /ɡ/. On the other hand, Riverin-Coutlée (2020), in her more recent data from Quebec French, finds smaller rates of voicing in utterance-final position, and no significant effect of the place of articulation.

1.2.2. Duration

Regarding durational correlates, the [voice] contrast in French follows the general tendencies outlined in Section 1.1: [+voice] obstruents correlate with a longer preceding vowel, a shorter obstruent constriction, and a higher V/VC ratio (Abdelli-Beruh, 2004; Kohler et al., 1981; Laeufer, 1992). French has been argued to rely more on stop closure duration (or frication duration) than on preceding vowel duration (van Dommelen, 1983). Laeufer (1992) finds that preceding vowel duration fails to distinguish word-final /v-f/ and /z-s/ in sentence-medial, unfocused position. Nevertheless, vowels before word-final obstruents show considerable variation. They can in particular be longer before [+voice] fricatives /v z ʒ ʁ/, known as consonnes allongeantes, than before [+voice] stops (Fouché, 1956; Grégoire, 1911; Delattre, 1951, pp. 10, 17; van Dommelen, 1981). van Dommelen (1981) reports that vowels before [+voice] fricatives are very long (above 250 ms), with no effect of the place of articulation.

The durational correlates of the [voice] contrast exhibit complex interactions under the effect of the utterance-final position. Before a pause, articulatory gestures slow down and segments are longer (Crystal & House, 1988). This lengthening may preferentially affect vowels (as in English, Crystal and House, 1988; Luce and Charles-Luce, 1985, although not in Myers, 2012) or consonants (as in Hebrew, Berkovits, 1993, and Serbian, Sokolović-Perović, 2012), with various possible effects on contrast. In French, Kohler et al. (1981) find that the V/VC ratio of [+voice] stops increases in utterance-final position (the vowel lengthens more than the stop closure), thus reinforcing a cue towards [+voice]. Conversely, it decreases for [ – voice] stops (the stop closure lengthens more than the vowel), thus reinforcing a cue towards [ – voice]. Laeufer (1992) compares the duration correlates of French and English stops and fricatives in word-final position, considering varying focus and sentence positions. She uses a [ – voice]/[+voice] ratio to calculate the contrast in vowel duration and obstruent constriction duration. Her findings indicate that the contrast between the French pairs /d-t/, /ɡ-k/, /z-s/ and /v-f/ increases from sentence-medial unfocused to sentence-final position, with both vowel and consonant duration affected. Fricatives exhibit a larger contrast and contrast enhancement than stops. These enhancements of one or more durational correlates of the [voice] contrast may help compensate for the potential reduction of glottal pulsing contrast, supporting the overall stability of the [voice] distinction.

Finally, the effect of the place of articulation on the durational parameters of the [voice] contrast is unclear. In Serbian, no effect of place of articulation is observed on word-final stop closure duration, both within and at the end of the utterance; however, preceding vowels create a slightly better contrast before /b-p/ than /d-t/ and /ɡ-k/ (Sokolović-Perović, 2012). In French, Kohler et al. (1981) report that the stop contrast is more strongly cued by preceding vowel duration and consonant duration in prepausal positions than in word-final, prenasal contexts. In their results, we can observe a greater V/VC ratio contrast for /d-t/ than /ɡ-k/ both in word-final, prenasal position (the log(V/VC ratio) difference for /Vd/–/Vt/ is 0.23, which is greater than the same difference for /Vɡ/–/Vk/: 0.04) and in utterance-final position (log(V/VC ratio) difference for /Vd/–/Vt/: 0.32, for /Vɡ/–/Vk/: 0.18). Regarding fricatives, the weak voicing ratio contrast between /ʒ-ʃ/ in Quebec French is not compensated by durations in Jacques’s (1990) study, as the post-alveolar pair exhibits the lowest consonant and vowel duration contrasts in utterance-final position compared to the other pairs.

1.2.3. Synthesis

This review of the literature on the [voice] contrast in French, in relation to the contexts where Final Devoicing could begin in (1), presents a different picture for stops and fricatives. Stops do not appear to exhibit a smaller voicing ratio contrast in utterance-final position (including for the velars), while the durational correlates of the contrast are expected to be enhanced. Fricatives, on the other hand, might have a better contrast than stops in word-final, utterance-medial position, but are sensitive to prepausal devoicing, particularly for the post-alveolar pair. Their duration contrasts should be enhanced under utterance-final lengthening. At this stage, it remains unclear whether the overall contrast of either type of obstruent is reduced in prepausal position. The present study aims at quantifying this overall effect in natural speech, including both voicing and V/VC duration ratio.

1.3. Large corpus phonetics

The studies on French reviewed above are based on laboratory experiments with manual segmentation (except for Popescu et al., 2023). Laboratory settings often imply a high degree of speech control among speakers, and the use of manual segmentation limits the size of the datasets. To better reflect real linguistic usage, the present study uses large corpora of natural speech. The ability to reliably segment extensive corpora with automatic speech recognition (ASR) has recently opened new avenues for phonetic research. Recent studies examining large corpora of spontaneous speech have reevaluated (aspects of) the laryngeal contrasts in a number of languages (Danish, Puggaard-Rode et al., 2022; Japanese, Tanner, Sonderegger, and Stuart-Smith, 2020; Glasgow English, Sonderegger et al., 2020; American English, Chodroff and Wilson, 2017). In particular, both absolute and relative durations are shorter in spontaneous speech: Tanner, Sonderegger, Stuart-Smith, and Fruehwald (2020) demonstrate that the voicing effect is overall smaller in their English corpora than in laboratory recordings, and Morley and Smith (2023) argue that preceding vowel duration is not a reliable cue to the laryngeal contrast in English stops in conversational speech. The large amount of data available also makes it possible to test a larger set of predictors at the same time (in our case, 2 cues × 3 contexts).

The present paper builds on two preliminary studies on voicing in large French corpora. Using the method of pronunciation variants,6 Jatteau, Vasilescu, Lamel, and Adda-Decker (2019) show that word-final [+voice] fricatives in French are devoiced in two contexts: before a [ – voice] obstruent, i.e., under regressive assimilation (68% of the time), and before a pause (25% of the time). Jatteau, Vasilescu, Lamel, Adda-Decker, and Audibert (2019) extended this initial approach by including stops, enlarging the corpus and replacing the analysis of variants with an acoustic analysis: devoicing is assessed through the computation of a voicing ratio similar to that used in the present study (Hallé and Adda-Decker, 2007; Snoeren et al., 2006, see Section 2.2 for details). It is important to note that our ASR system segments stops as a whole, including both closure and release intervals (see Figure 2 and the discussion of the implications for the results in Section 2.2). Jatteau, Vasilescu, Lamel, Adda-Decker, and Audibert (2019) report that word-final obstruents have a median voicing ratio of 60% before a pause, compared to 100% in other word-final contexts, except before [ – voice] obstruents. Fricatives exhibit more devoicing than stops in utterance-final position: 27% of the [+voice] fricatives are fully voiced versus 43% of the stops. [+voice] fricatives align with the AVC prediction that posterior obstruents show more devoicing, with word-final /ʒ/ being fully voiced only 14% of the time, versus 29% for /z/ and 43% for /v/. Stops, however, do not show any place-related devoicing effect.
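To make the summary statistics above concrete, the following sketch computes a voicing ratio per token, then the median ratio and the share of fully voiced tokens. It assumes, hypothetically, that each obstruent is represented by a list of per-frame voiced/unvoiced decisions; the actual acoustic procedure is the one described in Section 2.2, and the token values below are invented for illustration.

```python
from statistics import median
from typing import Sequence


def voicing_ratio(voiced_frames: Sequence[bool]) -> float:
    """Share of analysis frames inside the obstruent that are voiced."""
    if not voiced_frames:
        raise ValueError("the obstruent must span at least one frame")
    return sum(voiced_frames) / len(voiced_frames)


# Hypothetical tokens (not corpus data): one list of frame decisions per obstruent.
tokens = [
    [True, True, True, True],            # fully voiced
    [True, True, False, False, False],   # voicing dies out early
    [True, True, True, False],           # partially devoiced
]

ratios = [voicing_ratio(t) for t in tokens]
median_ratio = median(ratios)
# A token counts as fully voiced when every frame is voiced.
fully_voiced_share = sum(r == 1.0 for r in ratios) / len(ratios)

print(ratios, median_ratio, fully_voiced_share)
```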

The present study expands on these findings by shifting the focus to the [voice] contrast, calculated as the difference between [+voice] and [ – voice] obstruents for each parameter under examination. Another important change is that we expand the evaluation of the [voice] feature to include durational measurements, enabling the computation of the V/VC duration ratio. While we anticipate that the results for vowel and obstruent durations will differ from laboratory data, as observed in Tanner, Sonderegger, Stuart-Smith, and Fruehwald (2020) and Morley and Smith (2023), the results for the V/VC ratio may not, as it assesses the proportion occupied by the vowel and the consonant in the rhyme, independently of speech rate. Finally, the relative importance in the [voice] contrast of the two parameters under consideration, voicing ratio and V/VC duration ratio, is compared to evaluate the overall magnitude of the effect.
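As a minimal sketch of the two quantities described in this paragraph, the code below computes the V/VC duration ratio for a rhyme and a contrast value as the difference between the [+voice] and [ – voice] means. The durations are invented for illustration, the function names are hypothetical, and using the mean as the summary statistic is an assumption.

```python
import math
from statistics import mean


def v_vc_ratio(vowel_ms: float, consonant_ms: float) -> float:
    """Vowel duration divided by vowel plus consonant duration in the rhyme."""
    return vowel_ms / (vowel_ms + consonant_ms)


def contrast(voiced_values, voiceless_values) -> float:
    """[+voice] mean minus [-voice] mean for one acoustic parameter."""
    return mean(voiced_values) - mean(voiceless_values)


# Hypothetical (vowel, consonant) durations in ms; not corpus data.
voiced_rhymes = [(120.0, 60.0), (110.0, 70.0)]
voiceless_rhymes = [(90.0, 90.0), (80.0, 100.0)]

voiced_ratios = [v_vc_ratio(v, c) for v, c in voiced_rhymes]
voiceless_ratios = [v_vc_ratio(v, c) for v, c in voiceless_rhymes]
ratio_contrast = contrast(voiced_ratios, voiceless_ratios)

# Scaling both durations by the same factor leaves the ratio unchanged.
slow = v_vc_ratio(120.0, 60.0)
fast = v_vc_ratio(120.0 * 0.8, 60.0 * 0.8)
print(ratio_contrast, math.isclose(slow, fast))
```

The invariance shown at the end holds only when vowel and consonant are compressed by the same factor; if speech rate affects the two segments unequally, the ratio shifts.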

The next section outlines the method (Section 2). The results are developed in Section 3, and discussed in Section 4. Section 5 concludes the study.

2. Methodology

This section presents our methodology based on automatic alignments to explore large speech corpora in French (2.1), along with the acoustic and statistical analyses conducted (2.2).

2.1. Corpora and automatic alignment

Two large corpora of French were used: ESTER (Galliano et al., 2005) and NCCFr (Torreira et al., 2010). The ESTER corpus comprises 90 hours of radio and TV broadcast news, recorded between 1998 and 2003. It includes spontaneous and prompted speech, predominantly by professional speakers. In order to restrict the corpus to Standard French, we removed files from Radio Télévision Maroc and Radio France International (approximately 40 hours in total), which contained many segments of Maghrebine and African French. The NCCFr corpus contains 36 hours of conversation between friends, mainly students, recorded in Paris in 2007.7 Overall, the two corpora include a total of 80 hours of speech.

The corpora were automatically segmented using the LISN (formerly LIMSI) lab’s automatic speech recognition (ASR) system (Gauvain et al., 2002). The LISN aligner is a GMM-HMM system, with monophonic and speaker-independent phone models (Adda-Decker & Lamel, 1999; Vasilescu et al., 2020). Forced alignment takes the manual, orthographic transcription as input, and matches it with the audio files, returning the temporal boundaries of each word and phone in the corpus. The acoustic models of the phonemes were trained on about 250 hours of transcribed broadcast news data. The ASR system segments stops as a whole, including both occlusion and release intervals (see Figure 2 and the discussion in Section 2.2).

From these data, all words ending in a vowel followed by /p, t, k, b, d, ɡ, f, s, ʃ, v, z, ʒ/ were selected. Several types of irrelevant tokens were filtered out: incomplete words, interjections (such as pff, oups), loans and foreign names (such as Gomez, which can be pronounced with a final [z] or [s]), monosyllabic Cə words (such as j’ (je), s’ (se), d’ (de), which are extremely frequent but not phonologically word-final), and words with allomorphs (such as plus, which can be pronounced [ply] or [plys] depending on its meaning).

Very long segments often represent gross misalignments, particularly when overlapping speech occurs from two speakers or during background noise or music. These cases were excluded by removing all words whose final obstruent duration exceeded the mean plus twice the standard deviation. Shorter segments may be problematic too. Since the minimal duration of segments in the automatic aligner is 30 ms, segments shorter than 30 ms or absent are still segmented, with the minimal duration of 30 ms. To reduce noise in our data, all word-final obstruents aligned as 30 ms long were eliminated.
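The two duration filters described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that the mean and standard deviation are pooled over all tokens and that the sample standard deviation is used, neither of which is specified in the text. The function name `filter_durations` is hypothetical.

```python
from statistics import mean, stdev

MIN_ALIGNER_MS = 30  # minimal segment duration imposed by the aligner (from the text)


def filter_durations(durations_ms):
    """Keep final-obstruent durations that pass both filters.

    Assumption: the cutoff is computed over the pooled durations with the
    sample standard deviation; the text does not say how it was grouped.
    """
    upper = mean(durations_ms) + 2 * stdev(durations_ms)
    # Drop tokens above mean + 2 SD (likely misalignments) and tokens
    # sitting at the 30 ms floor (possibly absent or very short segments).
    return [d for d in durations_ms if MIN_ALIGNER_MS < d <= upper]
```

With the invented durations `[30, 40, 50, 60, 70, 80, 90, 100, 110, 500]`, the 30 ms token and the 500 ms token are removed and the other eight are kept.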

Additionally, since schwa has been shown to block the voicing alterations of word-final obstruents (Hutin et al., 2021; Jatteau, Vasilescu, Lamel, Adda-Decker, & Audibert, 2019), we excluded words whose last obstruent was followed by a schwa (e.g., fève aligned as [fɛvə] rather than [fɛv]), which accounted for 13% of the remaining data.

Jatteau, Vasilescu, Lamel, Adda-Decker, and Audibert (2019) show that the voicing rate of word-final, utterance-internal obstruents in French is highly sensitive to the onset of the following word. To avoid regressive voicing assimilation effects (Hallé & Adda-Decker, 2011; Kohler et al., 1981), we selected the presonorant position as a baseline representing the utterance-internal context (e.g., arrive mardi [aʁiv maʁdi]). This configuration is compared to the utterance-final position, when the word-final consonant is followed by a pause, including silences and breaths (e.g., arrive ## [aʁiv]).

This results in a total of 11,808 word-final obstruents. As can be seen in Figure 1, /t/, /k/ and /s/ stand out as the most frequent word-final obstruents in the corpus, and /b/ and /ɡ/ as the least frequent.

Figure 1

Counts of the word-final obstruents in the corpus, in utterance-internal and utterance-final position.

To control for the quality of the segmentation, a subset of 400 tokens was checked manually. It was found that 93% of the automatically assigned boundaries fall within 20 ms of the corresponding manual boundary, which was deemed an acceptable error margin. This threshold is comparable to the average inter-annotator variability of 16 ms reported by Pitt et al. (2005), rounded to 20 ms to take account of the 10 ms resolution of the forced alignment system used. Figure 2 shows some example annotations from the corpus NCCFr.
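The agreement figure reported above amounts to a simple proportion, sketched here under stated assumptions: boundaries are given in milliseconds as paired lists, and "within 20 ms" is taken to include a difference of exactly 20 ms. The function name `agreement_rate` is hypothetical.

```python
def agreement_rate(auto_ms, manual_ms, tol_ms=20):
    """Proportion of automatic boundaries within tol_ms of the manual ones.

    Assumption: the tolerance is inclusive (|auto - manual| <= tol_ms).
    """
    pairs = list(zip(auto_ms, manual_ms))
    if not pairs:
        raise ValueError("at least one boundary pair is required")
    within = sum(1 for a, m in pairs if abs(a - m) <= tol_ms)
    return within / len(pairs)
```

For the invented pairs `auto = [100, 250, 400, 520]` and `manual = [110, 250, 430, 500]`, three of the four boundaries fall within the tolerance, giving 0.75.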

Figure 2

Waveforms and accompanying annotations for word-final /d/ before a sonorant (2a) and before a pause (2b). Top tier represents phonemic transcription, bottom tier contains word transcription.

2.2. Acoustic and statistical analyses

Two correlates of the [voice] contrast were measured: the voicing ratio and the V/VC duration ratio. The proportion of vocal fold vibration during constriction, or voicing ratio, was calculated using the F0 extraction module in Praat (Boersma & Weenink, 2017). Following Snoeren et al. (2006) and Hallé and Adda-Decker (2007), a voicing ratio was extracted, in the 0–1 range, defined as the number of points detected as voiced by Praat’s pitch detection algorithm (Boersma et al., 1993) (with default parameters, using a pitch floor of 75 Hz and a pitch ceiling of 600 Hz), divided by the total number of detection points (20 per consonant). This parameter will be referred to as the v-ratio. The second correlate of the [voice] contrast examined is the V/VC duration ratio, which represents the duration of the vowel preceding the obstruent divided by the duration of the entire rhyme. These durations are directly extracted from the automatic segmentation. As mentioned above, and illustrated in Figure 2, “consonant duration” in our study includes stop closure and release intervals. This differs from previous studies, which usually use the closure interval alone for the consonant duration. Our stop durations are therefore expected to occupy a larger proportion of the rhyme, resulting in a comparatively smaller V/VC ratio. Release duration has been found to cue the [voice] contrast in French, with [ – voice] stops exhibiting both longer closure and longer release durations than [+voice] stops (Abdelli-Beruh, 2004; Flege & Hillenbrand, 1987; van Dommelen, 1983). Thus, stop closure duration and release duration align in the same direction in cuing the contrast, being longer in [ – voice] stops. As a result, in our findings, the [voice] contrast for stops is anticipated to be larger than if closure duration were examined separately.
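As a minimal sketch of the two measures defined above, the functions below compute a v-ratio from a list of per-point voicing decisions and a V/VC ratio from two durations. The function names and the input representation (a list of booleans, durations in milliseconds) are assumptions for illustration; the voicing decisions themselves would come from Praat and are not reproduced here.

```python
def v_ratio(voiced_flags):
    """Proportion of detection points labelled voiced, in the 0-1 range."""
    if not voiced_flags:
        raise ValueError("at least one detection point is required")
    return sum(1 for v in voiced_flags if v) / len(voiced_flags)


def v_vc_ratio(vowel_ms, consonant_ms):
    """Vowel duration divided by the duration of the whole VC rhyme."""
    if vowel_ms <= 0 or consonant_ms <= 0:
        raise ValueError("durations must be positive")
    return vowel_ms / (vowel_ms + consonant_ms)
```

For example, a consonant with 13 of its 20 detection points voiced has a v-ratio of 0.65, and an invented rhyme with a 90 ms vowel and a 60 ms consonant has a V/VC ratio of 0.6. A longer consonant, as when release is included in the stop, lowers the V/VC ratio for the same vowel.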

For the statistical analyses, we adopted a methodology similar to that of Tanner, Sonderegger, and Stuart-Smith (2020), utilising multivariate Bayesian modelling to account for the effect size of the different factors considered on the two correlates of the [voice] contrast, v-ratio and V/VC duration ratio. We used laryngeal feature (phonologically [+/– voice]), context (utterance-internal or final), manner (stop or fricative), and place of articulation (POA: labial, alveolar or back8) as fixed effects, together with the two, three and four-way interactions between fixed effects. The three and four-way interactions between fixed factors allow us to compare combinations of manner, place of articulation and position in the utterance on the [voice] contrast, in order to be able to distinguish the degree of contrast neutralisation through the comparison of [+voice] and [ – voice] on e.g., /ɡ-k/ versus the other obstruents in utterance-final position. In order to take account of individual variations in the various factors and their interactions that can influence voicing implementation, a maximal random structure is used with by-speaker random slopes for each of the four fixed factors and their interaction. In addition, a by-word random slope is taken into account for the effect of position in the utterance, considered as the predictor most likely to be word-dependent among the fixed factors considered. Correlation components between random slopes and random intercepts are included in the model to account for the possible correlation between v-ratio and V/VC ratio.

Since v-ratio values are bounded between 0 and 1, a regression model based on an unbounded distribution would have been unsuitable. The part of the model corresponding to v-ratio values was fitted using an ordered beta regression model (Kubinec, 2023), shown as more efficient than alternative models such as zero-one-inflated beta regression used for instance by DiCanio et al. (2022) for modelling bounded data likely to include a significant proportion of values corresponding to the lower bound and/or the upper bound. The general principle of ordered beta regression is to model the proportion of zeros and ones separately, while the continuous distribution of intermediate values is modelled by beta regression. Regarding V/VC ratio values, since we are mostly interested in differences between categories, raw values were log-transformed before being modelled. The part of the model dedicated to log-transformed V/VC ratio values was fitted using a skew normal distribution to account for the asymmetry introduced by the log transformation.

A multivariate model was then fitted using the Stan programming language (Carpenter et al., 2017) and the brms (Bürkner, 2021) package in R, as the combination of both regression models and correlation terms between intercepts and slopes for speakers and words. Default weakly informative priors were used for the ordered beta regression model fitted on v-ratio values and for the skew-normal regression model fitted on V/VC ratio values. Following Vasishth et al. (2018), a LKJ Cholesky covariance prior with η = 2 was used for correlations between intercepts and slopes instead of the default value (η = 1) in order to give lower prior probabilities to extreme correlations. The model used 16000 samples across 4 Markov chains since the default 2000 samples per chain were not sufficient to achieve convergence.

Similarly, the durations of the vowel V0 and the target consonant C1 were jointly modelled by a separate multivariate regression model taking into account the correlation between these durations, using the same fixed effects, interactions and random structure. Due to the 10 ms temporal resolution of the forced alignment used and the elimination of segments detected with a minimum duration of 30 ms, the distribution of these durations is bounded by the lower value 40 ms, resulting in a more pronounced right-skewness than is usually observed for duration data. To account for this skewness, the duration data was log-transformed before being modelled using a Gaussian distribution. As for the first model, default weakly informative priors were used, and a LKJ Cholesky covariance prior with η = 2 was used for correlations between intercepts and slopes. The model used 4000 samples across 4 Markov chains. The results for consonant duration and vowel duration will be reported only when they contribute to the discussion.

The R code used to fit models, extract information from posterior distributions and plot figures included in the article is available in an OSF repository with the anonymized version of the dataset used in analyses.9 Both models converged, with an R̂ value of 1. Posterior predictive check plots are included as supplementary material in the OSF repository.

Because of the complexity of the interactions between the four fixed factors in the two multivariate models used, considering the effect of the factors and interactions would only provide a partial answer to our research questions. In order to provide a more direct answer, the analysis of the results was based on the calculation of marginal effects. The general principle is to summarise the effect of a predictor, or of a combination of predictors when marginalisation is applied to an interaction, while keeping the values of the other predictors constant (Sonderegger, 2023, p. 113). In a frequentist framework, particularly that of linear mixed effects regression models, marginal effects are commonly calculated in R using the emmeans function in package emmeans (Lenth, 2024), which offers a wide range of methods for obtaining predicted values and specific contrasts. Although less widely used, these techniques can also be applied to Bayesian models fitted with brms to obtain posterior distributions of the dependent variables for a range of values taken by the predictors considered after marginalising the other predictors. Moreover, objects created by emmeans can be used as a basis to get direct access to values drawn from the posterior distributions (called draws in the Bayesian modelling framework) through the function gather_emmeans_draws of package tidybayes (Kay, 2023). In order to allow the predicted values to be rescaled at the level of each draw when necessary for comparison of effects magnitudes between dependent variables (Section 3.1), we used this last method and extracted the draws from the posterior distributions obtained for the two-way, three-way or four-way interactions. This was done separately for each of the two dependent variables in each multivariate model (v-ratio and log-transformed V/VC ratio on the one hand, and log-transformed V0 duration and phone duration on the other). 
The differences between the values of certain predictors, and in particular the difference between [+voice] and [ – voice] for evaluating the effect of voicing contrast and the comparison between conditions for this effect, were calculated from these draws using the compare_levels function in the tidybayes package. For example, to evaluate the effect of position in the utterance on voicing contrast as measured by the v-ratio presented in Section 3.1.1, the effect of voicing is calculated for each draw as the difference between [+voice] and [ – voice], and the effect of position is calculated as the difference between internal and final position on these draw-wise differences. For each of the combinations of predictor values and/or contrasts between predictor values considered, the median value over all the draws thus obtained and the 95% credible interval (hereafter CrI) estimated by the highest density interval (HDI) were extracted using the point_interval function in the ggdist package (Kay, 2024). Comparisons between values of predictors were calculated using the same scale as that used for modelling, i.e., log-transformed values for V/VC ratio, V0 duration and phone duration, and the original scale for v-ratio. To generate figures relating the distribution of values in the data to the marginal values predicted by the models, the predicted values of V/VC ratio were exponentiated to backtransform them to the original scale.
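The draw-wise logic described above can be illustrated independently of R. The sketch below is not the authors' pipeline, which relies on emmeans, tidybayes and ggdist: it assumes the posterior draws are already available as plain lists of equal length, one per condition, and the helper names `hdi`, `position_effect` and `summarise` are hypothetical. The highest density interval is computed as the narrowest window of sorted draws, which is appropriate only for unimodal posteriors.

```python
import math
from statistics import median


def hdi(draws, mass=0.95):
    """Return the narrowest interval holding `mass` of the draws.

    Suitable for unimodal posteriors only; the small tolerance guards
    against floating-point overshoot when computing the window size.
    """
    xs = sorted(draws)
    n = len(xs)
    k = max(1, math.ceil(mass * n - 1e-9))  # number of draws inside the interval
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def position_effect(voiced_int, voiceless_int, voiced_fin, voiceless_fin):
    """Draw-wise change in the [voice] contrast from internal to final position."""
    contrast_int = [a - b for a, b in zip(voiced_int, voiceless_int)]
    contrast_fin = [a - b for a, b in zip(voiced_fin, voiceless_fin)]
    return [i - f for i, f in zip(contrast_int, contrast_fin)]


def summarise(draws):
    """Median and 95% HDI of a list of draws."""
    low, high = hdi(draws)
    return median(draws), low, high
```

Because the differences are taken within each draw before summarising, the interval reflects the posterior uncertainty of the contrast itself, which is not the same as subtracting two separately summarised medians.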

3. Results

This section presents the results for word-final presonorant and prepausal obstruents10 concerning the v-ratio (Section 3.1), the V/VC ratio (Section 3.2), and the normalised magnitude of the effect for the combination of these two measures (Section 3.3). The main findings are summarised in Table 9 (Section 3.4). For each case (v-ratio, V/VC duration ratio, and overall effect of the two), the results are presented with increasing granularity: we first examine the effect of position in the utterance (Sections 3.1.1 and 3.2.1), followed by the effect of manner and position in the utterance (Sections 3.1.2 and 3.2.2), and finally the effect of the place of articulation for stops (Sections 3.1.3 and 3.2.3) and fricatives (Sections 3.1.4 and 3.2.4) as a function of position in the utterance.

Following the recommendations of Nicenboim and Vasishth (2016), used e.g., by Tanner, Sonderegger, and Stuart-Smith (2020), we consider evidence of an effect to be strong when the 95% credible interval (CrI) does not include 0, and weak when 0 is included in the 95% CrI and the probability of the effect not changing direction (“Pr(β<>0)”) is at least 95%. In many instances where strong evidence is associated with an effect, the probability of no change of direction is 100%; thus, we report this probability only when weak evidence of the effect is observed. In addition to the strength of evidence for the effects, we also assess their magnitude, estimated from the median value for each of the contrasts considered. The magnitudes of the effects are therefore expressed in the scale used for model fitting: the original 0–1 scale for v-ratio, and scaled log-transformed values for V/VC duration ratio.
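The decision rule stated above can be written out as a short sketch. It assumes that the draws of a contrast and the bounds of its 95% CrI are already available. The function name `evidence` and the label "none" for the remaining case are illustrative choices, not terms used in the article.

```python
def evidence(draws, cri_low, cri_high, threshold=0.95):
    """Classify the evidence for an effect from posterior draws of a contrast.

    Returns a label and the probability that the effect has its dominant sign.
    """
    n = len(draws)
    p_pos = sum(1 for d in draws if d > 0) / n
    p_neg = sum(1 for d in draws if d < 0) / n
    p_direction = max(p_pos, p_neg)
    if cri_low > 0 or cri_high < 0:
        return "strong", p_direction  # the 95% CrI excludes 0
    if p_direction >= threshold:
        return "weak", p_direction    # CrI includes 0, but the sign is consistent
    return "none", p_direction
```

For instance, an invented set of 100 draws of which 96 are positive, with an interval of [–0.01, 0.15], is classified as weak evidence with a direction probability of 0.96.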

3.1. V-ratio

3.1.1. Effect of the position in the utterance

Figure 3 illustrates the distribution of the v-ratio of [+voice] and [ – voice] obstruents in presonorant and prepausal positions, and Table 1 provides the predicted median values, corresponding to the blue dots in Figure 3, as well as the magnitude of the contrast expressed as the voiced-unvoiced difference. It can be seen here and in the similar tables and figures presented in the following sections that the predicted median v-ratio values are more central (i.e., closer to 0.5) than the median values in the original data. This is mainly due to the fact that, by definition, the values in the posterior distributions cannot be larger than the upper bound or smaller than the lower bound.

Table 1

Predicted values of the median v-ratio and v-ratio contrast, with the 95% credible intervals, as a function of position in the utterance.

             Internal            Final
[+voice]     0.86 [0.84, 0.89]   0.66 [0.63, 0.70]
[ – voice]   0.31 [0.30, 0.33]   0.25 [0.24, 0.27]
Contrast     0.55 [0.52, 0.57]   0.41 [0.37, 0.45]

Figure 3

Obstruents v-ratio values as well as predicted median values and 95% credible intervals of the v-ratio (blue dot and line within each box) as a function of position in the utterance.

The v-ratio is, as expected, higher for the [+voice] obstruents compared to the [ – voice] ones, both in presonorant and prepausal position. [ – voice] obstruents are not completely voiceless, due to some carry-over voicing from the preceding vowel (Westbury & Keating, 1986). A clear effect of utterance-final position is observed: the v-ratio median of the [+voice] obstruents is lower in final position. This devoicing effect corresponds to a reduced contrast magnitude across positions (est. = 0.14, 95% CrI = [0.10, 0.18]).

3.1.2. Effect of manner and position

Figure 4 illustrates the distribution of the v-ratio for stops and fricatives in both utterance-internal and -final positions, and Table 2 presents the predicted medians for each category. Fricatives exhibit a lower v-ratio than stops in both the [+voice] and [ – voice] categories, as well as in internal and final positions. All four categories of obstruents are partially devoiced in prepausal position compared to the presonorant position. The effect of position on [+voice] fricatives is comparable to that observed in [+voice] stops (comparison presonorant/prepausal position and [+voice] stops/[+voice] fricatives: est. –0.01, 95% CrI = [–0.06, 0.05]). To compare with the results in the literature, the v-ratio differences derived from our data can be expressed as Cohen’s h values: h = 0.47 for the effect of position on [+voice] stops; h = 0.41 for the effect of position on [+voice] fricatives. This prepausal devoicing of stops differs from the findings of Kohler et al. (1981), who found more devoicing in prenasal than in prepausal position: they reported that [+voice] stops were voiced for 77%–86% of their closure duration in word-final position before a nasal, and 87%–95% before a pause (Cohen’s h = 0.26–0.32). However, our results align with those of Sokolović-Perović (2012) for Serbian [+voice] stops (90.5% voicing in word-final, utterance-internal position and 62% voicing in utterance-final position, Cohen’s h = 0.70). Jacques (1990) also finds that Quebec French [+voice] fricatives are less voiced in utterance-final position (28%–64% voicing before a pause) than in utterance-internal position (45%–77%, all following contexts pooled together, Cohen’s h = 0.36–0.27).
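For readers who wish to check the Cohen's h values quoted above, the standard definition is the difference between the arcsine-transformed proportions, h = 2·arcsin(√p1) − 2·arcsin(√p2). The sketch below applies it to the percentages cited from the literature; the function name `cohens_h` is our own.

```python
import math


def cohens_h(p1, p2):
    """Cohen's h for two proportions (each between 0 and 1)."""
    for p in (p1, p2):
        if not 0 <= p <= 1:
            raise ValueError("proportions must lie between 0 and 1")
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))
```

Applied to the Serbian figures (`cohens_h(0.905, 0.62)`), this gives approximately 0.70, and applied to the lower bounds reported by Kohler et al. (`cohens_h(0.87, 0.77)`) approximately 0.26, matching the values given in the text. The sign depends on the order of the arguments.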

Figure 4

Obstruents v-ratio values as well as predicted median values and 95% credible intervals of the v-ratio (blue dot and line within each box) as a function of manner and position in the utterance.

Table 2

Predicted values of the median v-ratio and v-ratio contrast, with the 95% credible interval as a function of manner and position in the utterance.

             Stops                                   Fricatives
             Internal            Final               Internal            Final
[+voice]     0.92 [0.89, 0.95]   0.73 [0.67, 0.79]   0.81 [0.78, 0.84]   0.60 [0.56, 0.64]
[ – voice]   0.38 [0.36, 0.40]   0.28 [0.26, 0.30]   0.25 [0.23, 0.27]   0.23 [0.21, 0.24]
Contrast     0.54 [0.51, 0.57]   0.45 [0.39, 0.51]   0.56 [0.53, 0.59]   0.37 [0.33, 0.41]

Turning now to the size of the [voice] contrast, a slightly different picture emerges. The [voice] contrast for the v-ratio is comparable for word-final stops and fricatives in presonorant position (comparison stops/fricatives: est. –0.02, 95% CrI = [–0.06, 0.02]), but the fricative contrast is smaller in prepausal position (est. 0.08, 95% CrI = [0.01, 0.14]). This finding differs from Patience and Steele’s (2022) study of Quebec French, which reported a larger contrast for word-final prevocalic fricatives than for stops. Although both stops and fricatives exhibit a reduced contrast in prepausal position (effect of position on stops: est. 0.09, 95% CrI = [0.03, 0.15]; on fricatives: est. 0.19, 95% CrI = [0.15, 0.23]), fricatives demonstrate a stronger positional effect (comparison stops/fricative contrast across positions: est. –0.10, 95% CrI = [–0.17, –0.03]).

3.1.3. Effect of POA and position in the utterance for stops

Figure 5 illustrates the distribution of the v-ratio for stops depending on their place of articulation and position in the utterance, and Table 3 provides the corresponding predicted medians and v-ratio contrasts. As is visible from Figure 5, the results do not clearly support the prediction that velar [+voice] stops are less voiced than the other [+voice] stops. /ɡ/ shows a similar proportion to /b/ in both positions; there is weak evidence that it is less voiced than /d/ in presonorant position, and that it is more voiced than /d/ in prepausal position. All six stops show lower proportions of voicing in utterance-final position.

Table 3

Predicted values of the stops median v-ratio and v-ratio contrast, with the 95% credible intervals, as a function of place of articulation and position in the utterance.

          Internal            Final
/b/       0.92 [0.8, 0.96]    0.73 [0.64, 0.82]
/p/       0.45 [0.41, 0.49]   0.30 [0.25, 0.34]
/b-p/     0.47 [0.41, 0.53]   0.43 [0.33, 0.53]
/d/       0.95 [0.92, 0.97]   0.68 [0.60, 0.75]
/t/       0.36 [0.34, 0.38]   0.28 [0.26, 0.30]
/d-t/     0.58 [0.55, 0.61]   0.40 [0.32, 0.47]
/ɡ/       0.90 [0.83, 0.95]   0.79 [0.69, 0.87]
/k/       0.33 [0.31, 0.35]   0.26 [0.25, 0.28]
/ɡ-k/     0.57 [0.50, 0.62]   0.52 [0.43, 0.61]

Figure 5

Stops v-ratio values as well as predicted median values and 95% credible intervals of the v-ratio (blue dot and line within each box) as a function of place of articulation and position in the utterance.

Examining the magnitude of the contrast between [+voice] and [ – voice] stops in Table 3, we can see that in presonorant position, the /ɡ-k/ contrast is larger than that of /b-p/ (/b-p/–/ɡ-k/ comparison: est. = –0.10, 95% CrI = [–0.18, –0.02]) and comparable to /d-t/ (/d-t/–/ɡ-k/ comparison: est. = 0.02, 95% CrI = [–0.05, 0.08]). In utterance-final position, the /ɡ-k/ contrast is comparable to /b-p/, and larger than /d-t/ (/b-p/–/ɡ-k/ comparison: est. = –0.09, 95% CrI = [–0.22, 0.05]; /d-t/–/ɡ-k/ comparison: est. = –0.13, 95% CrI = [–0.24, 0.00]). The /b-p/ pair is the least differentiated in presonorant position, while /d-t/ is the least differentiated in prepausal position. The /ɡ-k/ contrast is stable across positions (est. 0.04, 95% CrI = [–0.06, 0.15]), as is the /b-p/ one (est: 0.03, 95% CrI = [–0.07, 0.15]); only the /d-t/ contrast shows a reduction effect (est. 0.19, 95% CrI = [0.11, 0.26]).

3.1.4. Effect of POA and position in the utterance for fricatives

Turning now to fricatives, Figure 6 shows the distribution of the v-ratio in the three places of articulation, and Table 4 provides the corresponding predicted medians and contrasts. Analysis of these data reveals that within the [+voice] category, /ʒ/ has the smallest proportion of voicing, followed by /z/ and then /v/. This pattern holds true in both utterance-internal and -final position. The hierarchy of /v/ > /z/ > /ʒ/ was also observed in Quebec French by Jacques (1990) in utterance-internal and utterance-final positions.

Figure 6

Fricatives v-ratio values as well as predicted median values and 95% credible intervals of the v-ratio (blue dot and line within each box) as a function of place of articulation and position in the utterance.

Table 4

Predicted values of the fricatives median v-ratio and v-ratio contrast, with the 95% credible intervals, as a function of place of articulation and position in the utterance.

          Internal            Final
/v/       0.95 [0.93, 0.97]   0.72 [0.65, 0.79]
/f/       0.30 [0.27, 0.34]   0.26 [0.23, 0.28]
/v-f/     0.65 [0.61, 0.69]   0.47 [0.39, 0.54]
/z/       0.82 [0.77, 0.86]   0.58 [0.52, 0.63]
/s/       0.20 [0.19, 0.22]   0.18 [0.16, 0.19]
/z-s/     0.62 [0.57, 0.67]   0.40 [0.35, 0.45]
/ʒ/       0.66 [0.60, 0.71]   0.49 [0.44, 0.54]
/ʃ/       0.25 [0.22, 0.28]   0.25 [0.22, 0.28]
/ʒ-ʃ/     0.41 [0.35, 0.47]   0.24 [0.19, 0.30]

Post-alveolar fricatives also exhibit the weakest contrast both in presonorant and prepausal positions (in presonorant position, /v-f/–/ʒ-ʃ/ comparison: est. 0.24, 95% CrI = [0.17, 0.32], /z-s/–/ʒ-ʃ/: est. = 0.21, 95% CrI = [0.13, 0.28]; in prepausal position, /v-f/–/ʒ-ʃ/: est. = 0.22, 95% CrI = [0.14, 0.31], /z-s/–/ʒ-ʃ/: est. = 0.16, 95% CrI = [0.09, 0.23]). The hierarchy in both positions is /v-f/ = /z-s/ > /ʒ-ʃ/. Finally, all three pairs demonstrate a comparable contrast reduction before a pause (effect of position on /v-f/: est: 0.19, 95% CrI = [0.11, 0.26]; on /z-s/: est. 0.22, 95% CrI = [0.15, 0.28]; on /ʒ-ʃ/: est. 0.17, 95% CrI = [0.10, 0.23]).

3.2. V/VC duration ratio

This section reports the results of the analysis for the V/VC duration ratio, to answer RQ2 in (2). As explained in Section 2.2, a separate multivariate model was fitted for preceding vowel duration and consonant duration. The corresponding results are presented only when relevant for the discussion.

3.2.1. Effect of the position in the utterance

Figure 7 displays the predicted median duration of the vowel and the consonant for both [+voice] and [ – voice] obstruents in the VC# rhyme, depending on their position in the utterance, along with the predicted V/VC duration ratio. Note that the predictions of the V/VC ratio do not exactly match the ratio computed from the predicted vowel duration and consonant durations, as the predicted values of vowel and consonant duration are obtained from a separate model. Table 5 presents the logarithmic values of the predicted contrasts for the V/VC ratio. As expected, the V/VC ratio is higher for [+voice] obstruents than for [ – voice] ones. This is because vowels are longer before [+voice] consonants, and [ – voice] consonants are longer than the [+voice] ones. Additionally, [+voice] obstruents in our data exhibit a higher V/VC ratio in prepausal than in presonorant position: although both vowel and consonant durations lengthen before a pause, the vowel lengthens proportionally more than the consonant. This pattern does not hold for [ – voice] obstruents, whose V/VC ratio remains stable across positions. This results in a stable [voice] contrast for the V/VC duration ratio across positions (effect of position: est. –0.02, 95% CrI = [–0.05, 0.01]).

Figure 7

Predicted median values of vowel duration, consonant duration and V/VC duration ratio as a function of laryngeal feature and position in the utterance.

Table 5

Predicted median values of the log(V/VC ratio) contrast between [+voice] and [ – voice] and 95% credible intervals for all obstruents as a function of position in the utterance.

Internal Final
0.16 [0.13, 0.18] 0.18 [0.15, 0.20]

3.2.2. Effect of manner and position in the utterance

Figure 8 illustrates the duration of vowel + stop and vowel + fricative in the VC# rhymes, along with their predicted V/VC ratio; Table 6 provides the logarithmic values of the V/VC ratio contrast. In presonorant position, fricatives exhibit a lower V/VC ratio than stops. While stops show no credible effect of position, the V/VC ratio of fricatives increases in prepausal position (albeit only weakly so for [ – voice] fricatives), resulting in similar ratios for [+voice] stops and [+voice] fricatives before a pause.

Table 6

Predicted median values of the log(V/VC ratio) contrast between [+voice] and [ – voice] and 95% credible intervals as a function of manner and position in the utterance.

             Internal            Final
Stops        0.13 [0.09, 0.17]   0.15 [0.10, 0.19]
Fricatives   0.18 [0.15, 0.21]   0.21 [0.18, 0.24]

Figure 8

Predicted median values of vowel duration, consonant duration and V/VC duration ratio as a function of laryngeal feature, manner, and position in the utterance.

The V/VC ratio better signals the [voice] contrast for fricatives than for stops, in both presonorant position (comparison stops/fricatives: est. –0.05, 95% CrI = [–0.10, –0.01]) and final position (est. –0.06, 95% CrI = [–0.11, –0.01]). The V/VC ratio contrast remains stable across positions for both stops (est. = –0.02, 95% CrI = [–0.07, 0.04]) and fricatives (est. = –0.03, 95% CrI = [–0.06, 0.01]). The details of this (absence of) credible effect shed an interesting light on the traditional view that [+voice] fricatives trigger more vowel lengthening than [+voice] stops in French (consonnes allongeantes; Delattre, 1951, pp. 10, 17; van Dommelen, 1981). In our dataset, vowels before [+voice] fricatives are long indeed, and they are particularly lengthened in utterance-final position, where they reach the predicted median duration of 133 ms. But vowels are longer before [+voice] fricatives than before [+voice] stops only in prepausal position (difference in presonorant position, est. –0.04, 95% CrI = [–0.12, 0.03]; in prepausal position, est. –0.13, 95% CrI = [–0.21, –0.06]): the effect of the consonnes allongeantes /v, z, ʒ/ applies only before a pause. Additionally, this difference does not help to signal the [voice] contrast in the following fricative: since vowels are also longer before [ – voice] fricatives than before [ – voice] stops (in both positions), there is no evidence for a different [voice] contrast in preceding vowel duration for fricatives and for stops (comparison of the vowel duration contrast before stops versus fricatives in presonorant position: est. 0.05, 95% CrI = [–0.04, 0.14]; in prepausal position: est. –0.03, 95% CrI = [–0.12, 0.05]). In our data, the strength of the V/VC ratio contrast for fricatives is primarily due to the consonant duration contrast, which is larger for fricatives than for stops in both positions.

3.2.3. Effect of POA and position in the utterance for stops

Figure 9 displays the durations of the vowel and consonant in the VC rhyme for word-final stops, along with their predicted V/VC duration ratio and 95% credible intervals; Table 7 provides the voiced–voiceless log(V/VC ratio) difference. There is no credible effect of the place of articulation on the [+voice] stops in either position, nor of position for any of the three [+voice] stops. This suggests that the vowel and stop are lengthened in comparable proportions at the end of the utterance for each place of articulation. The V/VC ratio of /p/ and /k/ is enhanced in prepausal position (the vowel lengthens more than the stop), while /t/ does not show any credible effect of position.

Figure 9

Predicted median values of vowel duration, consonant duration and V/VC duration ratio for stops as a function of place of articulation and position in the utterance.

Table 7

Predicted median values of the log(V/VC ratio) contrast between [+voice] and [ – voice] and 95% credible intervals for stops across place of articulation and position in the utterance.

Internal Final
/b-p/ 0.14 [0.07, 0.22] 0.08 [–0.01, 0.16]
/d-t/ 0.13 [0.09, 0.17] 0.17 [0.13, 0.23]
/ɡ-k/ 0.12 [0.06, 0.19] 0.18 [0.10, 0.27]

In presonorant position, there is no credible effect of the place of articulation on the magnitude of the contrast (/b-p/–/ɡ-k/ comparison: est. = 0.02, 95% CrI = [–0.08, 0.12]; /d-t/–/ɡ-k/ comparison: est. = 0.00, 95% CrI = [–0.07, 0.08]). In prepausal position, no difference is observed between /ɡ-k/ and /d-t/ (/d-t/–/ɡ-k/ comparison: est. = –0.01, 95% CrI = [–0.10, 0.09]), and there is weak evidence that /ɡ-k/ shows a larger contrast than /b-p/ (comparison /b-p/–/ɡ-k/: est. = –0.10, 95% CrI = [–0.22, 0.02], Pr(β<>0) = 95%). The /d-t/ contrast is the only one enhanced before a pause, although the evidence is weak (est. –0.05, 95% CrI = [–0.10, 0.01], Pr(β<>0) = 96%), while /b-p/ and /ɡ-k/ remain stable across positions (effect of position on /b-p/: est. 0.06, 95% CrI = [–0.04, 0.16]; on /ɡ-k/: est. –0.06, 95% CrI = [–0.15, 0.04]). There is only weak evidence that the V/VC duration ratio supports the /b-p/ contrast before a pause. Examining the details of the V/VC ratio contrast, we find that consonant duration fails to support the contrast for /b-p/ before a pause. Preceding vowel duration supports the [voice] contrast for all three pairs of stops.

3.2.4. Effect of POA and position in the utterance for fricatives

Turning now to fricatives, Figure 10 represents the durations of the VC rhyme along with their predicted V/VC duration ratio, and Table 8 provides the log(V/VC ratio) contrast. The fricatives /ʒ/ and /z/ exhibit overall the lowest V/VC duration ratio. Nevertheless, /ʒ-ʃ/ demonstrates a larger contrast than /z-s/ in both positions (comparison /z-s/–/ʒ-ʃ/ in presonorant position: est. –0.13, 95% CrI = [–0.19, –0.07]; in prepausal position, est. –0.13, 95% CrI = [–0.19, –0.06]). The /ʒ-ʃ/ contrast is comparable to /v-f/ in presonorant position (comparison /v-f/–/ʒ-ʃ/: est. –0.01, 95% CrI = [–0.09, 0.06]) and there is weak evidence that it is smaller than /v-f/ in prepausal position (est. = 0.07, 95% CrI = [–0.01, 0.15], Pr(β<>0) = 96%). There is no evidence that vowel duration supports the /v-f/ and /z-s/ contrasts in presonorant position. Consonant duration consistently supports the contrast. The labial V/VC ratio contrast is enhanced before a pause (est. –0.08, 95% CrI = [–0.15, –0.01]), while the other two contrasts remain stable across positions (/z-s/: est. –0.01, 95% CrI = [–0.05, 0.03]; /ʒ-ʃ/: est. 0.00, 95% CrI = [–0.07, 0.06]). In both positions, /z-s/ is the pair that the V/VC ratio distinguishes least.

Figure 10

Predicted median values of vowel duration, consonant duration and V/VC duration ratio for fricatives as a function of place of articulation and position in the utterance.

Table 8

Predicted median values of the log(V/VC ratio) contrast and 95% credible intervals for fricatives as a function of place of articulation and position in the utterance.

| Pair | Internal | Final |
|---|---|---|
| /v-f/ | 0.22 [0.16, 0.27] | 0.30 [0.24, 0.35] |
| /z-s/ | 0.10 [0.07, 0.13] | 0.10 [0.07, 0.14] |
| /ʒ-ʃ/ | 0.23 [0.18, 0.28] | 0.23 [0.18, 0.28] |
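
As a reading aid, the position effects and pairwise comparisons reported in the text for fricatives can be approximately recovered from the medians in Table 8. The sketch below assumes that the effect of position is expressed as internal minus final, and that a comparison A–B is the contrast of A minus the contrast of B; both conventions are inferred from the signs of the reported estimates. Differences of medians only approximate the medians of per-draw differences, so small discrepancies are expected (for instance 0.00 here against the reported –0.01 for /z-s/). The function names and dictionary keys are illustrative only.

```python
# Median log(V/VC ratio) contrasts copied from Table 8 (fricatives).
# "zh-sh" is an ASCII stand-in for the post-alveolar pair.
TABLE_8 = {
    "v-f": {"internal": 0.22, "final": 0.30},
    "z-s": {"internal": 0.10, "final": 0.10},
    "zh-sh": {"internal": 0.23, "final": 0.23},
}


def position_effect(pair):
    """Internal minus final contrast (assumed sign convention)."""
    row = TABLE_8[pair]
    return round(row["internal"] - row["final"], 2)


def pair_comparison(pair_a, pair_b, position):
    """Contrast of pair_a minus contrast of pair_b in one position."""
    return round(TABLE_8[pair_a][position] - TABLE_8[pair_b][position], 2)


if __name__ == "__main__":
    for pair in TABLE_8:
        print(pair, "effect of position:", position_effect(pair))
    print("z-s vs zh-sh, internal:", pair_comparison("z-s", "zh-sh", "internal"))
    print("v-f vs zh-sh, final:", pair_comparison("v-f", "zh-sh", "final"))
```

Under these assumptions, the negative value obtained for /v-f/ corresponds to a contrast that is larger before a pause than in presonorant position.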

3.3. Combination of the v-ratio and V/VC ratio contrasts

These results enable us to address our last research question (RQ3 in (2)): when both the v-ratio and the V/VC duration ratio are combined, is the overall contrast more fragile in prepausal position, in fricatives, and in back obstruents? To quantify the relative importance of the two cues in the implementation of the [voice] contrast, we consider the magnitude of the effect of the voiced-unvoiced difference on the v-ratio, the V/VC duration ratio, as well as the magnitude of the overall effect on the two cues considered together. In order to obtain comparable measures of effect magnitude between the two measures under consideration, the values of each draw are standardised on the basis of the mean and standard deviation of the corresponding measure (either v-ratio or log(V/VC ratio)) in the original data. This produces a set of standardised draws for each of the two measures. Comparisons made on the basis of these standardised draws thus give rise to differences expressed on a standardised scale, in which the unit corresponds to one standard deviation. We refer to these differences in standardised units, which can be interpreted in a similar way to Cohen’s d, as effect magnitudes. To obtain the overall effect magnitude, the standardised values of each draw are combined by taking the mean of the two measures, before making comparisons between the values of the different predictors in the same way as with the standardised or non-standardised draws for each of the two measures. This method enables the estimation of overall effect magnitudes, which are also expressed in standardised units.
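
The standardisation and combination steps described above can be sketched as follows. This is a minimal illustration written for this passage, not the R code used for the analyses: the function names, the toy draws, and the data means and standard deviations are all hypothetical, and the draws are summarised with a median on the assumption that estimates are posterior medians.

```python
from statistics import median


def standardise(draws, data_mean, data_sd):
    """Express posterior draws in SD units of the original data."""
    return [(x - data_mean) / data_sd for x in draws]


def effect_magnitudes(vr, dur, vr_stats, dur_stats):
    """Return (v-ratio, log(V/VC), overall) effect magnitudes.

    vr and dur map 'voiced' / 'voiceless' to per-draw predictions;
    vr_stats and dur_stats are (mean, sd) of each measure in the data.
    """
    z_vr = {k: standardise(v, *vr_stats) for k, v in vr.items()}
    z_dur = {k: standardise(v, *dur_stats) for k, v in dur.items()}
    # Overall measure: per-draw mean of the two standardised measures.
    z_all = {
        k: [(a + b) / 2 for a, b in zip(z_vr[k], z_dur[k])] for k in z_vr
    }

    def contrast(z):
        # Voiced minus voiceless difference per draw, summarised by the median.
        return median(v - u for v, u in zip(z["voiced"], z["voiceless"]))

    return contrast(z_vr), contrast(z_dur), contrast(z_all)


if __name__ == "__main__":
    # Toy values only (three draws per condition); not taken from the study.
    vr = {"voiced": [0.75, 0.875, 0.625], "voiceless": [0.25, 0.25, 0.25]}
    dur = {"voiced": [-0.5, -0.5, -0.5], "voiceless": [-1.0, -0.75, -1.25]}
    print(effect_magnitudes(vr, dur, (0.5, 0.25), (-0.75, 0.5)))
```

Because the two measures are averaged draw by draw before the comparison is made, the overall magnitude in each draw is the mean of the two per-measure magnitudes, which keeps it on the same standardised scale.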

As illustrated by the leftmost two boxes in Figure 11, the v-ratio serves as a more important acoustic correlate of [voice] than the V/VC ratio in French. In Section 3.1.1, we observed that the v-ratio contrast decreases in utterance-final position, and in Section 3.2.1, that the V/VC duration ratio contrast remains stable in utterance-final position. This leads to a reduced overall contrast in prepausal position (est. 0.91, 95% CrI = [0.83, 0.98]) compared to the presonorant position (est. 1.06, 95% CrI = [1, 1.11]; effect of position: est. 0.15, 95% CrI = [0.07, 0.23]).

Figure 11

Normalised magnitude of the effect of the v-ratio contrast (left box) and the scaled log(V/VC) ratio contrast (central box), as well as their overall effect (right box), as a function of position in the utterance.

The effect of manner on the normalised magnitude of the v-ratio and V/VC ratio contrasts is shown in Figure 12. The leftmost two boxes of Figure 12 confirm the results of Sections 3.1.2 and 3.2.2: stops exhibit a reduced v-ratio contrast in prepausal position, while the V/VC duration contrast remains stable across positions. This results in an overall stability of the stop [voice] contrast across positions (est. 0.09, 95% CrI = [–0.04, 0.22]). Fricatives show a comparable pattern, but with a larger v-ratio contrast loss in prepausal position. The result is a reduced overall effect before a pause (effect of position: est. 0.20, 95% CrI = [0.12, 0.30]). In presonorant position, fricatives have a v-ratio contrast comparable to that of stops, along with a larger V/VC duration contrast. The overall contrast of [+voice] and [ – voice] fricatives is stronger than that of stops (comparison stops/fricatives in presonorant position: est. –0.13, 95% CrI = [–0.23, –0.02]). In prepausal position, the overall contrast reduction for fricatives results in a magnitude comparable to that of stops (comparison stops/fricatives: est. –0.02, 95% CrI = [–0.15, 0.13]). This indicates that when both cues are considered together, the fricatives’ [voice] contrast is not closer to neutralisation than that of stops, even before a pause.

Figure 12

Normalised magnitude of the effect of the v-ratio contrast (left box) and the scaled log(V/VC) ratio contrast (central box), as well as their overall effect (right box), as a function of manner and position in the utterance.

Turning now to the effect of the place of articulation in stops, Figure 13 summarises the results of Sections 3.1.3 and 3.2.3 in the two boxes on the left. For the v-ratio contrast, the scale was /ɡ-k/ > /b-p/ and /ɡ-k/ = /d-t/ in presonorant position, while in prepausal position, it was /ɡ-k/ = /b-p/ and /ɡ-k/ > /d-t/. Regarding the V/VC ratio contrast, there was no credible effect of place of articulation in presonorant position, and, in prepausal position, there was weak evidence for the /ɡ-k/ contrast being larger than /b-p/, and no credible evidence that it was different from /d-t/. This results in a null overall effect of POA in presonorant position (comparison /b-p/–/ɡ-k/: est. –0.11, 95% CrI = [–0.33, 0.13]; /d-t/–/ɡ-k/: est. 0.03, 95% CrI = [–0.15, 0.21]). In prepausal position, the /ɡ-k/ contrast is equivalent to /d-t/, and there is weak evidence that it is larger than /b-p/ (comparison /b-p/–/ɡ-k/: est. –0.32, 95% CrI = [–0.61, 0.00], Pr(β<>0) = 98%; /d-t/–/ɡ-k/: est. –0.18, 95% CrI = [–0.45, 0.06]). The labial and alveolar pairs show comparable contrast magnitudes (est. –0.14, 95% CrI = [–0.41, 0.11]). It is not the case, then, that /ɡ-k/ exhibits a weaker [voice] contrast than the other places of articulation in our data; there is little effect of the POA on the overall contrast. Concerning the effect of position, the /ɡ-k/ contrast remains stable before a pause, as does the /b-p/ contrast, while the /d-t/ contrast is reduced (effect of position on /b-p/: est. 0.16, 95% CrI = [–0.08, 0.42]; on /d-t/: est. 0.16, 95% CrI = [0.01, 0.31]; on /ɡ-k/: est. –0.05, 95% CrI = [–0.29, 0.19]).

Figure 13

Normalised magnitude of the effect of the v-ratio contrast (left box) and the scaled log(V/VC) ratio contrast (central box), as well as their overall effect (right box), as a function of place of articulation (Alv. = Alveolar) and position in the utterance.

The effect of the place of articulation is different for fricatives. The leftmost box in Figure 14 illustrates the hierarchy /v-f/ = /z-s/ > /ʒ-ʃ/ for the v-ratio contrast in both positions. The central box indicates that /z-s/ exhibits the weakest V/VC ratio contrast in both positions; there is no evidence for a difference between /v-f/ and /ʒ-ʃ/ in presonorant position, and weak evidence that the /v-f/ V/VC ratio contrast is larger than the /ʒ-ʃ/ one in prepausal position. The combination of these trends, as shown in the rightmost box, results in the hierarchy /v-f/ > /z-s/ = /ʒ-ʃ/ in both positions (in presonorant position, comparison /v-f/–/ʒ-ʃ/: est. 0.30, 95% CrI = [0.12, 0.48]; /z-s/–/ʒ-ʃ/: est. 0.03, 95% CrI = [–0.13, 0.18]; in prepausal position, /v-f/–/ʒ-ʃ/: est. 0.43, 95% CrI = [0.24, 0.63]; /z-s/–/ʒ-ʃ/: est. –0.03, 95% CrI = [–0.19, 0.13]). For one pair, /v-f/, the loss of v-ratio contrast before a pause is compensated by an enhancement of the V/VC ratio contrast, resulting in an overall contrast stability across positions (effect of position on /v-f/: est. 0.10, 95% CrI = [–0.08, 0.27]). This is not the case for /z-s/ and /ʒ-ʃ/, which exhibit an overall contrast reduction before a pause (effect of position on /z-s/: est. 0.28, 95% CrI = [0.17, 0.40]; on /ʒ-ʃ/: est. 0.23, 95% CrI = [0.08, 0.39]).

Figure 14

Normalised magnitude of the effect of the v-ratio contrast (left box) and the scaled log(V/VC) ratio contrast (central box), as well as their overall effect (right box), as a function of place of articulation (Alv. = Alveolar) and position in the utterance.

3.4. Summary of the findings

Table 9 summarises the main findings of the study.

Table 9

Summary of the findings. ‘#R’ designates the word-final, presonorant position, ‘##’ the word-final, prepausal position, and ‘*position’ the effect of position (from presonorant to prepausal). Stop and fricative in the corresponding lines stand for stop contrast and fricative contrast. POA stands for place of articulation. The symbols ≤ and ≥ are used to signal weak evidence for an effect, while > and < signal strong evidence.

| Factor | Context | V-ratio | V/VC dur. ratio | Overall effect |
|---|---|---|---|---|
| Position | | Contrast reduced | Contrast stable | Contrast reduced |
| Manner | #R | Stop = fricative | Stop < fricative | Stop < fricative |
| | ## | Stop > fricative | Stop < fricative | Stop = fricative |
| | *position | Stop contrast reduced | Stop contrast stable | Stop contrast stable |
| | | Fricative contrast reduced | Fricative contrast stable | Fricative contrast reduced |
| POA, stops | #R | /b-p/ < /d-t/ = /ɡ-k/ | /b-p/ = /d-t/ = /ɡ-k/ | /b-p/ = /d-t/ = /ɡ-k/ |
| | ## | /b-p/ = /d-t/, /b-p/ = /ɡ-k/ and /d-t/ < /ɡ-k/ | /b-p/ < /d-t/, /b-p/ ≤ /ɡ-k/ and /d-t/ = /ɡ-k/ | /b-p/ = /d-t/, /b-p/ ≤ /ɡ-k/ and /d-t/ = /ɡ-k/ |
| | *position | /b-p/ contrast stable | /b-p/ contrast stable | /b-p/ contrast stable |
| | | /d-t/ contrast reduced | /d-t/ contrast enhanced (weak evidence) | /d-t/ contrast reduced |
| | | /ɡ-k/ contrast stable | /ɡ-k/ contrast stable | /ɡ-k/ contrast stable |
| POA, fric. | #R | /v-f/ = /z-s/ > /ʒ-ʃ/ | /v-f/ = /ʒ-ʃ/ > /z-s/ | /v-f/ > /z-s/ = /ʒ-ʃ/ |
| | ## | /v-f/ = /z-s/ > /ʒ-ʃ/ | /v-f/ ≥ /ʒ-ʃ/ > /z-s/ | /v-f/ > /z-s/ = /ʒ-ʃ/ |
| | *position | /v-f/ contrast reduced | /v-f/ contrast enhanced | /v-f/ contrast stable |
| | | /z-s/ contrast reduced | /z-s/ contrast stable | /z-s/ contrast reduced |
| | | /ʒ-ʃ/ contrast reduced | /ʒ-ʃ/ contrast stable | /ʒ-ʃ/ contrast reduced |

4. Discussion

This paper investigated the phonetic precursors of Final Devoicing in word-final French obstruents. We argued that, to identify the contexts likely to induce the first step of the change, it is necessary to consider the size of the contrast across different cues, rather than focusing on the behaviour of [+voice] obstruents alone. In this perspective, we examined two acoustic cues of the [voice] contrast, the voicing ratio and the V/VC duration ratio, in almost 12,000 word-final obstruents of contemporary French, under the effect of three contexts identified in the literature as more likely to host the early stages of the change towards Final Devoicing: utterance-final position, fricatives, and posterior obstruents. The present study also innovated in its methodology, employing large corpora of natural speech with automatic segmentation and extraction of acoustic information, in line with recent studies of [voice] in various languages (e.g., Hutin et al., 2020; Morley and Smith, 2023; Popescu et al., 2023; Sonderegger et al., 2020; Tanner, Sonderegger, and Stuart-Smith, 2020; Tanner, Sonderegger, Stuart-Smith, and Fruehwald, 2020). Before discussing our research questions in Section 4.2, and the limits of the study in Section 4.3, we first compare in Section 4.1 the characteristics of the [voice] contrast that emerge from this methodology with the findings of previous laboratory studies on French.

4.1. Contemporary French [voice] in large corpora

The properties of the [voice] contrast which appear in our automatically segmented data follow the general tendencies identified in previous literature. The main difference is the overall shorter durations, which are consistent with the higher speech rate typically observed in casual conversation and among professional speakers in radio and TV. We are also able to define more precisely the rule of vowel lengthening before [+voice] fricatives.

We first confirm that [+voice] obstruents differ from [ – voice] ones by exhibiting a higher v-ratio (that is, a larger proportion of the consonant is voiced) and a longer preceding vowel, consistent with earlier studies (Abdelli-Beruh, 2004, 2009; Delattre, 1962; Kohler et al., 1981; Laeufer, 1992). Additionally, [+voice] fricatives are shorter than [ – voice] ones, and stop + release duration follows the same pattern. Furthermore, the comparison of the normalised magnitude of the v-ratio and the V/VC duration ratio contrasts in Figure 11 confirms the greater importance of the v-ratio as a cue for [voice] in French, as expected in a true voicing language. Interestingly, our results allow us to qualify the finding that consonant duration in French is more important than vowel duration in the composition of the [voice] contrast (Laeufer, 1992). In our data, this holds true for fricatives: while consonant duration supports all three fricative contrasts in both positions, vowel duration fails to distinguish /v-f/ and /z-s/ in presonorant position. Laeufer (1992) found very similar results: in her data, vowel duration fails to support the /v-f/ and the /z-s/ contrast in word-final, sentence-medial unfocused context. However, it is not true for stops, when stop duration encompasses both closure and release intervals. Consonant duration fails to support the /b-p/ contrast before a pause, while vowel duration consistently supports the [voice] contrast in stops. The V/VC ratio proved to be contrastive for all pairs of obstruents (although there was only weak evidence that it supported the /b-p/ contrast before a pause), suggesting that it serves as a better correlate of the [voice] contrast overall than vowel duration or consonant duration considered independently.
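
For reference, the two measures discussed in this section can be sketched as follows. The sketch assumes that the v-ratio is computed from frame-level voicing decisions within the consonant, that the V/VC ratio is vowel duration divided by the duration of the whole VC rhyme, and that the logarithm is natural; the function names and the input format are hypothetical and do not describe the extraction pipeline used for the corpora.

```python
import math


def v_ratio(voiced_frames):
    """Proportion of the consonant's frames that are voiced.

    voiced_frames: sequence of booleans, one per analysis frame
    (a hypothetical input format).
    """
    return sum(voiced_frames) / len(voiced_frames)


def log_v_vc_ratio(vowel_dur, consonant_dur):
    """Natural log of vowel duration over VC rhyme duration (assumed)."""
    return math.log(vowel_dur / (vowel_dur + consonant_dur))


if __name__ == "__main__":
    # Toy token: three of four consonant frames voiced; vowel and
    # consonant of equal duration (values are illustrative only).
    print(v_ratio([True, True, True, False]))
    print(log_v_vc_ratio(100.0, 100.0))
```

On this definition, a longer preceding vowel or a shorter consonant both raise the V/VC ratio, which is why the ratio can remain contrastive even when only one of the two durations distinguishes a given pair.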

Overall, our duration measurements are close to or slightly shorter than those reported by Kohler et al. (1981) and Abdelli-Beruh (2004) for preceding vowel duration, and notably shorter than those found by Laeufer (1992) for fricative duration and preceding vowel duration, as well as those reported by van Dommelen (1981) for fricative duration. We also observe equal or shorter durations for complete stops compared to the closure duration alone in previous studies, alongside a smaller prepausal lengthening effect and smaller duration contrasts than documented in earlier research. These findings suggest that our corpora reflect the higher speech rate typical of conversational and broadcast speech. In English, investigations of large corpora of natural speech have led to a reevaluation of the importance of the voicing effect: the longer duration of vowels before [+voice] obstruents appears to depend on the dialect (Tanner, Sonderegger, Stuart-Smith, & Fruehwald, 2020), and Morley and Smith (2023) argue that it actually does not reliably cue the laryngeal contrast in conversational speech. Our results also point to a reassessment of the findings based on laboratory data regarding the rule of vowel lengthening before [+voice] fricatives in French, traditionally referred to as consonnes allongeantes (Grégoire, 1911; Delattre, 1951, pp. 10, 17; Fouché, 1956; van Dommelen, 1981). In our data, vowels are indeed longer before [+voice] fricatives, although they “only” reach the predicted duration of 133 ms, against 223 ms for Laeufer (1992) and more than 250 ms for van Dommelen (1981). This is in line again with the characteristics of natural speech: the lengthening effect of [+voice] fricatives is observed, but at a smaller scale than in laboratory-elicited data. More importantly, the effect of the consonnes allongeantes applies in our data only before a pause: in presonorant position, vowels are not longer before [+voice] fricatives than before [+voice] stops.
Furthermore, it does not result in a larger vowel duration contrast for fricatives: before a pause, the duration of the vowel does not cue the [voice] contrast better for fricatives than for stops. This result differs from Laeufer (1992), who found that vowel duration better supported the [voice] contrast before word-final fricatives than before word-final stops (in both utterance-medial and utterance-final contexts). Thus, what makes the V/VC ratio contrast for fricatives stronger compared to stops in our data (in both positions) is not vowel duration, but rather consonant duration.

4.2. The phonetic precursors of Final Devoicing

The literature on Final Devoicing places the main source of the change in the contexts where phonation is most difficult to maintain (see (1)): in utterance-final position, fricatives, and posterior places of articulation. Our first research question addressed whether these devoicing effects are indeed found in our corpora, and whether they correspond to a reduction of the [voice] contrast (RQ1). The results confirm that [+voice] obstruents are devoiced under the effect of the prepausal position, and that this corresponds to a reduction of the [voice] contrast. Examining fricatives alongside stops highlights the importance of assessing contrast rather than focusing on [+voice] tokens alone. In presonorant position, [+voice] fricatives show a smaller proportion of voicing than [+voice] stops, as anticipated due to the conflicting requirements they impose on voicing (Ohala, 1997). Nevertheless, the v-ratio contrast for fricatives is as substantial as that of stops in this position. It is only before a pause that the v-ratio contrast for fricatives diminishes in comparison to stops. In this position, [+voice] fricatives are voiced for only about 60% of their duration, a finding consistent with Jacques’s (1990) measurements for Quebec French. Stops, on the other hand, show a less dramatic effect of position.

The AVC (Ohala, 2011) predicts that posterior consonants should exhibit less voicing than those at other places of articulation, due to their smaller oral cavity. This effect is clearly observed in fricatives, which show a v-ratio hierarchy of /v/ > /z/ > /ʒ/, reflected in the contrast hierarchy /v-f/ = /z-s/ > /ʒ-ʃ/ in both positions. Comparable patterns have also been identified in Quebec French (Jacques, 1990) and English (Haggard, 1978). However, the oral cavity and its potential for soft tissue extension is not much smaller for the post-alveolar /ʒ/ compared to the alveolar /z/. Consequently, the lower v-ratio contrast we observe in word-final /ʒ-ʃ/ may actually exceed the predictions of the AVC. Another factor which could contribute to this result is consonant duration: in our data, /v/ is shorter than /z/ which is in turn shorter than /ʒ/. This longer duration makes it more likely that the intra-oral pressure necessary to maintain vibration will diminish before the end of the constriction in posterior consonants (see Haggard, 1978 for discussion).

Stops, on the other hand, do not exhibit the expected effect of the AVC: the voicing of /ɡ/ is comparable to that of /b/ in both positions, slightly smaller than that of /d/ in presonorant position, and larger than that of /d/ in prepausal position, contrary to the /b/ > /ɡ/ hierarchy found by Laeufer (1996) in word-final prevocalic contexts and to the /d/ > /ɡ/ hierarchy found by Kohler et al. (1981) in utterance-final position. The /ɡ-k/ v-ratio contrast is among the largest in both positions. /d-t/ is the only pair whose v-ratio contrast is reduced before a pause, and the stop pairs with the lowest v-ratio contrast before a pause are /d-t/ and /b-p/. This failure of velar stops to meet the predictions of the AVC aligns with findings in other true voicing languages (e.g., Serbian in Sokolović-Perović, 2012; Romanian in Hutin et al., 2020; see also Popescu et al., 2023, comparing five Romance languages). It is conceivable that the quality of the laryngeal contrast plays a role here, as the expected effect of the posterior place of articulation has been observed in aspirating languages such as English (Laeufer, 1996) and German (Jessen, 1998). However, other factors could play a role in this result. Temple (2000) notes that in the oïl region of the Atlas Linguistique de France (1902–10), some devoicing patterns are lexically conditioned: certain lexical items, such as malade [malad], show higher degrees of devoicing. Pooley (1994) also reports that, for Roubaix speakers (North of France), higher rates of devoicing may be linked to lexical frequency. Further investigation is needed to assess the influence of these factors on the results. In any case, the general variation observed across languages suggests that the AVC is not the only factor affecting place-related devoicing rates.

Is the V/VC duration contrast reduced or enhanced in the same contexts (RQ2)? Our results indicate that the V/VC ratio contrast remains stable under the effect of prepausal lengthening. Fricatives exhibit a lower V/VC duration ratio than stops, as the duration of fricative constriction exceeds that of stop closure plus release intervals, and occupies a larger proportion of the VC rhyme (in spite of vowels being longer before fricatives). However, the V/VC duration ratio distinguishes [+voice] from [ – voice] fricatives more effectively than it does for stops in both positions within the utterance. The complex pattern of word-final vowel and consonant lengthening before a pause balances out: for both manners, the V/VC duration ratio contrast remains stable between [+voice] and [ – voice] at the end of the utterance. This shows that the long vowels found before prepausal fricatives do not help to maintain the contrast (see Section 4.1). Finally, regarding the effect of place of articulation, velar stops demonstrate a V/VC ratio contrast comparable to that of the other stop pairs in both utterance positions (only slightly larger than /b-p/ before a pause). The /b-p/ pair is the weakest one before a pause in terms of V/VC ratio. Within fricatives, the pair with the weakest V/VC ratio contrast is /z-s/.

Our last research question addressed the overall contrast (RQ3): when combining the v-ratio and the V/VC ratio, is the contrast more fragile in the three contexts outlined in (1)? First, the comparison of the v-ratio and the V/VC ratio suggests a degree of trade-off between the two cues, at the level of the phoneme pairs (see Table 9). For example, the /d-t/ contrast exhibits a v-ratio contrast that is the only one reduced in final position among stops, and a V/VC ratio contrast that is the only one that increases in final position (the other two pairs of stops do not show any effect of position). No consonant pair demonstrates a reduction or enhancement in both the voicing ratio and the V/VC duration contrasts. A comparable trade-off relationship between the proportion of word- or phrase-final obstruent voicing and the duration of the preceding vowel has been found in different varieties of English (e.g., Klatt, 1976; Purnell et al., 2005). The increase of the V/VC duration ratio contrast could be seen as an adaptive response to the weakening of the v-ratio cue (Kirby, 2013).

Another interesting result is that the expected weakness of the fricative [voice] contrast does not manifest in our data. In word-final, presonorant position, fricatives exhibit a stronger overall contrast than stops, and an equally strong one before a pause. If Final Devoicing originates in prepausal position, as our findings suggest, there is no reason to expect it to target fricatives before stops. This observation could help explain the absence of a typological bias towards Final Devoicing of fricatives discussed in Section 1.1: there would be no more languages neutralising the laryngeal contrast of word-final fricatives than of stops because the fricative overall [voice] contrast is not weaker than that of stops; it is actually larger in word-final, presonorant position.

Finally, the examination of the effect of place of articulation reveals that, among stops, the weakest pair is not the velar one: /d-t/ is the pair most affected by prepausal devoicing, and /b-p/ and /d-t/ are the weakest pairs before a pause, and thus the stop pairs closest to neutralisation in that position. The expected effect is better reflected in fricatives, as the weakest pairs before a pause are /z-s/ and /ʒ-ʃ/.

To summarise, our study suggests that the precursors of Final Devoicing in French should be situated in a partially different set of contexts than those identified in (1):

    (3) Contexts in which the overall [voice] contrast is reduced
        a. In prepausal position
        b. No preference for fricatives over stops
        c. Alveolar and post-alveolar fricatives; labial and alveolar stops.

In other words, we predict that, if Final Devoicing were to develop in this variety of French, it would start at the end of utterances, target both stops and fricatives at the same time, possibly beginning at different places of articulation in each set of obstruents.

The comparison of our results with the findings in other languages suggests that at least part of the effects we observe are language-specific. This study focused on a true voicing language, with the hypothesis that the differences in laryngeal settings between aspirating and voicing languages might influence the sensitivity to word-final laryngeal neutralisation (see Section 1.2). When compared to the literature, the results gathered here do not support this hypothesis: French does not necessarily display the same effects as other voicing languages, and may in some respects be more comparable to aspirating languages. For instance, utterance-final lengthening affects fricative duration more than preceding vowel duration in (voicing) Hebrew VC# rhymes (Berkovits, 1993), while the reverse is true in French (the V/VC ratio of /v/ and /ʒ/ is enhanced before a pause – the other fricatives do not show any effect of position). On the other hand, the French vowel lengthening before prepausal [+voice] fricatives recalls an aspirating language such as English, where vowel duration has been found to be particularly salient before fricatives (Klatt, 1976, most recently Morley and Smith, 2023 – although in French, it does not result in a larger voicing effect for fricatives than stops). Interestingly, Jansen (2004, p. 79) points out that from a typological perspective, voicing and aspirating languages are equally likely to undergo (phonological) Final Devoicing: for example, German (aspirating) and Dutch (voicing) exhibit Final Devoicing, while English (aspirating) and Standard French (voicing) do not. This suggests that the contrast is difficult to maintain word-finally independently of the broader language category in terms of voicing type, consistent with the notion that laryngeal contrasts are cued by a cluster of features which may vary from language to language in ways which are partially independent from the aspirating/voicing categorisation.

4.3. Limitations of the study

This paper examined the role of the voicing ratio and the V/VC duration ratio as cues to the [voice] contrast in French. While these are important cues, a more comprehensive assessment of our research question should encompass a larger set of parameters, such as release noise intensity and duration (Abdelli-Beruh, 2004; Flege & Hillenbrand, 1987; Kohler et al., 1981). A separate alignment of the closure and release phases of the stops would also facilitate comparisons with previous studies, which typically focus on the closure phase only. This alignment would also help disentangle the proportion of voicing during both closure and release, as it is unclear whether release voicing is a cue to [voice] in French.

Moreover, this study focused on the acoustics of the [voice] contrast in French. However, the relationship between the acoustic realisation of a sound and its perception is not straightforward. van Dommelen (1983) demonstrates that while voicing is a major cue in French, it can be overridden by other cues depending on the context. Thus, it is possible that perceptually, the role of the V/VC duration ratio is not as important as it is in the acoustics; misperception might occur even though the V/VC ratio remains robust. Note that in his perception studies on English word-final obstruents, Myers (2012) finds that the listeners tend to misperceive final fricatives as voiceless, while no effect is observed for stops. Although he attributes this specific result to the weaker voicing cues expected in fricatives, it is conceivable that a bias towards voiceless fricatives could emerge in perception.

Finally, many instances of phonologised Final Devoicing have been shown to retain subtle traces of the contrast (Baroni and Vanelli, 2000; Charles-Luce, 1985; Dmitrieva, 2005; Warner et al., 2004, among others). These traces are usually found within the durational parameters of the contrast. This suggests that the change towards neutralisation may be driven by a reduction of the vocal fold vibration cue, while durational cues resist the complete neutralisation. Such findings underscore the need for further research into the perception of the phonetic precursors of Final Devoicing.

5. Conclusion

This paper investigated the phonetic precursors of Final Devoicing, in the hope of shedding light on the mismatches between the phonetic predictions and the actual patterns of Final Devoicing observed in the typology. We have argued that the source of neutralisation patterns should not be sought solely in [+voice] consonants, but rather in the magnitude of the [voice] contrast: a reduction of an acoustic parameter in [+voice] obstruents does not necessarily imply a corresponding reduction of the contrast itself for this parameter. We have also advocated for the inclusion of different cues to the [voice] contrast when assessing contrast reduction or enhancement. We observe a trend towards a trading relationship between the two cues we selected, the voicing ratio and the V/VC duration ratio: when one is reduced, the other tends to be enhanced. In French, fricatives partially compensate for the loss of voicing before a pause by a weak enhancement of the V/VC duration ratio. This suggests a strategy to reinforce the contrast in prepausal position, which is the position where the change towards Final Devoicing might first emerge. This results in a comparable robustness of stop and fricative overall contrasts before a pause, which could help explain the unexpected absence of manner asymmetry in the typology of Final Devoicing patterns.

Additional Files

The R code used for model fitting, extracting details from posterior distributions, and creating the figures presented in the article is available, along with the anonymised dataset used for the analyses, in the OSF repository at https://osf.io/r75ts/.

Notes

  1. Throughout this paper, voicing refers to vocal fold vibration, and [voice] to the phonological feature. Final Devoicing with capital initials refers to the phonological synchronic pattern, and voicing/devoicing refers to the variable phonation in the acoustics.
  2. Presumably in anticipation of the inspiratory gesture (Löfqvist, 1975; Westbury & Keating, 1986).
  3. See Section 2.1 for more details.
  4. We report the magnitude of differences between proportions using Cohen’s h (Cohen, 1988), defined as the difference between the arcsine transformations of the two proportions, whose values can be interpreted analogously to those of Cohen’s d. Using this measure of effect size is preferable to using the raw difference between proportions, which underestimates the importance of differences between proportions close to the extreme values of 0% or 100%.
  5. Dide is a nonce word.
  6. In this method, the ASR system determines, for each obstruent, whether it aligns more closely with the phone model of the [+voice] or [ – voice] member of the pair, independently of its lexical representation (Adda-Decker & Lamel, 2000).
  7. The effect of the corpus is not included in the present paper. A preliminary study found that the corpus did not affect the proportion of voicing of word-final [+voice] obstruents (Jatteau, Vasilescu, Lamel, Adda-Decker, & Audibert, 2019).
  8. We coded velar stops and post-alveolar fricatives with the same label back, although their places of articulation are distinct. This is possible because the results of place of articulation are investigated separately for stops and fricatives (see 3.1.3 and 3.1.4 for the v-ratio, and 3.2.3 and 3.2.4 for the V/VC duration ratio), so that the two places of articulation are not actually conflated in the analyses.
  9. Available at https://osf.io/r75ts/; see the section Additional files at the end of the paper.
  10. The word-final, presonorant position is labelled internal in the tables and figures, standing for utterance-internal position. The word-final, prepausal position is labelled final in the tables and figures, standing for utterance-final position.
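
As a worked illustration of the effect size described in note 4, Cohen’s h (Cohen, 1988) for two proportions p_1 and p_2 can be written as below. The proportions 0.50 and 0.25 are hypothetical values chosen for easy arithmetic; they are not results from this study.

```latex
\[
h = 2\arcsin\sqrt{p_1} - 2\arcsin\sqrt{p_2}
\]
% Hypothetical example: p_1 = 0.50, p_2 = 0.25
\[
h = 2\arcsin\sqrt{0.50} - 2\arcsin\sqrt{0.25}
  = \frac{\pi}{2} - \frac{\pi}{3}
  = \frac{\pi}{6} \approx 0.52
\]
```

On Cohen’s conventional benchmarks (0.2 small, 0.5 medium, 0.8 large), this hypothetical difference would count as roughly a medium effect.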

Acknowledgements

We thank one anonymous reviewer and Morgan Sonderegger for their thorough and constructive reviews. We are also grateful to Anisia Popescu for useful insights on an earlier version of the paper. All remaining errors are ours. This research was partially supported by the ANR grant DIPVAR (ANR-21-CE38-0019).

Competing Interests

The authors have no competing interests to declare.

Author Contributions

AJ designed the study and wrote the main text of the paper. NA performed the statistical analysis and wrote the parts of the paper related to it. LL segmented the corpora. All authors contributed to manuscript revision, and read and approved the submitted version.

References

Abdelli-Beruh, N. (2004). The stop voicing contrast in French sentences: Contextual sensitivity of vowel duration, closure duration, voice onset time, stop release and closure voicing. Phonetica, 61, 201–219.

Abdelli-Beruh, N. (2009). Influence of place of articulation on some acoustic correlates of the stop voicing contrast in Parisian French. Journal of Phonetics, 37, 66–78.

Adda-Decker, M., & Lamel, L. (1999). Pronunciation variants across system configuration, language and speaking style. Speech Communication, 29(2–4), 83–98.

Adda-Decker, M., & Lamel, L. (2000). The use of lexica in automatic speech recognition. In F. Eynde & D. Gibbon (Eds.), Lexicon development for speech and language processing (pp. 235–266). Springer.

Baroni, M., & Vanelli, L. (2000). The relationship between vowel length and consonantal voicing in Friulian. In L. Repetti (Ed.), Phonological theory and the dialects of Italy (pp. 13–44). John Benjamins.

Barry, W. J. (1979). Complex encoding in word-final voiced and voiceless stops. Phonetica, 36(6), 361–372.

Beckman, J. N., Jessen, M., & Ringen, C. (2013). Empirical evidence for laryngeal features: Aspirating vs. true voice languages. Journal of Linguistics, 49(2), 259–284.

Beguš, G. (2020). Estimating historical probabilities of natural and unnatural processes. Phonology, 37(4), 515–549.

Berkovits, R. (1993). Progressive utterance-final lengthening in syllables with final fricatives. Language and Speech, 36(1), 89–98.

Bermúdez-Otero, R. (2015). Amphichronic explanations and the Life Cycle of phonological processes. In J. C. Salmons & P. Honeybone (Eds.), The Oxford Handbook of Historical Phonology (pp. 374–399). Oxford University Press.

Blevins, J. (2006). Theoretical synopsis of Evolutionary Phonology. Theoretical Linguistics, 32(2), 117–166.

Boersma, P., et al. (1993). Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound. Proceedings of the Institute of Phonetic Sciences, 17, 97–110.

Boersma, P., & Weenink, D. (2017). Praat: Doing phonetics by computer. http://www.praat.org.

Broselow, E. (2018). Laryngeal contrasts in second language phonology. In L. M. Hyman & F. Plank (Eds.), Phonological Typology (pp. 312–340). Walter de Gruyter.

Bürkner, P.-C. (2021). Bayesian item response modeling in R with brms and Stan. Journal of Statistical Software, 100(5), 1–54.

Caramazza, A., & Yeni-Komshian, G. H. (1974). Voice Onset Time in two French dialects. Journal of Phonetics, 2(3), 239–245.

Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P., & Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1), 1–32. https://www.jstatsoft.org/index.php/jss/article/view/v076i01

Charles-Luce, J. (1985). Word-final devoicing in German: Effects of phonetic and sentential contexts. Journal of Phonetics, 13(3), 309–324.

Chauveau, J.-P. (1991). Aspects de la conscience linguistique dans le centre de la Bretagne. In J.-C. Bouvier & C. Martel (Eds.), Les Français et leurs langues (pp. 135–162). Publications de l’Université de Provence.

Chen, M. (1970). Vowel length variation as a function of the voicing of the consonant environment. Phonetica, 22(3), 129–159.

Chodroff, E., & Wilson, C. (2017). Structure in talker-specific phonetic realization: Covariation of stop consonant VOT in American English. Journal of Phonetics, 61, 30–47.

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Routledge.

Coretta, S. (2020). Vowel duration and consonant voicing: A production study [Doctoral dissertation, University of Manchester]. University of Manchester. https://research.manchester.ac.uk/en/studentTheses/vowel-duration-and-consonant-voicing-a-production-study

Crystal, T. H., & House, A. S. (1988). Segmental durations in connected-speech signals: Current results. The Journal of the Acoustical Society of America, 83(4), 1553–1573.

Delattre, P. (1951). Principes de phonétique française à l’usage des étudiants anglo-américains. École française d’été, Middlebury College.

Delattre, P. (1962). Some factors of vowel duration and their cross-linguistic validity. The Journal of the Acoustical Society of America, 34(8), 1141–1143.

Denes, P. (1955). Effect of duration on the perception of voicing. The Journal of the Acoustical Society of America, 27(4), 761–764.

DiCanio, C., Chen, W.-R., Benn, J., Amith, J. D., & García, R. C. (2022). Extreme stop allophony in Mixtec spontaneous speech: Data, word prosody, and modelling. Journal of Phonetics, 92, 101147.

Dmitrieva, O. (2005). Final devoicing in Russian: Acoustic evidence of incomplete neutralization. Journal of the Acoustical Society of America.

Flege, J. E., & Hillenbrand, J. (1987). A differential effect of release bursts on the stop voicing judgments of native French and English listeners. Journal of Phonetics, 15(2), 203–208.

Fouché, P. (1956). Traité de prononciation française. Klincksieck.

Galliano, S., Geoffrois, E., Mostefa, D., Choukri, K., Bonastre, J.-F., & Gravier, G. (2005). The ESTER Phase II Evaluation Campaign for the Rich Transcription of French Broadcast News. Proceedings of ISCA Interspeech, Lisbon.

Gauvain, J.-L., Lamel, L., & Adda, G. (2002). The LIMSI broadcast news transcription system. Speech Communication, 37(1–2), 89–108.

Gósy, M., & Ringen, C. (2009). Everything You Always Wanted to Know About VOT in Hungarian [Paper presented at the 9th International Conference on the Structure of Hungarian].

Grégoire, A. (1911). Influences des consonnes occlusives sur la durée des syllabes précédentes. Revue de phonétique, 1, 260–292.

Haggard, M. (1978). The devoicing of voiced fricatives. Journal of Phonetics, 6(2), 95–102.

Hallé, P., & Adda-Decker, M. (2007). Voicing assimilation in journalistic speech. The International Congress of Phonetic Sciences, 493–496.

Hallé, P., & Adda-Decker, M. (2011). Voice assimilation in French obstruents: A gradient or a categorical process? In J. A. Goldsmith, E. V. Hume, & L. Wetzels (Eds.), Tones and features: Phonetic and phonological perspectives (pp. 149–175). De Gruyter.

Hambye, P. (2005). La prononciation du français contemporain en Belgique: Variation, normes et identités [Doctoral dissertation, Université Catholique de Louvain]. DIAL.pr. https://dial.uclouvain.be/pr/boreal/object/boreal:4883

Hoijer, H. (1933). Tonkawa: An Indian language of Texas. In F. Boas (Ed.), Handbook of American Indian Languages (pp. 1–148). J.J. Augustin.

Hutin, M., Jatteau, A., Vasilescu, I., Lamel, L., & Adda-Decker, M. (2021). A corpus-based study of the distribution of word-final schwa in Standard French and what it teaches us about its phonological status. Isogloss. Open Journal of Romance Linguistics, 7, 1–27.

Hutin, M., Niculescu, O., Vasilescu, I., Lamel, L., & Adda-Decker, M. (2020). Lenition and fortition of stop codas in Romanian. Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL), 226–234.

Jacques, B. (1990). Étude de trois indices acoustiques du voisement des consonnes fricatives en français de Montréal. Revue québécoise de linguistique, 19(2), 59–71.

Jansen, W. (2004). Laryngeal contrast and phonetic voicing: A laboratory phonology approach to English, Hungarian, and Dutch [Doctoral dissertation, Rijksuniversiteit Groningen]. s.n. https://research.rug.nl/en/publications/laryngeal-contrast-and-phonetic-voicing-a-laboratory-phonology-ap

Jatteau, A., Vasilescu, I., Lamel, L., & Adda-Decker, M. (2019). Final devoicing of fricatives in French: Studying variation in large-scale corpora with automatic alignment. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia (pp. 295–299). Australasian Speech Science and Technology Association Inc.

Jatteau, A., Vasilescu, I., Lamel, L., Adda-Decker, M., & Audibert, N. (2019). “Gra[f]e!” Word-final devoicing of obstruents in Standard French: An acoustic study based on large corpora. Proceedings of Interspeech, 1726–1730.

Jessen, M. (1998). Phonetics and phonology of tense and lax obstruents in German. John Benjamins.

Kay, M. (2023). tidybayes: Tidy data and geoms for Bayesian models [R package version 3.0.6]. http://mjskay.github.io/tidybayes/

Kay, M. (2024). ggdist: Visualizations of distributions and uncertainty in the grammar of graphics. IEEE Transactions on Visualization and Computer Graphics, 30(1), 414–424. doi: http://doi.org/10.1109/TVCG.2023.3327195

Keating, P. (1984). Phonetic and phonological representation of stop consonant voicing. Language, 60(2), 286–319.

Keating, P., Linker, W., & Huffman, M. (1983). Patterns in allophone distribution for voiced and voiceless stops. Journal of Phonetics, 11(3), 277–290.

Kingston, J., & Diehl, R. L. (1994). Phonetic knowledge. Language, 70(3), 419–454.

Kirby, J. (2013). The role of probabilistic enhancement in phonologization. In A. Yu (Ed.), Origins of sound change: Approaches to phonologization (pp. 228–246). Oxford University Press.

Kirby, J., & Ladd, D. R. (2016). Effects of obstruent voicing on vowel F0: Evidence from “true voicing” languages. Journal of the Acoustical Society of America, 140(4), 2400–2411.

Klatt, D. H. (1976). Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. The Journal of the Acoustical Society of America, 59(5), 1208–1221.

Kohler, K. J. (1979). Dimensions in the perception of fortis and lenis plosives. Phonetica, 36(4–5), 332–343.

Kohler, K. J., van Dommelen, W., & Timmermann, G. (1981). Die Merkmalpaare stimmhaft/stimmlos und fortis/lenis in der Konsonantenproduktion und -perzeption des heutigen Standard-Französisch. Arbeitsberichte des Instituts für Phonetik der Universität Kiel (AIPUK), 14.

Kubinec, R. (2023). Ordered beta regression: A parsimonious, well-fitting model for continuous data with lower and upper bounds. Political Analysis, 31(4), 519–536.

Laeufer, C. (1992). Patterns of voicing-conditioned vowel duration in French and English. Journal of Phonetics, 20(4), 411–440.

Laeufer, C. (1996). The acquisition of a complex phonological contrast: Voice timing patterns of English final stops by native French speakers. Phonetica, 53, 117–142.

Lenth, R. V. (2024). emmeans: Estimated marginal means, aka least-squares means [R package version 1.10.1]. https://CRAN.R-project.org/package=emmeans

Lisker, L. (1986). “Voicing” in English: A catalogue of acoustic features signaling /b/ versus /p/ in trochees. Language and Speech, 29(1), 3–11.

Löfqvist, A. (1975). A study of subglottal pressure during the production of Swedish stops. Journal of Phonetics, 3, 175–189.

Luce, P. A., & Charles-Luce, J. (1985). Contextual effects on vowel duration, closure duration, and the consonant/vowel ratio in speech production. The Journal of the Acoustical Society of America, 78(6), 1949–1957.

Mack, M. (1982). Voicing-dependent vowel duration in English and French: Monolingual and bilingual production. Journal of the Acoustical Society of America, 71, 171–178.

Maddieson, I. (2013). Voicing and gaps in plosive systems. In M. S. Dryer & M. Haspelmath (Eds.), The world atlas of language structures online. Max Planck Institute for Evolutionary Anthropology.

Montreuil, J.-P. (2010). Multiple opacity in Eastern Regional French. In S. Colina, A. Olarrea, & A. M. Carvalho (Eds.), Romance Linguistics 2009: Selected Papers from the 39th Linguistic Symposium on Romance Languages (LSRL), Tucson, Arizona (pp. 153–166). John Benjamins.

Morley, R. L., & Smith, B. J. (2023). A Reanalysis of the Voicing Effect in English: With Implications for Featural Specification. Language and Speech, 66(4), 935–973.

Myers, S. (2012). Final devoicing: Production and perception studies. In T. Borowsky, S. Kawahara, T. Shinya, & M. Sugahara (Eds.), Prosody matters: Essays in honor of Elisabeth Selkirk (pp. 148–180). Equinox.

Nicenboim, B., & Vasishth, S. (2016). Statistical methods for linguistic research: Foundational ideas – Part II. Language and Linguistics Compass, 10(11), 591–613.

Ohala, J. J. (1983). The origin of sound patterns in vocal tract constraints. In P. MacNeilage (Ed.), The production of speech (pp. 189–216). Springer-Verlag.

Ohala, J. J. (1997). Aerodynamics of phonology. Proceedings of the Seoul International Conference on Linguistics, 92–97.

Ohala, J. J. (2011). Accommodation to the aerodynamic voicing constraint and its phonological relevance. Proceedings of the 17th International Congress of Phonetic Sciences, 64–67.

Pape, D., & Jesus, L. M. (2015). Stop and fricative devoicing in European Portuguese, Italian and German. Language and Speech, 58(2), 224–246.

Patience, M., & Steele, J. (2022). Relative difficulty in the acquisition of the phonetic parameters of obstruent coda voicing: Evidence from Mandarin-speaking learners of French. Language and Speech, 1–27.

Pinget, A.-F., Kager, R., & Van de Velde, H. (2019). Linking variation in perception and production in sound change: Evidence from Dutch obstruent devoicing. Language and Speech.

Pitt, M. A., Johnson, K., Hume, E., Kiesling, S., & Raymond, W. (2005). The Buckeye corpus of conversational speech: Labeling conventions and a test of transcriber reliability. Speech Communication, 45(1), 89–95.

Pooley, T. (1994). Word-final consonant devoicing in a variety of working-class French – a case of language contact? Journal of French Language Studies, 4(2), 215–233.

Popescu, A., Hutin, M., Vasilescu, I., Lamel, L., & Adda-Decker, M. (2023). Stop devoicing and place of articulation: A cross-linguistic study using large-scale corpora. Proceedings of the 20th International Congress of Phonetic Sciences.

Puggaard-Rode, R., Søballe Horslund, C., & Jørgensen, H. (2022). The rarity of intervocalic voicing of stops in Danish spontaneous speech. Laboratory Phonology, 13(1), 1–47.

Purnell, T., Salmons, J., Tepeli, D., & Mercer, J. (2005). Structured heterogeneity and change in laryngeal phonetics: Upper Midwestern final obstruents. Journal of English Linguistics, 33, 307–338.

Ringen, C., & Kulikov, V. (2012). Voicing in Russian Stops: Cross-Linguistic Implications. Journal of Slavic Linguistics, 20(2), 269–286.

Riverin-Coutlée, J. (2020). Sur le voisement des consonnes fricatives finales en français du Québec. Actes de la 6e conférence conjointe Journées d’Études sur la Parole (JEP, 31e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1: Journées d’Études sur la Parole, 552–560.

Snoeren, N., Hallé, P., & Segui, J. (2006). A voice for the voiceless: Production and perception of assimilated stops in French. Journal of Phonetics, 34, 241–268.

Sokolović-Perović, M. (2012). The voicing contrast in Serbian stops [Doctoral dissertation, University of Newcastle]. Newcastle University. http://theses.ncl.ac.uk/jspui/handle/10443/1701

Sonderegger, M. (2023). Regression modeling for linguistic data. MIT Press.

Sonderegger, M., Stuart-Smith, J., Knowles, T., MacDonald, R., & Rathke, T. (2020). Structured heterogeneity in Scottish stops over the twentieth century. Language, 96(1), 94–125.

Steriade, D. (1997). Phonetics in phonology: The case of laryngeal neutralization [Unpublished manuscript].

Steriade, D. (1999). Alternatives to syllable-based accounts of consonantal phonotactics [Unpublished manuscript].

Tanner, J., Sonderegger, M., & Stuart-Smith, J. (2020). Structured speaker variability in Japanese stops: Relationships within versus across cues to stop voicing. Journal of the Acoustical Society of America, 148(2), 793–804.

Tanner, J., Sonderegger, M., Stuart-Smith, J., & Fruehwald, J. (2020). Towards “English” phonetics: Variability in the pre-consonantal voicing effect across English dialects and speakers. Frontiers in Artificial Intelligence, 3, 1–15.

Temple, R. (1999). Sociophonetic conditioning of voicing patterns in the stop consonants of French. Proceedings of the 14th International Congress of Phonetic Sciences, 1409–1412.

Temple, R. (2000). Old wine into new wineskins. A variationist investigation into patterns of devoicing in plosives in the Atlas linguistique de la France. Transactions of the Philological Society, 98, 353–394.

Torreira, F., Adda-Decker, M., & Ernestus, M. (2010). The Nijmegen corpus of casual French. Speech Communication, 52(3), 201–212.

Van de Velde, H., & van Hout, R. (1996). The devoicing of fricatives in Standard Dutch: A real-time study based on radio recordings. Language Variation and Change, 8, 149–175.

van Dommelen, W. (1981). Kontextbedingte Vokaldehnung im Französischen. Arbeitsberichte des Instituts für Phonetik der Universität Kiel (AIPUK), 16, 95–107.

van Dommelen, W. (1983). Parameter Interaction in the Perception of French Plosives. Phonetica, 40(1), 32–62.

Vasilescu, I., Wu, Y., Jatteau, A., Adda-Decker, M., & Lamel, L. (2020). Alternances de voisement et processus de lénition et de fortition: Une étude automatisée de grands corpus en cinq langues romanes. Revue TAL: Traitement automatique des langues.

Vasishth, S., Nicenboim, B., Beckman, M. E., Li, F., & Kong, E. J. (2018). Bayesian data analysis in the phonetic sciences: A tutorial introduction. Journal of Phonetics, 71, 147–161.

Warner, N., Jongman, A., Sereno, J., & Kemps, R. (2004). Incomplete neutralization and other sub-phonemic durational differences in production and perception: Evidence from Dutch. Journal of Phonetics, 32(2), 251–276.

Westbury, J. R. (1983). Enlargement of the supraglottal cavity and its relation to stop consonant voicing. The Journal of the Acoustical Society of America, 73(4), 1322–1336.

Westbury, J. R., & Keating, P. (1986). On the naturalness of stop consonant voicing. Journal of Linguistics, 22, 145–166.