1. Introduction

1.1. Background

It has been claimed that phonetic reduction affects a segment to different degrees depending on its morphological status. For instance, English -/s/ tends to be longer at the end of monomorphemic words than when it forms a suffix by itself (Plag et al., 2017). Such morpho-phonetic effects have long been a subject of interest, not least because phonetic effects that can be traced to morphological structure run counter to traditional conceptualizations of the production of complex words. In such models (e.g., Levelt et al., 1999; Roelofs, 2000), effects such as segmental shortening or optional deletion are implemented at the level of phonetic encoding, which takes into account things like phonological context but which does not have access to morphological information. As such, morpho-phonetic effects have been used to argue in favour of models in which word representations contain a certain degree of phonetic detail (e.g., Plag et al., 2017; Seyfarth et al., 2018) or models in which phonetic effects arise from relations between representations of morphologically related words (e.g., Tomaschek et al., 2021; Schmitz et al., 2021b).

Although morpho-phonetic effects have been studied for a while, the evidence for these effects remains limited. Many of the older studies suffer from methodological issues (see Hanique & Ernestus, 2012; Plag et al., 2017 for an overview), and more recent papers have mostly focussed on a single morphological exponent, English word final /s/ (e.g., Plag et al., 2017; Seyfarth et al., 2018; Tomaschek et al., 2021; Schmitz et al., 2021a). As such, the present research attempts to find additional evidence for morpho-phonetic effects by investigating Dutch final -/ən/.

Furthermore, recent investigations of morpho-phonetic effects have generally not considered whether these effects may originate from the phonological contexts in which the words occur most often. Words differing in their morphological structure may differ in their typical phonological contexts. In turn, these contexts may differ in the extent to which they trigger reduction. If we assume that speakers somehow associate the pronunciation of words or morphemes with the phonological contexts they typically occur in, then words that typically occur in reducing contexts may also be reduced when they occur in a nonreducing context. The present research investigates whether the differences in phonetic reduction that are observed between words with different morphological structures may originate from the effect of the phonological contexts these morphological structures typically occur in. Testing this hypothesis requires a phenomenon with clear differences in phonetic realization across phonological contexts. As explained in the following sections, Dutch final -/ən/ fits this description.

1.2. Indications for morpho-phonetic effects in the realization of Dutch -en

This section introduces Dutch final -/ən/ and reviews the previous literature on it to show why a new investigation could expand the evidence for morpho-phonetic effects. Previous summaries of morpho-phonetic effects (e.g., Hanique & Ernestus, 2012; Plag et al., 2017) have found that many older studies of morphological effects on phonetic realisation suffer from methodological issues. Surprisingly, these summaries do not mention older work on Dutch final -/ən/ that contains indications of morpho-phonetic effects. As such, the current section looks at the strengths and shortcomings of these previous studies on Dutch -/ən/.

Canonically, the Dutch word ending -en, as in lopen ‘to walk’, is pronounced as /ən/. However, depending on a host of variables such as speech register, accent, speech rate, etc., it is also completely acceptable or even commonplace to pronounce -en as /ə/, without the final /n/ (Van de Velde, 1996). Some of the most relevant findings on Dutch final -/ən/ from a morpho-phonetic perspective are described by Van de Velde and van Hout. Using both radio broadcasts (Van de Velde, 1996; Van de Velde & van Hout, 1998) and data collected in production experiments (Van de Velde & van Hout, 2000, 2001, 2003), they demonstrate that deletion rate of the final /n/ is affected by linguistic factors such as the sound following the -en word as well as the morphological category of the -en word. Specifically, they find that /n/ is less likely to be absent when it is part of the stem (e.g., as in de molen ‘the mill’ or ik teken ‘I draw’) than when it is part of an -en suffix (e.g., as in we lopen ‘we walk’). However, Van de Velde and van Hout (1998) warn against taking their results at face value due to the large differences between speakers. Although they establish a speaker typology (Van de Velde and van Hout, 2001) to help make sense of these interspeaker differences with respect to the morphological effect, their dataset was not large enough and the statistical methods available to them were not advanced enough to make any conclusive statements about the interaction between speaker type and morphological status.

Another interesting contribution to the morpho-phonetic investigation of Dutch final -en is made by Goeman (2001). He used a dataset of elicited speech in different Dutch dialects to map the regional variation in the pronunciation of Dutch final -en. Although he generally found the same pattern as Van de Velde and van Hout, with final /n/ more likely to be absent if it is part of a suffix, he found the opposite for the final /n/ in the -en suffix that marks the plural of nouns. Goeman also claims that the effect of morphological status on /n/ deletion rate depends on regional dialect. However, these claims are based on statistical analyses that did not adequately model such interactions.

A limitation that is shared by previous investigations of morpho-phonetic effects on Dutch final -/ən/ is that they only consider morphological effects on a categorical form of phonetic reduction, i.e., /n/ deletion. This is especially salient in light of recent evidence for morpho-phonetic effects on gradient phonetic properties such as segmental duration (e.g., Plag et al., 2017). Moreover, Goeman (2001) reports realisations of -/ən/ with reduced /n/ and nasalized /ə/, suggesting that reduction of Dutch -/ən/ is not limited to simple /n/ deletion.

In sum, there are several studies which indicate that morphological status affects the presence versus absence of final /n/ in Dutch -en words. In addition, there are indications that these effects are mediated by interspeaker differences. However, the exact nature of the morpho-phonetic effect, including the differences between different speaker groups, has not yet been reliably established. Furthermore, these studies limited their investigations to /n/ deletion, ignoring potential morpho-phonetic effects of a gradient nature. A more comprehensive characterisation of the morpho-phonetic effects on Dutch final -en requires a new investigation using a sufficiently large dataset and statistical methods that allow for appropriate modelling of speaker-specific morphological effects.

1.3. The role of words’ typical context

In light of our interest in typical phonological context as a potential mechanism behind morpho-phonetic effects, the current section reviews the literature on the role of the typical context in phonetic effects. One of the earliest claims of phonetic effects being due to the typical context of words is made in Bybee (2002). Among other phenomena, Bybee compares the /t/-deletion rate at the end of contracted negations such as don’t and other words ending in /nt/, such as different. She finds that the final /t/ in contracted negations is deleted much more often than final /t/ in other -/nt/ words. Bybee attributes this to the fact that contracted negations are more likely to be followed by a consonant than other -/nt/ words. She argues that /t/ is more likely to be deleted when the following word starts with a consonant, that words that tend to be followed by consonants are therefore reduced more often, and that this reduction is stored in the mental lexicon. As a result, she claims, these words are also more likely to be reduced when produced in nonreducing context, e.g., before a vowel. Bybee’s hypotheses about /t/-deletion specifically and the effect of words’ typical phonological contexts more generally were vindicated in more recent studies that used appropriate statistical analyses (e.g., Raymond et al., 2016; Sóskuthy & Hay, 2017; E. Brown, 2018; E. K. Brown, 2018).

More recent studies have focused on phonetic reduction as a result of words’ typical contextual predictability (e.g., Seyfarth, 2014; Sóskuthy & Hay, 2017; Tang & Shaw, 2021). It is well established that words which are predictable given their context tend to be more phonetically reduced than words that are unpredictable based on their context (see the overview in Bell et al., 2009). Seyfarth (2014) argues that reduction due to predictability is stored in the mental lexicon, such that words which often occur in predictable contexts have more phonetically reduced representations than words that often occur in a wide variety of contexts. As a result, Seyfarth claims, tokens of words that are generally predictable are more reduced regardless of the predictability of these specific tokens in their own contexts. To substantiate these claims, Seyfarth (2014) investigated whether durations of English words correlate with a measure, informativity, that captures the average probability of a word given the following word, while also controlling for the probability of the word tokens in their own contexts. The results showed that words with low informativity, i.e., words that are usually predictable, are shorter than words with high informativity. As such, these results provide evidence for the hypothesis that words’ typical contextual predictability can affect their reduction.

To the best of our knowledge, previous studies on words’ typical contexts have not explored whether these factors might explain morpho-phonetic effects. E. Brown (2013) has explored the possibility that words’ typical phonological contexts may explain phonetic differences between word classes, which, like morphological functions, are abstract linguistic categories. This research focused on the reduction of word-initial /s/ in Traditional New Mexican and Chihuahua, Mexico Spanish. In these varieties of Spanish, word-initial /s/ may be deleted or reduced to [h]. The reduction rate of word-initial /s/ varies between different word classes, with verbs having much higher reduction rates than nouns, for instance. E. Brown proposes that these differences are related to the frequency with which words in the different classes occur in reducing contexts. She investigated whether words that occur relatively frequently after a nonhigh vowel show higher reduction rates, by calculating the FRC measure (Frequency in a Reducing Context; see Raymond et al., 2016), which divides the number of tokens of a word following /a, e, o/ by the total number of tokens of that word. Using a multiple regression model, she showed that the initial /s/ in words with a high FRC is indeed reduced more often, even when a given token’s directly preceding context is taken into account. Moreover, she claims that word class does not significantly predict /s/ reduction in a model which also includes FRC. However, the analysis presented in E. Brown (2013) does not rule out such an effect.¹ Further, Van de Velde & van Hout (2003) speculate that the morphological effect on Dutch word-final -/ən/ reduction originally stems from a phonological effect, but they did not investigate this claim. As a result, it remains unclear whether words’ typical contexts can fully explain previously reported associations between phonetic reduction and abstract linguistic categories, such as word class or morphological status.

1.4. The present research

The first objective of the present research is to determine whether a detailed and statistically robust description of the phonetic realisation of Dutch final -/ən/ provides additional evidence for morpho-phonetic effects. Two studies are conducted to achieve this goal. The first study investigates whether the morphological status of the -en ending (i.e., -en as part of the stem versus -en as a suffix, as well as the different functions of the suffix) influences the probability that the final /n/ is realised or deleted. The second study investigates whether the morphological status of -en influences gradient phonetic properties of the -en suffix, specifically the respective segmental durations of /n/ and /ə/, and the proportion of /ə/ that is nasalized.

The second objective of this research is to investigate to what extent words’ typical contexts can explain differences in /ən/-realisation between morphological categories. To do so, we first need to identify the contextual phonological factors that are likely to affect /ən/ realisation in Dutch -en words. One of the primary candidates is the following phonological context. Previous studies have reported large differences in /n/ deletion between -en words preceding Pauses, Vowels, and Consonants. Interestingly, the direction of these differences seems to be speaker- and region-dependent (Van de Velde & van Hout, 1998; Goeman, 2001). Goeman (2001) shows that regional differences in /n/ deletion rate are most pronounced before a pause. As noted by Van de Velde & van Hout (2003), in the pre-pausal context the -en word is most likely to be in (prosodic) focus, and as such its pronunciation is most susceptible to speaker control. As a result, before a pause, the pronunciations most resemble speakers’ canonical pronunciations, which would lead to the larger differences Goeman (2001) reported between /n/-realising Northern speakers versus /n/-deleting Southern speakers. A speaker’s pronunciation before pauses may be stored in the lexical representations of words that frequently occur before a pause. To quantify the effect of the prepausal context on /ən/ realisation in Dutch -en words, also outside of this context, we could calculate how often each -en word occurs before a pause compared to other contexts, resulting in a word-specific Before Pause Proportion (henceforth BPP) measure—analogous to E. Brown’s (2018) FRC measure.
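The BPP measure described above can be illustrated with a minimal sketch. It assumes each token has already been labelled with its following context; the function name, the context labels, and the toy tokens are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter


def before_pause_proportion(tokens):
    """Return {word: share of its tokens that are followed by a pause}.

    `tokens` is an iterable of (word, following_context) pairs, where
    following_context is "pause", "vowel" or "consonant" (labels chosen
    for this sketch only).
    """
    totals = Counter()
    before_pause = Counter()
    for word, context in tokens:
        totals[word] += 1
        if context == "pause":
            before_pause[word] += 1
    # Proportion of pre-pausal tokens per word type.
    return {word: before_pause[word] / totals[word] for word in totals}


# Toy tokens, invented for illustration only.
toy_tokens = [
    ("lopen", "pause"), ("lopen", "pause"), ("lopen", "consonant"),
    ("lopen", "pause"), ("tegen", "consonant"), ("tegen", "vowel"),
    ("tegen", "consonant"), ("tegen", "pause"),
]
bpp = before_pause_proportion(toy_tokens)
print(bpp)
```

In this toy example the word that mostly occurs before pauses receives the higher BPP, which is the property the predictions in this section rely on.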

If we assume that words’ typical phonological contexts affect their /ən/ realisation, we can make a number of predictions for the deletion of /n/. First, we should expect that the effect of BPP on /n/ deletion rate depends on the type of speaker, see (1).

    (1) a. For speakers that tend to produce the /n/ before a pause, a higher BPP should be associated with lower /n/ deletion rates.
        b. For speakers that tend to delete the /n/ before a pause, a higher BPP should be associated with higher /n/ deletion rates.

Second, if typical phonological context drives the morphological effects on /n/ deletion, the deletion rates of morphological categories that frequently occur before a pause should depend on the type of speaker. Third, after BPP is taken into account, any remaining differences in /n/ deletion rate between morphological categories would indicate that not all morpho-phonetic effects can be explained by words’ typical phonological context.

To generate predictions regarding the effects of words’ typical phonological context on the duration of /n/, we assume that durational reduction is an incomplete form of deletion. Given this assumption, we can make the predictions in (2):

    (2) a. For speakers that tend to produce the /n/ before a pause, a higher BPP should be associated with longer /n/ durations.
        b. For speakers that tend to delete the /n/ before a pause, a higher BPP should be associated with shorter /n/ durations.

It is difficult to make strong predictions about the effects of words’ typical phonological context on the duration of /ə/, as two reasonable but opposing assumptions can be made. Firstly, it could be that the mechanism behind reduction of /n/ applies to the -/ən/ suffix as a whole. In this case, one would expect shortened /n/ to be accompanied by a likewise shortened /ə/. Secondly, there is some evidence that deletion of coda /n/ in certain varieties of Dutch results in compensatory lengthening of the preceding vowel (van Oostendorp, 2001). It is a small leap, then, to suggest that a shortened /n/ should be accompanied by a lengthened /ə/.

Two opposing assumptions may also inform any predictions regarding the nasalization of /ə/ as a result of words’ typical phonological context. If it is assumed that reduction applies to the -/ən/ suffix as a whole, we might expect that the pressure to reduce the duration of the segments that make up the suffix results in greater coarticulation between them, such that a larger portion of the /ə/ is nasalized. Alternatively, there is evidence from English that vowel nasalization may decrease when the following coda /n/ is shortened (Cho et al., 2017).

In sum, predictions regarding /ə/ and nasalization duration can be grouped into those following a suffix reduction hypothesis and those following a /n/-reduction hypothesis, see (3).

    (3) a. Suffix reduction hypothesis: Shorter /n/ realisations are accompanied by shorter and more nasalized /ə/ realisations.
        b. /n/-reduction hypothesis: Shorter /n/ realisations are accompanied by longer and less nasalized /ə/ realisations.

As we do not have a clear preference for one of these hypotheses, our investigation of /ə/ and nasalization duration will be exploratory.

2. Study I: Morphological effects on the deletion of /n/ in Dutch final -en

2.1. Materials and Methods

2.1.1. Speech Data

The speech data investigated in our study were extracted from the Dutch part of the read-aloud stories component of the Spoken Dutch Corpus (SDC; Oostdijk, 2002). This subcorpus was selected based on several interdependent criteria. To enable reliable estimates of morpho-phonetic effects, which we expected to vary between different speaker groups, we needed a large amount of data from a relatively diverse set of speakers. The resulting size of the dataset, in turn, required the use of automatic phonetic annotation. Automatic annotation of a subtle phonetic variable such as a word-final nasal stop works best with high quality speech recordings, which ruled out the more conversational components of the SDC.

Initially, we annotated the presence versus absence of final /n/ with a Kaldi-based (Povey et al., 2011) forced alignment tool (CLST ASR FA tool; Ganzeboom, 2018). However, manual inspection of the resulting annotations revealed that the presence of final /n/ was not reliably determined. To improve the reliability of our /n/ deletion data and in order to also obtain reliable estimations of durations of /ə/, /n/ and nasality, we built our own classification system based on the automatically generated transcription. In short, we automatically selected reliable parts of the generated transcription to train three BiLSTM (bidirectional long short-term memory) models to detect the presence of /ə/, /n/ and nasality, respectively. We evaluated the results on a small but representative subset of the data (71 validation items and 71 test items), which was annotated by two different phonetically trained annotators. With respect to the presence/absence of final /n/, the system approached human level performance on the test items, with 94.3% agreement between annotator TZ and the system, compared to 95.8% agreement between annotators TZ and MW.

The dataset was constructed by extracting all words ending in -en, excluding several types of words. That is, we excluded archaic inflections (e.g., te mijnen huize ‘at my house’), contracted forms (e.g., z’n for zijn ‘his’), the indefinite article een, due to its extremely high frequency, and past participle forms, like gegeven ‘given’, because, in those words, -en is part of a circumfix, which means the -en is only one part of the morphological exponent. After also excluding tokens for which our classification system could not reliably determine the segmental boundaries, we ended up with 43,258 word tokens across all morphological categories, as shown in Table 1.

Table 1

Number of types and tokens in the deletion dataset by morphological category.

Morphological category  Example                                            Label  Number of Types  Number of Tokens
Nonsuffix               reken ‘calculate’, tegen ‘against’, toren ‘tower’  NS     539              6142
Plural noun             armen ‘arms’                                       NOUN   3270             14255
Plural verb             lopen ‘walk’                                       PL     1381             8866
Infinitive verb         lopen ‘to walk’                                    INF    1953             13995

Because we wanted to investigate interspeaker differences and previous research has linked interspeaker variation to regional differences, it is relevant that our dataset represented speakers with varied regional backgrounds. Table 2 shows the distribution of speakers across different regions of birth. We used the four main regions defined in the SDC: the core area containing the major cities of Amsterdam, Rotterdam, The Hague, and Utrecht, the transitional area which includes the centre of the country and the province Zeeland, the Northeastern peripheral area, and the Southern peripheral area. This regional classification roughly lines up with earlier studies of interspeaker differences in -en pronunciation (e.g., Van de Velde & van Hout, 2001). Speakers that were born abroad and speakers whose birth region was unknown were assigned their own category.

Table 2

Number of speakers and tokens in the deletion dataset by birth region.

Region of birth    Number of Speakers  Number of Tokens
Core area          171                 22422
Transitional area  52                  7075
Northeastern area  44                  5932
Southern area      42                  5747
Abroad             14                  1960
Unknown            1                   122

2.1.2. Predictors

We modelled the presence versus absence of /n/ with the following predictors of interest: Morphological Category, Phonological Context, BPP, and Speaker Group. The morphological categories of the words were determined based on the Part-Of-Speech tags generated by the Frog software (Hendrickx et al., 2016). We distinguished four categories: Plural verbs, Infinitive verbs, Plural nouns, and Nonsuffix, which included words for which the final -en is part of the stem.

We determined the Phonological Context based on the forced aligner’s phonetic transcription of the following sound (including pauses). We followed previous research (e.g., Van de Velde & van Hout, 1998) by distinguishing three categories: Consonants, Pauses, and Vowels. Table 3 illustrates how often these phonological contexts occur in each morphological category and shows that -en words from different morphological categories typically occur in different phonological contexts.

Table 3

Percentages of the phonological contexts following words in the respective morphological categories.

Morphological category  Consonants  Pauses  Vowels
Nonsuffix               52.33%      30.04%  18.63%
Plural nouns            41.76%      42.73%  15.51%
Plural verbs            57.94%      23.73%  18.33%
Infinitive verbs        18.44%      73.86%  7.70%

The Speaker Group variable was included to enable statistical analysis of how reduction behaviour generalizes across speakers. Previous research (Goeman, 2001) has found substantial interspeaker variation in /n/ deletion rate, especially before pauses. As such, the following groups were distinguished: inserters, who tend to produce /n/ before a pause, deleters, who tend to not produce /n/ before a pause, and neutrals, whose pre-pausal /n/ deletion rate lies between that of the inserters and deleters. While Goeman (2001) reported clear regional differences in /n/ deletion rates, Van de Velde and van Hout (2003) noted that the /n/ deletion patterns of individual speakers cannot fully be explained by regional patterns. Furthermore, given the hypothesis that BPP is behind morphological effects on /n/ deletion, it makes more sense to define our speaker groups on the basis of speakers’ behaviour in the pre-pausal context rather than on the basis of their birth region. The reasoning behind having three speaker groups is as follows. The data in Goeman (2001) suggest a dichotomy between those who insert and those who delete before a pause. We recognize that a dichotomy is probably an oversimplification, given a previous speaker typology (Van de Velde and van Hout, 2001) suggested up to six speaker groups. Not wanting to overcomplicate the analysis, we went for a compromise of three groups, where the neutrals group would hold those speakers that could not easily be categorized as inserters or deleters. The methodology by which the speakers were assigned to their speaker groups is detailed in Section 2.1.3.

Additionally, several potentially confounding variables were included in our statistical analysis. More predictable words have been found to be on average more phonetically reduced (e.g., Bell et al., 2009). We included Word Frequency as a measure of a word’s baseline predictability. Moreover, the context also influences the predictability of a word. As such, we included Next Probability and Previous Probability, which represented the probability of the -en word given the next word and the previous word, respectively. Additionally, we included the Next Informativity and Previous Informativity variables, which represent the average contextual probabilities of the -en word (see Seyfarth, 2014 for the mathematical formula). All of these measures were calculated based on the Corpus of Contemporary Dutch (Dutch Language Institute, 2021).
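As a rough illustration of the Next Probability and Next Informativity predictors, the sketch below estimates both from bigram counts. The text defers to Seyfarth (2014) for the formula, so the definition used here (informativity as the mean of the negative log2 conditional probability over a word's tokens) is an assumption based on that reference. The function name and toy bigrams are hypothetical, and no smoothing is applied.

```python
import math
from collections import Counter, defaultdict


def next_probability_and_informativity(bigrams):
    """Estimate P(word | next word) and per-word informativity.

    bigrams: list of (word, next_word) token pairs.
    Returns (prob, info), where
      prob[(w, n)] = relative-frequency estimate of P(w | next word n)
      info[w]      = mean of -log2 P(w | n) over all tokens of w
                     (assumed definition, see lead-in)
    """
    pair_counts = Counter(bigrams)
    next_counts = Counter(n for _, n in bigrams)
    word_counts = Counter(w for w, _ in bigrams)

    prob = {(w, n): c / next_counts[n] for (w, n), c in pair_counts.items()}

    surprisal_sum = defaultdict(float)
    for (w, n), c in pair_counts.items():
        # Each of the c tokens of this bigram contributes the same surprisal.
        surprisal_sum[w] += c * -math.log2(prob[(w, n)])
    info = {w: surprisal_sum[w] / word_counts[w] for w in word_counts}
    return prob, info


# Toy bigrams, invented for illustration only.
toy_bigrams = [
    ("lopen", "naar"), ("lopen", "naar"), ("gaan", "naar"),
    ("gaan", "naar"), ("lopen", "we"),
]
prob, info = next_probability_and_informativity(toy_bigrams)
print(prob, info)
```

The Previous Probability and Previous Informativity variables would follow the same pattern with the preceding word as the conditioning context.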

We also took into account the relative speed with which the speech containing the -en words was produced as this may influence the degree of segmental reduction that is encountered. Speech Rate was calculated by dividing the number of syllables in the uninterrupted chunk of speech that contained the -en word by the duration of that chunk. In this computation, the number of vowels in the automatically obtained phonetic transcription was used as a proxy for the number of actually produced syllables.
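The Speech Rate computation can be sketched as follows, under the assumption that a chunk's transcription is available as a list of phone symbols. The vowel inventory below is a hypothetical placeholder, not the symbol set of the forced aligner used in the study.

```python
# Hypothetical one-symbol vowel inventory; the aligner's real set may differ.
VOWELS = {"a", "e", "i", "o", "u", "y", "@", "A", "E", "I", "O"}


def speech_rate(phones, chunk_duration_s):
    """Syllables per second, using the vowel count as a proxy for syllables."""
    if chunk_duration_s <= 0:
        raise ValueError("chunk duration must be positive")
    n_vowels = sum(1 for phone in phones if phone in VOWELS)
    return n_vowels / chunk_duration_s


# Toy chunk: 'de molen' transcribed with three vowels, lasting 0.5 s.
rate = speech_rate(["d", "@", "m", "o", "l", "@"], 0.5)
print(rate)
```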

Lastly, we included two random variables: Word and Speaker. Word was included to account for any word-specific characteristics which would influence /n/ deletion rate, but which would not be related to morphological category or any of the other word-specific predictors. Speaker was included to determine the speaker groups (see Section 2.1.3) and to account for any speaker-specific behaviour that is not captured by the groups.

2.1.3. Modelling

Our modelling procedure consisted of two stages. The first stage was aimed at establishing the speaker groups, whereas the second stage was meant to investigate the respective effects of morphological category and BPP on the degree of /n/ deletion in the different speaker groups.

To ensure that our classification of speakers was based on /n/ deletion rates as a function of phonological context, rather than as a function of variables that are to some extent confounded with phonological context, we did not want to use speakers’ raw /n/ deletion rates to assign them to a speaker group. Instead, we used Bayesian logistic regression (as implemented in McElreath, 2020) to model the presence of /n/ based on all of the predictors described in Section 2.1.2, in addition to by-Speaker slopes for Phonological Context, Morphological Category, and BPP. Subsequently, we used the posterior distribution of this Speaker model to compute each speaker’s predicted /n/ realization rates before consonants, pauses, and vowels. We then computed the difference in /n/ realization rate before pauses and before the other contexts for each speaker: $\delta = (r_{\text{consonant}} + r_{\text{vowel}})/2 - r_{\text{pause}}$, where $r$ is the speaker’s predicted /n/ realization rate in the given context. Finally, the K-means clustering algorithm (Hartigan & Wong, 1979) was used to divide the speakers’ δ values into three groups. Members of the group with the highest average δ value were labelled deleters, members of the group with the lowest average δ value were labelled inserters, and the remaining speakers were labelled neutrals.
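The grouping step can be sketched as below. This is only an illustration under stated assumptions: the per-speaker rates are invented rather than posterior predictions, and a plain Lloyd-style one-dimensional k-means with a deterministic start stands in for the Hartigan and Wong algorithm cited in the text. All function names are hypothetical.

```python
def delta(r_consonant, r_vowel, r_pause):
    """Mean realization rate outside pauses minus the rate before pauses."""
    return (r_consonant + r_vowel) / 2 - r_pause


def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd-style k-means on scalars (a stand-in, not Hartigan-Wong)."""
    ordered = sorted(values)
    # Deterministic start: spread initial centres over the sorted data.
    centres = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]

    def assign(current):
        return [min(range(k), key=lambda j: abs(v - current[j])) for v in values]

    for _ in range(max_iter):
        labels = assign(centres)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centres[j])
        if updated == centres:  # converged
            break
        centres = updated
    return assign(centres), centres


def label_speakers(deltas):
    """Map each speaker to a group: lowest-delta cluster = inserter, highest = deleter."""
    labels, centres = kmeans_1d(list(deltas.values()), k=3)
    order = sorted(range(3), key=lambda j: centres[j])
    names = {order[0]: "inserter", order[1]: "neutral", order[2]: "deleter"}
    return {speaker: names[lab] for speaker, lab in zip(deltas, labels)}


# Invented realization rates (consonant, vowel, pause) for six toy speakers.
toy_rates = {
    "s1": (0.2, 0.3, 0.9), "s2": (0.3, 0.3, 0.8),
    "s3": (0.5, 0.5, 0.5), "s4": (0.6, 0.4, 0.45),
    "s5": (0.7, 0.7, 0.1), "s6": (0.8, 0.6, 0.2),
}
deltas = {spk: delta(*rates) for spk, rates in toy_rates.items()}
groups = label_speakers(deltas)
print(groups)
```

Speakers who realize /n/ far more often before pauses than elsewhere end up with strongly negative δ values and fall into the inserter cluster, mirroring the labelling rule described above.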

In the second stage of the modelling procedure, four more Bayesian logistic regression models were fitted, termed the Baseline model, the Morphology model, the BPP model, and the Full model. The Baseline model was fitted purely as a baseline to compare against the three other, more complicated models. It contained by-Speaker Group intercepts for Phonological Context, Word Frequency, Next Probability, Previous Probability, Next Informativity, Previous Informativity, and Speech Rate as fixed predictors. Furthermore, it contained random intercepts for Word and by-Speaker intercepts for Phonological Context. The Morphology model expanded the Baseline model with by-Speaker Group fixed intercepts for Morphological Category and by-Speaker random intercepts for Morphological Category. Likewise, the BPP model expanded the Baseline model with by-Speaker Group coefficients for BPP as a fixed predictor and by-Speaker random slopes for BPP. The final Full model combined the two previous models by including by-Speaker Group coefficients for both BPP and Morphological Category, as well as by-Speaker random slopes for both BPP and Morphological Category. The BPP model and the Morphology model were used to examine how BPP and Morphological Category affected /n/ deletion rates in the different speaker groups: by comparing these models to the Baseline model using WAIC (Vehtari et al., 2017), we could establish the degree to which Morphological Category and BPP explained /n/ deletion rates. In doing so, we worked towards our first research goal of extending the evidence for morpho-phonetic effects, and we investigated whether the BPP variable behaved as predicted in (1). Finally, by comparing the WAIC scores of the BPP model and the Full model, we could assess whether adding Morphological Category as a predictor improves model predictions when BPP is already accounted for. This part of the analysis addresses our second objective of investigating to what extent the typical phonological context of morphological categories could explain morpho-phonetic effects.
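For readers unfamiliar with WAIC, the sketch below shows how the score is commonly computed from a matrix of pointwise log-likelihoods (posterior draws by observations) and reported on the deviance scale, where lower is better. It is an illustration only: the function name and the toy likelihood values are hypothetical, and the study's own scores presumably come from library routines rather than code like this.

```python
import math


def waic(log_lik):
    """WAIC on the deviance scale from pointwise log-likelihoods.

    log_lik[s][i] is the log-likelihood of observation i under posterior
    draw s; at least two draws are needed for the variance term.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters (penalty term)
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        # Log of the posterior-mean likelihood, via log-sum-exp for stability.
        peak = max(column)
        lppd += peak + math.log(sum(math.exp(x - peak) for x in column) / n_draws)
        # Sample variance of the log-likelihood across draws.
        mean = sum(column) / n_draws
        p_waic += sum((x - mean) ** 2 for x in column) / (n_draws - 1)
    return -2 * (lppd - p_waic)


# Toy example: two draws, two observations; values are invented.
better = [[math.log(0.9), math.log(0.8)], [math.log(0.8), math.log(0.9)]]
worse = [[math.log(0.6), math.log(0.5)], [math.log(0.5), math.log(0.6)]]
print(waic(better), waic(worse))
```

A model that assigns higher likelihood to the observed outcomes receives the lower WAIC, which is the sense in which the model comparisons described above are read.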

2.2. Results

2.2.1. Speaker Groups

Using the posterior distribution of the Speaker model, we estimated the predicted by-speaker average /n/ realisation rates in the different phonological contexts and categorized the speakers as inserters, neutrals or deleters (see Table 4 for the speaker and token distribution over the three groups).

Table 4

Number of speakers and tokens by speaker group in the deletion dataset.

Speaker Group  Number of Speakers  Number of Tokens
inserters      45                  5613
neutrals       86                  11829
deleters       193                 25816

Figure 1 illustrates the predictions for each speaker in the different speaker groups and the distribution of δ: speakers’ deletion rates before pauses relative to their deletion rates before vowels and consonants. Note that while the speakers were categorized based on their δ value, the categorization reflects well how often they delete /n/ in general.

Figure 1

Speaker-specific predicted average /n/ realization rates preceding consonants, pauses and vowels, by speaker group (top), and histogram of the difference in /n/ realization rate before pauses and before the other contexts for each speaker (bottom).

To evaluate the extent to which the speaker groups correspond to regional groups, we calculated the proportion of inserters and deleters for each of the main regions of birth defined in the SDC. These proportions are visualized in Figure 2. We see that all regions are represented by both inserters and deleters, although some regions are mostly represented by a single group.

Figure 2

Proportions of inserters (Left) and deleters (Right) in the main regions of birth tracked by the Spoken Dutch Corpus: The Central (C), Transitional (T), Northeastern (N), and Southern (S) Regions.

2.2.2. Does morphological category affect /n/ deletion?

To find out whether the pronunciation of Dutch -en words shows a morpho-phonetic effect, we compared the Morphology model to the Baseline model using the WAIC measure. This revealed that the model including the Morphological Category predictor had a much lower WAIC of 28114.46 compared to the WAIC of 28408.38 of the Baseline model. In other words, the Morphology model was estimated to have a lower prediction error, indicating the presence of a morphological effect on /n/ deletion rates.

To investigate the nature of the morphological effect, we used the posterior distributions to estimate the average intercepts of the morphological categories for each of the speaker groups. To show that Phonological Context remains predictive of /n/ deletion when the Morphological Category predictor is included, we also estimated the average intercepts of the three phonological contexts for each of the speaker groups. The effects of both variables are visualized in Figure 3 and summarized in Table 5.

Figure 3

Estimated average effects of Phonological Context (Left) and Morphological Category (Right) on the realization of final /n/ in the Morphology model. We distinguish between consonants, pauses, and vowels as phonological contexts, and between nonsuffix, plural nouns, plural verbs, and infinitive verbs as morphological categories.

Table 5

Posterior distributions of the parameters in the Morphology model of /n/ deletion. All estimates are in log-odds of realized /n/. Estimates for individual levels of the Word and Speaker variables are omitted for the sake of brevity. Shaded cells indicate contrasts with a 95% credible interval that does not include 0.

| Fixed Parameters | Mean | 95% lower boundary | 95% upper boundary |
|---|---|---|---|
| Speech Rate | –0.41 | –0.45 | –0.37 |
| Word Frequency | –0.25 | –0.37 | –0.14 |
| Next Probability | –0.13 | –0.18 | –0.08 |
| Previous Probability | –0.06 | –0.10 | –0.02 |
| Next Informativity | –0.18 | –0.30 | –0.07 |
| Previous Informativity | –0.08 | –0.19 | 0.02 |

| Fixed Parameters by Speaker Group | Inserters: Mean | Inserters: 95% lower | Inserters: 95% upper | Neutrals: Mean | Neutrals: 95% lower | Neutrals: 95% upper | Deleters: Mean | Deleters: 95% lower | Deleters: 95% upper |
|---|---|---|---|---|---|---|---|---|---|
| Phon. Context: C | –0.84 | –2.31 | 0.65 | –0.96 | –2.41 | 0.49 | –1.97 | –3.46 | –0.50 |
| Phon. Context: P | 1.57 | 0.08 | 3.06 | –0.01 | –1.46 | 1.45 | –2.08 | –3.58 | –0.60 |
| Phon. Context: V | –0.21 | –1.70 | 1.30 | –0.43 | –1.89 | 1.06 | –0.44 | –1.93 | 1.04 |
| Morph. Cat.: NS | 0.08 | –1.43 | 1.58 | –0.12 | –1.60 | 1.34 | –0.59 | –2.07 | 0.90 |
| Morph. Cat.: NOUN | 0.46 | –1.04 | 1.96 | –0.11 | –1.57 | 1.35 | –1.34 | –2.83 | 0.15 |
| Morph. Cat.: PL | –0.08 | –1.59 | 1.41 | –0.25 | –1.73 | 1.22 | –0.96 | –2.43 | 0.54 |
| Morph. Cat.: INF | –0.01 | –1.51 | 1.50 | –0.88 | –2.34 | 0.59 | –1.52 | –3.00 | –0.01 |
| Phonological Contrasts | | | | | | | | | |
| C – P | –2.41 | –2.67 | –2.16 | –0.95 | –1.13 | –0.78 | 0.11 | –0.06 | 0.29 |
| V – P | –1.78 | –2.06 | –1.50 | –0.42 | –0.62 | –0.22 | 1.65 | 1.48 | 1.82 |
| C – V | –0.63 | –0.95 | –0.31 | –0.54 | –0.78 | –0.30 | –1.54 | –1.73 | –1.35 |
| Morphological Contrasts | | | | | | | | | |
| NOUN – NS | 0.39 | 0.09 | 0.68 | 0.01 | –0.22 | 0.24 | –0.75 | –0.96 | –0.53 |
| PL – NS | –0.16 | –0.48 | 0.15 | –0.13 | –0.38 | 0.11 | –0.37 | –0.60 | –0.14 |
| INF – NS | –0.09 | –0.41 | 0.24 | –0.76 | –1.01 | –0.50 | –0.93 | –1.16 | –0.68 |
| PL – NOUN | –0.55 | –0.81 | –0.30 | –0.15 | –0.34 | 0.05 | 0.38 | 0.20 | 0.57 |
| INF – NOUN | –0.48 | –0.73 | –0.21 | –0.77 | –0.97 | –0.57 | –0.18 | –0.37 | 0.01 |
| INF – PL | 0.07 | –0.21 | 0.37 | –0.62 | –0.84 | –0.40 | –0.56 | –0.78 | –0.34 |

| Random Parameters | Mean | 95% lower boundary | 95% upper boundary |
|---|---|---|---|
| SD Word | 0.55 | 0.48 | 0.62 |
| SD Speaker: Phon.: C | 1.15 | 0.96 | 1.33 |
| SD Speaker: Phon.: P | 1.11 | 0.90 | 1.28 |
| SD Speaker: Phon.: V | 1.01 | 0.80 | 1.20 |
| SD Speaker: Morph.: NS | 0.36 | 0.07 | 0.60 |
| SD Speaker: Morph.: NOUN | 0.38 | 0.18 | 0.59 |
| SD Speaker: Morph.: PL | 0.30 | 0.02 | 0.54 |
| SD Speaker: Morph.: INF | 0.46 | 0.23 | 0.66 |
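
Because the estimates in Table 5 are on the log-odds scale, a contrast can be read as an odds ratio by exponentiating it. The short sketch below does this for one contrast, the deleters' NOUN – NS contrast; the helper name `odds_ratio` is our own. Note that exponentiating the posterior mean gives only an approximate point summary, whereas the interval boundaries carry over exactly because quantiles are preserved under monotone transformations.

```python
import math


def odds_ratio(log_odds_contrast):
    """Convert a log-odds contrast into an odds ratio."""
    return math.exp(log_odds_contrast)


# Deleters, NOUN – NS contrast from Table 5: mean –0.75, 95% CI [–0.96, –0.53].
mean_or = odds_ratio(-0.75)
ci_or = (odds_ratio(-0.96), odds_ratio(-0.53))
```

Here `mean_or` is roughly 0.47 and `ci_or` roughly (0.38, 0.59): for deleters, the odds of realizing /n/ in plural nouns are estimated at about half the odds in nonsuffix words, with the other predictors held constant.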

To illustrate that the group differences in phonological and morphological effects are strong enough to be noticeable in the raw data, we can look at word forms with multiple morphological functions. The most common doubling of morphological function occurs with plural and infinitive verbs. Figure 4 plots the raw proportions of /n/ realization for words that occur at least 150 times in the combined data of inserters and deleters, and both as plural and infinitive verbs, split by speaker group.

Figure 4

Proportion of /n/ realization for plural and infinitive verbs that occurred at least 150 times in the combined data of inserters and deleters. The proportions of komen ‘(to) come’, blijven ‘(to) stay’ and laten ‘(to) let’ have been highlighted to enable direct comparisons between inserter and deleter patterns.

2.2.3. Does typical phonological context explain morphological effects on /n/ deletion?

To assess whether the effect of morphology on /n/ deletion reported in Section 2.2.2 can be explained as an effect of typical phonological context, we compared the WAIC scores of the Morphology model, the BPP model and the Full model, as shown in Figure 5.

Figure 5

WAIC scores for the Baseline, BPP, Morphology, and Full model. Lower scores indicate better performance.

The comparison in Figure 5 reveals that adding BPP as a predictor to the Baseline model improves the WAIC score (the BPP model has a WAIC that is 48.88 points lower than the Baseline model’s WAIC), but the improvement obtained by adding BPP to the Morphology model is limited (the Full model has a WAIC that is 3.60 points lower than that of the Morphology model). This shows that BPP explains some variation in the data, and that part of this variation is also explained by the morphological function of -en. However, it is clear from the WAIC comparison of the BPP model (28359.50) and the Full model (28110.86) that Morphological Category also explains a considerable amount of variation that is not accounted for by BPP.
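
The arithmetic behind these comparisons can be laid out explicitly. The sketch below uses only the four WAIC scores reported in this section; the dictionary and the helper `improvement` are illustrative names, not part of the original analysis code.

```python
# WAIC scores of the four deletion models, as reported in Section 2.2.
waic_scores = {
    "Baseline": 28408.38,
    "BPP": 28359.50,
    "Morphology": 28114.46,
    "Full": 28110.86,
}


def improvement(simpler, richer):
    """WAIC reduction obtained by moving from the simpler to the richer model."""
    return round(waic_scores[simpler] - waic_scores[richer], 2)


gain_bpp_alone = improvement("Baseline", "BPP")           # BPP added to Baseline
gain_morph_alone = improvement("Baseline", "Morphology")  # Morphology added to Baseline
gain_bpp_given_morph = improvement("Morphology", "Full")  # BPP added to Morphology
gain_morph_given_bpp = improvement("BPP", "Full")         # Morphology added to BPP
```

Laid out this way, the asymmetry is easy to see: Morphological Category lowers the WAIC by several hundred points whether or not BPP is in the model, whereas the gain from BPP shrinks from 48.88 to 3.60 points once Morphological Category is included.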

To investigate the direction of the BPP effect, we used the posterior distribution of the BPP model to estimate the average predicted effect on the realisation of /n/ across the range of BPP values for the inserters and the deleters, respectively, as shown in Figure 6.

Figure 6

Average predicted effect of BPP on the realisation of /n/ for the inserters and the deleters speaker groups in the BPP model. The shaded areas indicate 95% credible intervals.

As Figure 6 shows, deleters are more likely to delete final /n/ in words that frequently occur before a pause, whereas inserters are more likely to produce the /n/ in those words. This is as predicted (see Section 1.4).

In order to show to what extent BPP and Morphological Category explain the same variation, we also estimated posterior predictions for both variables in the Full model, as shown in Figure 7.

Figure 7

Average predicted effect of BPP and Morphological Category on the realisation of /n/ in the Full model. The shaded areas indicate 95% credible intervals.

Compared to Figures 3 and 6, the differences between inserters and deleters are smaller in Figure 7, presumably because BPP and Morphological Category partly explain the same variance in the Full model.

3. Study II: Morphological effects on the duration of Dutch final -en

3.1. Materials and Methods

3.1.1. Speech Data

In this study, we took the dataset described in Section 2.1.1 and narrowed it down to -en tokens in which both the /ə/ and the /n/ were realized. This was done to allow for the simultaneous modelling of the /ə/, /n/ and nasalization durations (see Section 3.1.3). This selection consisted of 8770 tokens, as shown in Table 6.

Table 6

Number of types and tokens by morphological category in the duration dataset.

| Morphological category | Label | Number of Types | Number of Tokens |
|---|---|---|---|
| Nonsuffix | NS | 256 | 1327 |
| Plural noun | NOUN | 1340 | 3406 |
| Plural verb | PL | 600 | 1633 |
| Infinitive verb | INF | 806 | 2404 |

We wanted to see whether the speaker groups that were assigned based on /n/ deletion behaviour also predict speakers’ durational reduction behaviour (see Section 3.1.3). A prerequisite for this comparison is that the dataset contains enough tokens for each of the different speaker groups. This is the case, as shown in Table 7. The original dataset was dominated by deleters (see Table 4), which is precisely the group that loses most of the data as a result of our selection criteria for the durational data. They are still well represented but no longer dominate the data.

Table 7

Number of speakers and tokens by speaker group in the duration dataset.

| Speaker Group | Number of Speakers | Number of Tokens |
|---|---|---|
| inserters | 45 | 3036 |
| neutrals | 86 | 3836 |
| deleters | 190 | 1898 |

3.1.2. Predictors

We modelled the durations of /ə/, /n/ and nasalization with the same predictors of interest that were used in the /n/-deletion study: Morphological Category, Phonological Context, BPP, and Speaker Group. Likewise, this study included the Word and Speaker random variables and all of the control variables from the /n/-deletion study. Please see Section 2.1.2 for details.

One additional control variable, Previous Phonological Context, was included in the present study, because the duration of Dutch schwa is affected by the type of preceding segment (e.g., van Bergem, 1994). This variable distinguished whether the segment preceding final -en was an Approximant, a Fricative, a Liquid, a Nasal, or a Plosive.

3.1.3. Modelling

All of the durational models were multivariate Bayesian regression models. That is, each model had three outcome variables (Schwa Duration, Stop Duration for the nasal stop /n/, and Nasalization Duration), each of which had its own linear model, but the three shared a single error structure. As a consequence, any correlations between the residuals of the outcome variables were accounted for in the model.
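
As a schematic illustration of this structure, a multivariate model with three linear predictors and correlated residuals can be written as below. This is a sketch under stated assumptions rather than the exact specification used here: the Gaussian likelihood, the decomposition of the residual covariance matrix, and all symbols (x_i for the fixed-effect predictors, z_i for the Word and Speaker random-effect design, R for the residual correlation matrix) are introduced for illustration only.

```latex
\begin{aligned}
\mathbf{y}_i &= \bigl(y_i^{\text{schwa}},\; y_i^{\text{stop}},\; y_i^{\text{nas}}\bigr)^{\top}
  \sim \mathcal{N}_3\bigl(\boldsymbol{\mu}_i,\; \boldsymbol{\Sigma}\bigr), \\
\mu_i^{(k)} &= \mathbf{x}_i^{\top}\boldsymbol{\beta}^{(k)} + \mathbf{z}_i^{\top}\mathbf{u}^{(k)},
  \qquad k \in \{\text{schwa},\, \text{stop},\, \text{nas}\}, \\
\boldsymbol{\Sigma} &= \operatorname{diag}(\boldsymbol{\sigma})\;\mathbf{R}\;\operatorname{diag}(\boldsymbol{\sigma}).
\end{aligned}
```

Each outcome k has its own coefficients, while the off-diagonal entries of R capture the residual correlations between the three outcomes for the same token.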

The first aim of the analysis was to establish whether the speaker groups that were assigned in the /n/-deletion study are predictive of durational data as well. To that end, we compared two models. The first one is the Full Model, which contained all of our predictors, including by-Speaker random slopes for BPP, Morphological Category and Phonological Context, and fixed effect interactions between Speaker Group and BPP, Speaker Group and Morphological Category, and Speaker Group and Phonological Context. The second model is the No Groups Model, which contained all predictors in the Full Model except for Speaker Group and its interactions. If the Speaker Groups from the /n/-deletion analysis reflect speaker differences that are also relevant for the durational data, we would expect the Full Model to outperform the No Groups model.

The second aim of the analysis was to investigate whether Morphological Category affected /ə/ duration, /n/ duration and the nasalization of /ə/. We first fitted a Baseline Model, which included all predictors except for those involving BPP and Morphological Category. We compared this model with the Morphology model, which added by-Speaker random intercepts for Morphological Category and an interaction between Speaker Group and Morphological Category. In other words, this part of the analysis was aimed at expanding the evidence for morpho-phonetic effects by testing whether morphological status had gradient effects on the phonetic realization of Dutch final /ən/.

The third and final aim of the durational analysis was to investigate to what extent any morphological effects on the durational aspects of /ən/ could be explained by BPP. To measure whether BPP affected the durations at all, we compared the Baseline Model to the BPP Model, which added by-Speaker random slopes for BPP and an interaction between Speaker Group and BPP. We also compared the BPP Model to the Full Model, which contained both the Morphological Category and BPP predictors, to see whether the added morphological predictors of the Full Model explained any variance that was not already explained by the BPP predictors. In sum, this part of the analysis tested the extent to which the typical phonological context of words could be behind any gradient morpho-phonetic effects on Dutch final /ən/.

3.2. Results

3.2.1. Are the speaker groups applicable to the durational data?

As Figure 8 illustrates, the Full model, with Speaker Group as a predictor, had a lower WAIC (63532.15) than the No Groups model (63599.63). This indicates that the deletion-derived speaker groups also explained a considerable amount of variance in the duration analysis.

Figure 8

WAIC scores of all the models that were fitted to the durational data.

3.2.2. Does morphological category affect durational aspects of final -en?

Figure 8 illustrates that the Morphology model has a lower WAIC (63529.40) than the Baseline model (63643.01), indicating a better fit of the model including Morphological Category. This shows that Morphological Category explains variance in the /n/, /ə/ and/or nasalization duration distributions.

The nature of the Morphological Category effect is visualized in Figure 9, which also shows the effect of the following Phonological Context, and it is summarized in Tables 8, 9 and 10. For /ə/ duration, the inserters and neutrals showed almost no differences between morphological categories, whereas the deleters produced longer /ə/ for plural nouns and infinitive verbs compared to plural verbs and words with nonsuffix -en.

Figure 9

Estimated average effects of Phonological Context (left) and Morphological Category (right) on the durations of /ə/ (top) and /n/ (middle), and the proportion of /ə/ that is nasalized (bottom) in the Morphology model.

Table 8

Posterior distribution of the /ə/ parameters in the Morphology model of -/ən/ duration. All estimates are in scaled log seconds. Estimates for individual levels of the Word and Speaker variables are omitted for the sake of brevity. Shaded cells indicate contrasts with a 95% credible interval that does not include 0.

| Fixed Parameters | Mean | 95% lower boundary | 95% upper boundary |
|---|---|---|---|
| Speech Rate | –0.07 | –0.09 | –0.05 |
| Word Frequency | –0.05 | –0.11 | 0.01 |
| Next Probability | –0.02 | –0.05 | 0.00 |
| Previous Probability | 0.01 | –0.02 | 0.03 |
| Next Informativity | –0.05 | –0.11 | 0.01 |
| Previous Informativity | 0.00 | –0.06 | 0.06 |
| Prev. Phon. Context: Liq. | 0.41 | –0.23 | 1.02 |
| Prev. Phon. Context: App. | 0.08 | –0.57 | 0.70 |
| Prev. Phon. Context: Nas. | –0.18 | –0.83 | 0.43 |
| Prev. Phon. Context: Fri. | –0.06 | –0.70 | 0.54 |
| Prev. Phon. Context: Plo. | 0.02 | –0.61 | 0.63 |

| Fixed Parameters by Speaker Group | Inserters: Mean | Inserters: 95% lower | Inserters: 95% upper | Neutrals: Mean | Neutrals: 95% lower | Neutrals: 95% upper | Deleters: Mean | Deleters: 95% lower | Deleters: 95% upper |
|---|---|---|---|---|---|---|---|---|---|
| Phon. Context: C | –0.24 | –1.04 | 0.60 | –0.21 | –1.05 | 0.62 | –0.22 | –1.07 | 0.57 |
| Phon. Context: P | 0.15 | –0.65 | 0.98 | 0.12 | –0.73 | 0.94 | 0.16 | –0.68 | 0.96 |
| Phon. Context: V | –0.08 | –0.90 | 0.75 | 0.04 | –0.81 | 0.89 | 0.69 | –0.16 | 1.50 |
| Morph. Cat.: NS | –0.03 | –0.82 | 0.77 | –0.06 | –0.84 | 0.74 | –0.00 | –0.81 | 0.81 |
| Morph. Cat.: NOUN | –0.01 | –0.80 | 0.80 | 0.04 | –0.74 | 0.84 | 0.30 | –0.50 | 1.11 |
| Morph. Cat.: PL | –0.06 | –0.85 | 0.74 | 0.03 | –0.75 | 0.83 | –0.02 | –0.81 | 0.80 |
| Morph. Cat.: INF | –0.13 | –0.92 | 0.68 | –0.05 | –0.84 | 0.76 | 0.31 | –0.49 | 1.13 |
| Phonological Contrasts | | | | | | | | | |
| C – P | –0.39 | –0.51 | –0.28 | –0.33 | –0.42 | –0.23 | –0.38 | –0.51 | –0.26 |
| V – P | –0.23 | –0.47 | 0.01 | –0.08 | –0.27 | 0.12 | 0.54 | 0.37 | 0.70 |
| C – V | –0.16 | –0.41 | 0.08 | –0.25 | –0.44 | –0.06 | –0.92 | –1.08 | –0.76 |
| Morphological Contrasts | | | | | | | | | |
| NOUN – NS | 0.02 | –0.11 | 0.15 | 0.10 | –0.01 | 0.21 | 0.31 | 0.17 | 0.44 |
| PL – NS | –0.04 | –0.18 | 0.11 | 0.09 | –0.04 | 0.22 | –0.01 | –0.16 | 0.13 |
| INF – NS | –0.10 | –0.24 | 0.04 | 0.01 | –0.12 | 0.14 | 0.32 | 0.16 | 0.48 |
| PL – NOUN | –0.06 | –0.17 | 0.05 | –0.01 | –0.11 | 0.08 | –0.32 | –0.44 | –0.20 |
| INF – NOUN | –0.13 | –0.22 | –0.02 | –0.09 | –0.19 | 0.01 | 0.01 | –0.12 | 0.15 |
| INF – PL | –0.07 | –0.19 | 0.06 | –0.08 | –0.19 | 0.03 | 0.33 | 0.18 | 0.48 |

| Random Parameters | Mean | 95% lower boundary | 95% upper boundary |
|---|---|---|---|
| SD Word | 0.17 | 0.15 | 0.18 |
| SD Speaker: Phon.: C | 0.25 | 0.18 | 0.32 |
| SD Speaker: Phon.: P | 0.42 | 0.35 | 0.50 |
| SD Speaker: Phon.: V | 0.78 | 0.69 | 0.88 |
| SD Speaker: Morph.: NS | 0.13 | 0.01 | 0.26 |
| SD Speaker: Morph.: NOUN | 0.07 | 0.00 | 0.17 |
| SD Speaker: Morph.: PL | 0.10 | 0.00 | 0.24 |
| SD Speaker: Morph.: INF | 0.18 | 0.07 | 0.31 |

Table 9

Posterior distribution of the /n/ parameters in the Morphology model of -/ən/ duration. All estimates are in scaled log seconds. Estimates for individual levels of the Word and Speaker variables are omitted for the sake of brevity. Shaded cells indicate contrasts with a 95% credible interval that does not include 0.

| Fixed Parameters | Mean | 95% lower boundary | 95% upper boundary |
|---|---|---|---|
| Speech Rate | –0.17 | –0.18 | –0.15 |
| Word Frequency | –0.08 | –0.12 | –0.03 |
| Next Probability | 0.01 | –0.01 | 0.03 |
| Previous Probability | –0.01 | –0.03 | 0.01 |
| Next Informativity | 0.01 | –0.03 | 0.06 |
| Previous Informativity | –0.05 | –0.09 | –0.01 |
| Prev. Phon. Context: Liq. | –0.09 | –0.71 | 0.53 |
| Prev. Phon. Context: App. | –0.16 | –0.79 | 0.47 |
| Prev. Phon. Context: Nas. | –0.11 | –0.73 | 0.52 |
| Prev. Phon. Context: Fri. | –0.22 | –0.85 | 0.41 |
| Prev. Phon. Context: Plo. | –0.13 | –0.76 | 0.50 |

| Fixed Parameters by Speaker Group | Inserters: Mean | Inserters: 95% lower | Inserters: 95% upper | Neutrals: Mean | Neutrals: 95% lower | Neutrals: 95% upper | Deleters: Mean | Deleters: 95% lower | Deleters: 95% upper |
|---|---|---|---|---|---|---|---|---|---|
| Phon. Context: C | –0.49 | –1.34 | 0.32 | –0.45 | –1.26 | 0.37 | –0.49 | –1.31 | 0.32 |
| Phon. Context: P | 0.78 | –0.07 | 1.60 | 0.65 | –0.16 | 1.49 | 0.46 | –0.36 | 1.29 |
| Phon. Context: V | –0.39 | –1.24 | 0.42 | –0.33 | –1.16 | 0.50 | –0.57 | –1.39 | 0.25 |
| Morph. Cat.: NS | 0.04 | –0.75 | 0.82 | –0.01 | –0.79 | 0.79 | –0.01 | –0.80 | 0.75 |
| Morph. Cat.: NOUN | 0.02 | –0.77 | 0.79 | –0.02 | –0.81 | 0.78 | –0.17 | –0.97 | 0.59 |
| Morph. Cat.: PL | –0.11 | –0.89 | 0.66 | –0.08 | –0.87 | 0.72 | –0.22 | –1.01 | 0.54 |
| Morph. Cat.: INF | –0.02 | –0.80 | 0.76 | –0.09 | –0.88 | 0.70 | –0.32 | –1.12 | 0.44 |
| Phonological Contrasts | | | | | | | | | |
| C – P | –1.27 | –1.40 | –1.14 | –1.11 | –1.21 | –1.00 | –0.96 | –1.07 | –0.84 |
| V – P | –1.16 | –1.33 | –0.99 | –0.99 | –1.11 | –0.86 | –1.03 | –1.16 | –0.91 |
| C – V | –0.10 | –0.23 | 0.02 | –0.12 | –0.22 | –0.02 | 0.08 | –0.02 | 0.17 |
| Morphological Contrasts | | | | | | | | | |
| NOUN – NS | –0.02 | –0.12 | 0.09 | –0.01 | –0.10 | 0.08 | –0.16 | –0.26 | –0.06 |
| PL – NS | –0.14 | –0.26 | –0.03 | –0.06 | –0.16 | 0.03 | –0.21 | –0.31 | –0.10 |
| INF – NS | –0.05 | –0.17 | 0.05 | –0.08 | –0.18 | 0.02 | –0.31 | –0.43 | –0.20 |
| PL – NOUN | –0.13 | –0.21 | –0.04 | –0.06 | –0.13 | 0.02 | –0.05 | –0.14 | 0.05 |
| INF – NOUN | –0.04 | –0.11 | 0.04 | –0.07 | –0.15 | 0.00 | –0.15 | –0.25 | –0.05 |
| INF – PL | 0.09 | –0.00 | 0.18 | –0.02 | –0.11 | 0.07 | –0.10 | –0.21 | 0.01 |

| Random Parameters | Mean | 95% lower boundary | 95% upper boundary |
|---|---|---|---|
| SD Word | 0.07 | 0.06 | 0.09 |
| SD Speaker: Phon.: C | 0.23 | 0.17 | 0.30 |
| SD Speaker: Phon.: P | 0.42 | 0.35 | 0.49 |
| SD Speaker: Phon.: V | 0.19 | 0.12 | 0.25 |
| SD Speaker: Morph.: NS | 0.15 | 0.05 | 0.23 |
| SD Speaker: Morph.: NOUN | 0.10 | 0.02 | 0.17 |
| SD Speaker: Morph.: PL | 0.07 | 0.00 | 0.15 |
| SD Speaker: Morph.: INF | 0.14 | 0.05 | 0.23 |

Table 10

Posterior distribution of the nasalization parameters in the Morphology model of -/ən/ duration. All estimates are in scaled log seconds, except for the contrasts, which are in proportions. Estimates for individual levels of the Word and Speaker variables are omitted for the sake of brevity. Shaded cells indicate contrasts with a 95% credible interval that does not include 0.

| Fixed Parameters | Mean | 95% lower boundary | 95% upper boundary |
|---|---|---|---|
| Speech Rate | –0.03 | –0.05 | –0.00 |
| Word Frequency | –0.06 | –0.12 | 0.01 |
| Next Probability | –0.00 | –0.03 | 0.02 |
| Previous Probability | 0.00 | –0.03 | 0.03 |
| Next Informativity | –0.01 | –0.07 | 0.05 |
| Previous Informativity | –0.02 | –0.08 | 0.04 |
| Prev. Phon. Context: Liq. | 0.01 | –0.60 | 0.61 |
| Prev. Phon. Context: App. | –0.05 | –0.67 | 0.57 |
| Prev. Phon. Context: Nas. | 0.36 | –0.25 | 0.96 |
| Prev. Phon. Context: Fri. | –0.10 | –0.70 | 0.51 |
| Prev. Phon. Context: Plo. | –0.12 | –0.72 | 0.49 |

| Fixed Parameters by Speaker Group | Inserters: Mean | Inserters: 95% lower | Inserters: 95% upper | Neutrals: Mean | Neutrals: 95% lower | Neutrals: 95% upper | Deleters: Mean | Deleters: 95% lower | Deleters: 95% upper |
|---|---|---|---|---|---|---|---|---|---|
| Phon. Context: C | 0.09 | –0.72 | 0.92 | 0.05 | –0.80 | 0.88 | –0.04 | –0.87 | 0.80 |
| Phon. Context: P | 0.11 | –0.70 | 0.94 | 0.09 | –0.76 | 0.92 | 0.09 | –0.74 | 0.93 |
| Phon. Context: V | –0.21 | –1.03 | 0.62 | –0.15 | –1.00 | 0.68 | –0.01 | –0.84 | 0.83 |
| Morph. Cat.: NS | –0.01 | –0.85 | 0.80 | 0.02 | –0.78 | 0.82 | 0.07 | –0.71 | 0.87 |
| Morph. Cat.: NOUN | 0.08 | –0.75 | 0.89 | –0.05 | –0.85 | 0.74 | –0.02 | –0.79 | 0.78 |
| Morph. Cat.: PL | –0.03 | –0.86 | 0.78 | 0.08 | –0.73 | 0.88 | –0.11 | –0.89 | 0.69 |
| Morph. Cat.: INF | 0.00 | –0.83 | 0.80 | –0.04 | –0.85 | 0.76 | 0.02 | –0.76 | 0.81 |
| Phonological Contrasts | | | | | | | | | |
| C – P | 0.06 | –0.00 | 0.16 | 0.04 | –0.01 | 0.11 | –0.00 | –0.08 | 0.07 |
| V – P | –0.12 | –0.27 | –0.03 | –0.11 | –0.24 | –0.03 | –0.13 | –0.27 | –0.05 |
| C – V | 0.17 | 0.07 | 0.37 | 0.14 | 0.05 | 0.30 | 0.12 | 0.04 | 0.26 |
| Morphological Contrasts | | | | | | | | | |
| NOUN – NS | 0.04 | –0.02 | 0.14 | –0.05 | –0.14 | 0.00 | –0.09 | –0.21 | –0.02 |
| PL – NS | –0.00 | –0.09 | 0.08 | 0.01 | –0.06 | 0.09 | –0.08 | –0.20 | –0.01 |
| INF – NS | 0.03 | –0.05 | 0.11 | –0.03 | –0.12 | 0.03 | –0.08 | –0.20 | –0.00 |
| PL – NOUN | –0.04 | –0.13 | 0.02 | 0.07 | 0.01 | 0.15 | 0.01 | –0.05 | 0.07 |
| INF – NOUN | –0.02 | –0.08 | 0.03 | 0.02 | –0.02 | 0.08 | 0.02 | –0.05 | 0.09 |
| INF – PL | 0.03 | –0.04 | 0.11 | –0.04 | –0.13 | 0.01 | 0.01 | –0.07 | 0.09 |

| Random Parameters | Mean | 95% lower boundary | 95% upper boundary |
|---|---|---|---|
| SD Word | 0.09 | 0.07 | 0.12 |
| SD Speaker: Phon.: C | 0.15 | 0.08 | 0.22 |
| SD Speaker: Phon.: P | 0.31 | 0.24 | 0.37 |
| SD Speaker: Phon.: V | 0.30 | 0.21 | 0.38 |
| SD Speaker: Morph.: NS | 0.07 | 0.00 | 0.18 |
| SD Speaker: Morph.: NOUN | 0.07 | 0.00 | 0.15 |
| SD Speaker: Morph.: PL | 0.10 | 0.01 | 0.22 |
| SD Speaker: Morph.: INF | 0.06 | 0.00 | 0.15 |

For /n/ duration, the neutrals once again showed no morphological differences, while the inserters and deleters showed different morphological effects. The inserters showed reduced /n/ duration for plural verbs compared to the nonsuffix category. The deleters showed this reduction in /n/ duration for all suffix categories, with the shortest durations for infinitive verbs.

For the proportion of schwa nasalization, the inserters showed no clear differences between morphological categories. The neutrals showed somewhat less nasalization of schwa in plural nouns than in plural verbs. The deleters showed the clearest effect of Morphological Category, with a reduction in schwa nasalization for all suffix categories.

3.2.3. Does typical phonological context explain morphological effects on /ən/ duration?

A comparison between the Baseline model and the BPP model revealed that including the BPP predictor by itself results in a slightly lower WAIC score (63630.59 versus 63643.01). However, as Figure 10 illustrates, the effect is very small: the lowest and highest values of the BPP predictor resulted in estimated duration differences of approximately 1 ms and nasalization proportion differences of 0.01.

Figure 10

Average predicted effects of BPP on the duration of /ə/ and /n/ and proportion of nasalization for the inserters and the deleters speaker groups in the BPP model. The shaded areas indicate 95% credible intervals.

The WAIC comparison between the Full model and the BPP model reveals that adding the Morphological Category predictor to a model that already includes the BPP variable results in a considerably better-fitting model (63532.15 vs. 63630.59). This suggests that the BPP measure does not fully explain the morphological effects on /ən/ duration.

4. Discussion

By investigating the degree of reduction in Dutch word-final /ən/, the current research aimed to achieve two main objectives: 1) provide more evidence for morphological effects on phonetic reduction, and 2) explore whether a potential morpho-phonetic effect could be explained by the phonological contexts the different morphological categories typically occur in. To achieve these goals, we also needed to take differences between speaker groups into account, as the phenomenon under investigation (word-final /ən/ reduction in Dutch) is reportedly subject to large interspeaker variation.

The speakers in our dataset were categorized into three speaker groups. The results in Figure 1 show that the inserters and deleters speaker groups are rather homogeneous in terms of the /n/ deletion patterns they show before consonants, pauses, and vowels. If these patterns had not been homogeneous within speaker groups, we would not have been able to generate predictions about how the typical phonological context of a word should affect /n/ deletion rates for the different speaker groups. After all, the mechanism behind the effect of the typical phonological context relies on a clear contrast in /n/ deletion rates before a pause compared to other contexts. Interestingly, deleters and inserters are not only differentiated in terms of /n/ deletion rate differences between phonological contexts but also in terms of overall /n/ deletion rates. Specifically, deleters are more likely to delete than inserters regardless of phonological context, and before a pause this difference is amplified. In contrast, the neutrals are not as clearly defined in terms of their behaviour in the different phonological contexts. This suggests that our method for categorizing speakers into speaker groups could be improved, either in terms of the algorithm used or the assumptions that were made, for example, the assumption of only three identifiable speaker groups. Regardless of the problems with the neutrals, the clear distinction between the inserters and the deleters formed a solid basis to pursue the main goals of the investigation.

Although this was not one of the goals of our study, we produced Figure 2 to see to what extent our speaker groups corresponded to regional groups, as previous work has used region to characterize interspeaker differences in /n/ deletion (e.g., Goeman, 2001). This figure suggests that the Central and Southern areas have a higher proportion of deleters and a lower proportion of inserters than the Transitional and Northeastern areas. This is in line with previous observations that /n/ is realised more frequently in the Northeastern parts of the Netherlands (e.g., Goeman, 2001). However, the figure also shows that our data contained many more deleters than inserters across all regions. Hence, although the speaker groups line up with expected regional differences to some extent, the region-based and behaviour-based methods of grouping speakers are not interchangeable.

Regarding the first main objective of this study, the deletion study established that the morphological status of Dutch final -en affects the probability that the /n/ is realized (see the WAIC comparison in Figure 5). Crucially, the morphological predictor even explains variance in a statistical model that also contains the direct phonological context as a predictor. These results confirm previous indications of morpho-phonetic effects on Dutch final /ən/. They show that not only the realization of word-final /s/ in English (e.g., Plag et al., 2017; Seyfarth et al., 2018; Schmitz et al., 2021a) but also word-final segments in other languages may show effects of these segments’ morphological functions. Models of speech production therefore need to account for how articulation is informed by a word’s morphological structure.

Our findings also provide the clearest picture yet regarding interspeaker differences in the morphological effect on the production of final /ən/ through a sufficiently large dataset and modern statistical methods (see Figure 3). Both deleters and inserters realize the /n/ more frequently in nonsuffix words than in plural verbs. However, the deleters have much lower /n/ realisation rates for both plural nouns and infinitive verbs than for plural verbs, whereas the inserters realise the /n/ most frequently in plural nouns and have a realisation rate for infinitive verbs that sits in between the rates for nonsuffix words and plural verbs.

The duration study also found evidence for morpho-phonetic effects, as the statistical model including morphological predictors outperformed the baseline model. Importantly, the morphological effects on /n/ duration are similar to those on /n/ deletion: Both deleters and inserters realize longer nasal stops in nonsuffix words than in plural verbs, and while deleters have relatively short /n/ durations for both plural nouns and infinitive verbs, inserters have /n/ durations for these categories that are relatively similar to those in nonsuffix words.

We also found morphological effects on /ə/ and nasalization duration. Deleters’ shorter /n/ productions in plural nouns and infinitives (compared to nonsuffix words) were accompanied by longer /ə/ productions and less nasalization. As such, we can conclude that it is not the /ən/ suffix as a whole that is reduced (the suffix reduction hypothesis of 3a) but rather the /n/ (the /n/-reduction hypothesis of 3b). Hence, our study is not only the first to find that the gradient reduction of Dutch final /ən/ is affected by morphology, but also to show that these gradient effects may be explained as an incomplete form of the categorical deletion effects.

We should also briefly discuss the results of the speakers belonging to the neutrals group. This group showed fewer morpho-phonetic effects across the different analyses. One reason for this result may be that this group is more heterogeneous in terms of their /n/ realization rate in the different phonological contexts (see top row of Figure 1). Moreover, the individual speakers within the neutrals group showed smaller differences in /n/ realization rate across the different phonological contexts (see top row of Figure 1). As a result, it is to be expected that any morpho-phonetic effects driven by typical phonological context were less likely to surface. This does not mean that these speakers are not sensitive to other (context-driven) mechanisms that may result in morpho-phonetic effects. In fact, this group did show a very clear contrast in /n/ realization rate between infinitive verbs and all other categories (see Figure 3). In the current research we did not investigate the potential mechanism behind this effect.

Regarding the second main objective of this study, we investigated whether pronunciations associated with words’ typical phonological context may be the driving factor behind morpho-phonetic effects. To some extent, the results from the Morphology models of /n/ deletion and /n/ duration are in line with such an explanation. As deleters tend to delete /n/ before a pause and inserters tend to realise /n/ before a pause, we would expect the largest deleter-inserter differences in /n/ realisation rates and /n/ duration for plural nouns and infinitive verbs, which occur most frequently before pauses. This is exactly what is visualized in Figure 3 and Figure 9, respectively.

To further investigate the effect of typical phonological context, we investigated the effect of a given word’s frequency before a pause relative to its frequency before consonants or vowels (Before Pause Proportion, BPP). If typical phonological context plays a role, a higher BPP should be associated with higher /n/ realisation rates and longer /n/ for inserters, but it should be associated with lower /n/ realisation rates and shorter /n/ for deleters. While we found no clear BPP effect on /n/ duration, we did find one for /n/ deletion, as shown in Figure 5. Moreover, the interaction plot between BPP and Speaker Group in Figure 6 reveals that the effect on /n/ realisation is in the expected direction for both inserters and deleters. To find out whether the morpho-phonetic effects on /n/ realisation can be completely explained as an effect of typical phonological context, we also compared the BPP model to the Full model, which contained both BPP and Morphological Category as predictors. Figure 5 shows that adding Morphological Category improves the model even if BPP is already included. It follows that typical phonological context, as implemented in the current study, does not fully explain the morpho-phonetic effects on /n/ realisation.

However, the results also show that BPP and Morphological Category explain some of the same variation. Figure 5 shows that, while the addition of the BPP predictor to the Baseline model clearly improves the fit to the data, the addition of this predictor to the Morphology model (resulting in the Full model) only results in a small improvement. This suggests that BPP does not have much explanatory power that is not already contained in Morphological Category. Moreover, inspection of Figure 7 reveals smaller differences between inserters and deleters in the respective BPP and Morphology effects of the Full model than of the BPP and Morphology models. Presumably, the Full model shows these smaller differences because a similar amount of variation is modelled by two predictors instead of one in the BPP and Morphology models.

The finding that words’ typical phonological context can explain a small part of the morpho-phonetic effect on Dutch final /ən/ illustrates that the mechanisms behind morpho-phonetic effects do not necessarily involve abstract morphological categories. It is possible that differences in phonetic realization between morphological categories arise over time simply because words with similar functions occur in similar contexts (be it phonological or otherwise) which are associated with phonetic reduction or enhancement. If we adopt the assumptions made in the literature on frequency (Bybee, 2002) and predictability (Seyfarth, 2014) effects on reduction, previous phonetic realizations of a word influence future realizations of that word. So, if words tend to occur in a reducing context, they are likely also more reduced outside of that context. Taken to its extreme, this theory of phonetic reduction would claim that morpho-phonetic effects are epiphenomena (Brown, 2013), and that if we account for how often words are exposed to all contextual factors that influence phonetic reduction, we would not need to refer to morphological categories to explain differences in phonetic reduction across words.

More research is needed to determine whether other effects exist that, together with the typical phonological context effect, completely explain away morpho-phonetic effects. In the absence of such other effects, the results of both deletion and duration studies remain relevant for models of word production. The deletion results suggest that such models should contain mechanisms by which a word’s morphological status can affect whether a segment is pronounced. Additionally, the duration results suggest that these models should also allow for more subtle durational differences in pronunciation to be influenced by morphological status. As such, the present results contribute to the evidence in favour of models that allow for phonetically detailed representations of words or sub-word structures which can be influenced by the frequency with which different pronunciations are encountered and produced, such as exemplar models (e.g., Bybee, 2001) and discriminative learning models (e.g., Baayen et al., 2019).

In sum, the current research achieved its objectives by providing more evidence for morpho-phonetic effects and by establishing that, at least for Dutch word-final -en, words’ typical phonological contexts can only partially explain morpho-phonetic effects. Our results suggest that morphological status may influence the part of the production process that determines the exact acoustic details with which a word is produced. Moreover, the present results reaffirm the notion that models of word production should allow for phonetic detail in representations that are associated with lexical or morphological meaning.

Additional files

Data, materials and code generated in this research can be accessed here: https://doi.org/10.34973/zgqk-0294.

Acknowledgements

The authors acknowledge the editors and reviewers for their helpful feedback as well as Ingo Plag and Vsevolod Kapatsinski for their very useful comments on an earlier version of this research.

Funding Information

This research was funded by the Deutsche Forschungsgemeinschaft (Research Unit FOR2373 ‘Spoken Morphology’, Project ER 547/1-1 ‘Dutch morphologically complex words: The role of morphology in speech production and comprehension’), which we gratefully acknowledge.

Competing interests

The authors have no competing interests to declare.

Authors’ contributions

TZ, LtB and ME contributed to the conception and design of the research. TZ was responsible for data processing, analysis and the first draft of the manuscript. LtB and ME revised and provided feedback on the initial draft.

Notes

  1. To make this claim, one would have to compare the model described in E. Brown (2013) to a model without the Word Class predictor using a measure like the F-statistic or AIC.

References

Baayen, R. H., Chuang, Y.-Y., Shafaei-Bajestan, E., & Blevins, J. P. (2019). The discriminative lexicon: A unified computational model for the lexicon and lexical processing in comprehension and production grounded not in (De)composition but in linear discriminative learning. Complexity, 2019, 1–39. http://doi.org/10.1155/2019/4895891

Bell, A., Brenier, J. M., Gregory, M., Girand, C., & Jurafsky, D. (2009). Predictability effects on durations of content and function words in conversational English. Journal of Memory and Language, 60(1), 92–111.  http://doi.org/10.1016/j.jml.2008.06.003

Brown, E. (2013). Word classes in studies of phonological variation: Conditioning factors or epiphenomena. In Selected proceedings of the 15th Hispanic linguistics symposium (pp. 179–186). Cascadilla Press. https://lccn.loc.gov/2015413085

Brown, E. (2018). Cumulative exposure to phonetic reducing environments marks the lexicon. In K. A. Smith & D. Nordquist (Eds.), Functionalist and Usage-Based approaches to the study of language (pp. 127–153). John Benjamins Publishing Company.  http://doi.org/10.1075/slcs.192

Brown, E. K. (2018). The company that word-boundary sounds keep. In K. A. Smith & D. Nordquist (Eds.), Functionalist and Usage-Based approaches to the study of language (pp. 108–125). John Benjamins Publishing Company.  http://doi.org/10.1075/slcs.192

Bybee, J. (2001). Phonology and Language Use. Cambridge University Press. http://doi.org/10.1017/CBO9780511612886

Bybee, J. (2002). Word frequency and context of use in the lexical diffusion of phonetically conditioned sound change. Language variation and change, 14(3), 261–290.  http://doi.org/10.1017/S0954394502143018

Cho, T., Kim, D., & Kim, S. (2017). Prosodically-conditioned fine-tuning of coarticulatory vowel nasalization in English. Journal of Phonetics, 64, 71–89.  http://doi.org/10.1016/j.wocn.2016.12.003

Dutch Language Institute. (2021). Corpus Hedendaags Nederlands (Version 3.0) [Online Service]. http://hdl.handle.net/10032/tm-a2-s8

Ganzeboom, M. (2018). CLST ASR Forced Aligner. https://lst.cls.ru.nl/clst-asr/doku.php?id=forced-aligner

Goeman, T. (2001). Morfologische Condities op n-behoud en n-deletie in dialecten van Nederland. Taal & Tongval, 14, 52–88.

Hanique, I., & Ernestus, M. (2012). The role of morphology in acoustic reduction. Lingue e linguaggio, 11(2), 147–164. http://doi.org/10.1418/38783

Hartigan, J. A., & Wong, M. A. (1979). Algorithm AS 136: A K-means clustering algorithm. Applied Statistics, 28, 100–108.  http://doi.org/10.2307/2346830

Hendrickx, I., van den Bosch, A., van Gompel, M., & van der Sloot, K. (2016, June). Frog, a natural language processing suite for Dutch (Reference Guide). Radboud University Nijmegen. https://github.com/LanguageMachines/frog/blob/master/docs/frogmanual.pdf

Levelt, W. J., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and brain sciences, 22(1), 1–38.  http://doi.org/10.1017/S0140525X99001776

McElreath, R. (2020). Statistical Rethinking: A Bayesian course with examples in R and Stan (2nd ed.). Boca Raton: CRC Press.  http://doi.org/10.1201/9780429029608

Oostdijk, N. (2002). The design of the Spoken Dutch Corpus. In P. Peters, P. Collins, & A. Smith (Eds.), New Frontiers of Corpus Research (pp. 105–112). Rodopi.  http://doi.org/10.1163/9789004334113_008

Plag, I., Homann, J., & Kunter, G. (2017). Homophony and morphology: The acoustics of word-final S in English. Journal of Linguistics, 53(1), 181–216.  http://doi.org/10.1017/S0022226715000183

Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlíček, P., Qian, Y., Schwarz, P., Silovský, J., Stemmer, G., & Veselý, K. (2011). The Kaldi speech recognition toolkit. In IEEE 2011 workshop on automatic speech recognition and understanding. IEEE Signal Processing Society.

Raymond, W. D., Brown, E., & Healy, A. F. (2016). Cumulative context effects and variant lexical representations: Word use and English final t/d deletion. Language Variation and Change, 28(2), 175–202.  http://doi.org/10.1017/S0954394516000041

Roelofs, A. (2000). WEAVER++ and other computational models of lemma retrieval and word-form encoding. In Aspects of language production (pp. 71–114). Psychology Press.  http://doi.org/10.4324/9781315804453

Schmitz, D., Baer-Henney, D., & Plag, I. (2021a). The duration of word-final /s/ differs across morphological categories in English: Evidence from pseudowords. Phonetica, 78(5–6), 571–616.  http://doi.org/10.1515/phon-2021-2013

Schmitz, D., Plag, I., Baer-Henney, D., & Stein, S. D. (2021b). Durational differences of word-final /s/ emerge from the lexicon: Modelling morpho-phonetic effects in pseudowords with linear discriminative learning. Frontiers in Psychology, 12, 680889.  http://doi.org/10.3389/fpsyg.2021.680889

Seyfarth, S. (2014). Word informativity influences acoustic duration: Effects of contextual predictability on lexical representation. Cognition, 133(1), 140–155.  http://doi.org/10.1016/j.cognition.2014.06.013

Seyfarth, S., Garellek, M., Gillingham, G., Ackerman, F., & Malouf, R. (2018). Acoustic differences in morphologically-distinct homophones. Language, Cognition and Neuroscience, 33(1), 32–49.  http://doi.org/10.1080/23273798.2017.1359634

Sóskuthy, M., & Hay, J. (2017). Changing word usage predicts changing word durations in New Zealand English. Cognition, 166, 298–313.  http://doi.org/10.1016/j.cognition.2017.05.032

Tang, K., & Shaw, J. A. (2021). Prosody leaks into the memories of words. Cognition, 210, 104601.  http://doi.org/10.1016/j.cognition.2021.104601

Tomaschek, F., Plag, I., Ernestus, M., & Baayen, R. H. (2021). Phonetic effects of morphology and context: Modeling the duration of word-final S in English with naïve discriminative learning. Journal of Linguistics, 57(1), 123–161.  http://doi.org/10.1017/S0022226719000203

van Bergem, D. R. (1994). A model of coarticulatory effects on the schwa. Speech Communication, 14, 143–162.  http://doi.org/10.1016/0167-6393(94)90005-1

van Oostendorp, M. (2001). Nasal consonants in variants of Dutch and some related systems. Neerlandistiek, 2001. https://dspace.library.uu.nl/bitstream/handle/1874/28504/article.pdf

Van de Velde, H. (1996). Variatie en verandering in het gesproken Standaard-Nederlands (1935–1993) [Doctoral dissertation, Radboud University]. https://hdl.handle.net/2066/146159

Van de Velde, H., & van Hout, R. (1998). Dangerous aggregations. A case study of Dutch (n) deletion. Papers in Sociolinguistics, 137–147.

Van de Velde, H., & van Hout, R. (2000). N-deletion in reading style. Linguistics in the Netherlands, 17(1), 209–219. http://doi.org/10.1075/avt.17.20van

Van de Velde, H., & van Hout, R. (2001). Sprekertypologie met betrekking tot de realisering van de slot-n in het Standaard-Nederlands. Taal & Tongval, 14, 89–112.

Van de Velde, H., & van Hout, R. (2003). De deletie van de slot-n. Nederlandse taalkunde, 8(2), 93–114.

Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432.  http://doi.org/10.1007/s11222-016-9696-4