1. Introduction

External sandhi processes, where a phonological alternation is triggered across word boundaries, have been subject to extensive study, particularly with respect to locality restrictions on their application and the implications this has for theories of speech planning (see Wagner, 2012 on the Production Planning Hypothesis, and more recently Kilbourn-Ceron, 2017; Tamminga, 2018). However, formal accounts of how these processes exhibit sensitivity to phrasal boundaries often fail to capture the various ways in which such an effect may be conditioned. The collinearity between boundary phenomena such as pause and phrase-final segmental lengthening poses a serious problem for research into the mechanisms conditioning such effects: Are they conditioned directly by adjacency to prosodic boundaries of particular strengths, or do they reflect a more general sensitivity to segmental duration or pause? This study seeks to disentangle the close relationship between these factors, and does so by investigating one particular case of external sandhi that has often been overlooked in variationist linguistics.

The variable presence of post-nasal [ɡ] in words such as sing [sɪŋɡ]∼[sɪŋ] and wrong [ɹɒŋɡ]∼[ɹɒŋ] is a characteristic feature of the varieties of English spoken in the North West and West Midlands of England.1 This phenomenon, which Wells (1982, p. 365) refers to as ‘velar nasal plus,’ has been documented in a number of dialectological handbooks (e.g., Hughes, Trudgill, & Watt, 2012; Trudgill, 1999; Wakelin, 1984) but has scarcely been investigated under the variationist paradigm. As a result of this, while its diachronic pathway of change has been explored in detail (see Bermúdez-Otero, 2011; Bermúdez-Otero & Trousdale, 2012), synchronic patterns of variation in velar nasal plus remain comparatively understudied.

This paper provides evidence that variation in [ŋɡ] clusters, hereafter denoted by (ng) using standard sociolinguistic convention, is less stable than previously thought; specifically, the behaviour of (ng) in pre-pausal position appears to be undergoing change in apparent time, whereby younger speakers are reanalyzing this environment as one that favours [ɡ]-presence. The primary goal of this paper is to investigate the mechanisms underlying this innovation, specifically to disentangle the collinearity between three factors that on the surface appear to condition this effect: segmental duration, prosodic boundary strength, and the presence/duration of a following pause. In doing so, this study adds to a growing body of evidence outlining how probabilistic lenition processes behave before phrasal boundaries, and its results have implications for ongoing research into the conditioning factors of external sandhi.

Drawing upon production data from an elicitation task, it is shown that the probability of surface [ɡ]-presence is most strongly correlated with the duration of pause that follows it, independent of the word’s position in the utterance or intonational phrase. The presence of a following pause is also highly collinear with the duration of the preceding nasal due to the effects of pre-boundary segmental lengthening, but the former is a much stronger predictor of the variation in [ɡ]-presence. Thus, velar nasal plus in northern English dialects shows no evidence of direct reference to segmental duration (cf. Lavoie, 2001), and there is only weak evidence of sensitivity to phrasal prosodic categories (cf. Nespor & Vogel, 1986); rather, the results of this study emphasize the importance of the temporal relationship between the target and trigger in external sandhi processes.

The structure of this paper is as follows: Section 2.1 introduces velar nasal plus and outlines the current body of knowledge regarding how its patterns of variation are structured along social and language-internal dimensions; Section 2.2 provides a summary of the literature on how pausal boundaries affect other probabilistic external sandhi processes, and highlights a number of ways in which the conditioning of external sandhi has been accounted for in phonological theory; the discussion of pre-boundary lengthening in Section 2.3 foregrounds the collinearity issues explored in this paper, whose research goals are then re-stated in Section 2.4.

The methodology undertaken for this study is outlined in Section 3, detailing the methods of data collection and in particular how the elicitations were carefully designed in order to invoke different magnitudes of pre-boundary lengthening. The results of this study are split into two subsections: Section 4.1 presents evidence from sociolinguistic interviews of a change in apparent time with respect to the rates of [ɡ]-presence pre-pausally, and Section 4.2 addresses the primary goal of this paper by exploring how this innovation is represented in speakers’ grammars through analysis of an elicited reading task. Although the focus of this paper is to uncover the precise mechanisms that condition this innovation, discussed in Section 5.1, part of the discussion is also dedicated to addressing the social and/or internal factors that actually motivate this change; in Section 5.2, a number of possibilities are proposed, specifically whether this diachronic change reflects a shift in the social meaning and evaluation of the local form, or stems from the inherent variability of external sandhi processes compared with word-internal phenomena.

2. Background

2.1. Velar nasal plus

It should be pointed out that post-nasal [ɡ] was once present across all varieties of English, before it began to undergo deletion in the Late Modern English period. Bermúdez-Otero and Trousdale (2012), drawing upon reports by eighteenth-century orthoepist James Elphinston as discussed by Garrett and Blevins (2009), provide a particularly enlightening account of this change. They show how the phonological /ɡ/-deletion rule progressed through the grammar such that in varieties of Present Day English, [ŋɡ] clusters are only ever present pre-vocalically in monomorphemic or root-based items such as finger or elongate, in addition to a small set of lexically-listed exceptions (the comparative and superlative forms of strong, long, and young).

Although this coda-targeting deletion rule ran to completion in most varieties of English, the non-coalesced [ŋɡ] form was not lost everywhere; variation in (ng) still exists today in the varieties spoken in the North West and West Midlands of England. Although we know very little about the synchronic variation of (ng) in these communities, the presence of post-nasal [ɡ] is well documented in the dialectological literature (e.g., Hughes et al., 2012; Trudgill, 1999; Wakelin, 1984). It has been documented in Birmingham (Thorne, 2003), Cannock (Heath, 1980), Liverpool (Knowles, 1973), West Wirral (Newbrook, 1999), Manchester (Bailey, 2015; Schleef, Flynn, & Ramsammy, 2015), and in Sandwell and the surrounding Black Country (Asprey, 2015; Mathisen, 1999). These areas all fall within the North West or West Midlands of England, corresponding with the Survey of English Dialects isogloss (Orton, Sanderson, & Widdowson, 1978) as well as more recent dialectological surveys (MacKenzie, Bailey, & Turton, 2017). However, these studies do not go beyond pointing out the presence of this form, and many in fact do not acknowledge that its presence is variable in those communities in which it is attested, let alone explore the factors that condition such variation. With many of them also relying on impressionistic and auditory analysis, variation in (ng) has simply not been subject to the same sociophonetic scrutiny as other variables.

While variation in (ng) does historically stem from a deletion rule, it is possible that at this point in time the synchronic system does not work that way, and that some tokens of post-nasal [ɡ] surface instead from an insertion process. Determining whether or not this is the case is beyond the scope of this paper, and as such the subsequent discussion of (ng) variation will remain theory neutral, referring only to presence or absence of [ɡ] and not to the process assumed to underpin this variation.

The observation that [ɡ]-presence is favoured before pause, with which this paper is primarily concerned, has not been discussed explicitly in other studies. However, the observation that (ng) shows strong stylistic stratification could provide supportive evidence for this effect; both Mathisen (1999) and Bailey (2015) report high rates of [ɡ]-presence in word-list elicitations. The conventional and most immediate interpretation of this is of course that [ɡ]-presence is considered the ‘prestige’ form and that this style-shifting simply reflects adherence to this norm in more conscious speech styles. However, it should be noted that these word-list elicitation tasks conflate two things: formality, and phonological environment. In other words, do we find more [ɡ]-presence in word-list elicitations because this form is considered the standard and is therefore more frequent in formal discourse styles, or is it actually because in this style the tokens of (ng) are elicited with clear pauses and prosodic breaks between each item? It is of course possible that the high rate of word-list [ɡ]-presence is in fact attributable to both. The former explanation presupposes that forms with the post-nasal [ɡ] are indeed considered prestigious, but the only study to investigate the evaluation of [ŋɡ] shows no evidence that this is the case (Newbrook, 1999).

A number of studies seem to suggest that the local non-coalesced form, in which post-nasal [ɡ] is present, is increasing in popularity with younger speakers, though few actually provide quantitative evidence in support of such claims. Asprey (2007, p. 90) reports that the presence of [ɡ] is “linked to the younger generations” in the Black Country, and this association between [ŋɡ] and youth speech is echoed by others (see Chinn & Thorne, 2001; Wakelin, 1984). Mathisen’s (1999) work in Sandwell in the West Midlands does, however, provide an empirical grounding to such claims; this increase, described as a ‘revitalization’ of this local form, is being led by young women and the working classes in particular. A preference for velar nasal plus among the working classes is corroborated by Thorne (2003, p. 121), and an increase in its use in apparent time is also found in the speech community of Wilmslow, Cheshire (Watts, 2005, p. 173).

2.2. Boundary effects on other external sandhi processes

Since very little work has been carried out on the language-internal factors influencing (ng), specifically its sensitivity to phonological and prosodic environment and its behaviour pre-pausally, we can instead turn to comparable external sandhi processes that have been subject to more extensive variationist study. One such example is /t,d/-deletion in varieties of English: the reduction of word-final consonant clusters ending with a coronal stop e.g., just [dʒʌst]∼[dʒʌs], proved [pɹuːvd]∼[pɹuːv]. This is remarkably well-studied, having been attested across the world’s varieties of English, and its variation shows similar patterning to (ng). Both involve segmental presence/absence in word-final consonant clusters, and both are sensitive to morphological and syntactic structure in ways consistent with a cyclic analysis. Guy (1991) adopts a Lexical Phonology framework in accounting for the morphological effect on /t,d/-deletion, whereby deletion is less likely for past tense items due to the targeted /t,d/ segment appearing later in the derivation, while diachronic and synchronic accounts of (ng) have been combined under the life cycle of phonological processes (see Bermúdez-Otero & Trousdale, 2012 on the diachronic process of /ɡ/-deletion, and Bailey, 2016b on the synchronic implications that follow from this analysis).

Most importantly for the present study, both processes show sensitivity to the immediate phonological context and do so in an identical manner: [ɡ]-presence is more likely pre-vocalically than pre-consonantally (Knowles, 1973; Upton, Sanderson, & Widdowson, 1987; Watts, 2005), and the same pattern of variation has been shown for /t,d/-deletion in countless studies (e.g., Tamminga, 2016 on Philadelphia English, Baranowski & Turton, 2015 on Manchester English). In fact, Tagliamonte and Temple (2005) claim that this is consistently the strongest predictor of /t,d/-deletion in all varieties of English in which it has been studied. What is not so consistent, however, is how coronal stop deletion behaves pre-pausally. In some varieties, following pauses are said to inhibit deletion even more so than following vowels, e.g., York (Tagliamonte & Temple, 2005), Philadelphia (Guy, 1980), and Chicano English (Santa Ana, 1996), while in others the deletion rate pre-pausally is higher (see Bayley, 1994 on Tejano English and Hazen, 2011 on Appalachian English). For some speakers, particularly those of African American Vernacular English, deletion in pre-pausal environments can be as high as, and sometimes higher than, in pre-consonantal position (e.g., Fasold, 1972 in Washington D.C. and Guy, 1980 in New York).

More recent studies have done away with a categorical coding of pause presence/absence altogether, and instead incorporated pause duration as a gradient factor; Tanner, Sonderegger, and Wagner (2017) show how pause duration, used as a proxy of boundary strength, modulates the effect of following segments such that the influence of a following vowel or consonant on the application of /t,d/-deletion is neutralized when a long pause (100 ms or greater) separates them from the preceding /t,d/ cluster. They argue that this behaviour lends empirical support to the production planning hypothesis (Wagner, 2012): The stronger the prosodic or syntactic boundary between constituents, the less likely it is that the following segmental material has been planned, and as such it can have no effect on the realization of the preceding coronal stop. This has also been recently explored by Tamminga (2018), who finds a similar interaction between the magnitude of the following segment effect and the strength of the syntactic juncture between the target and trigger.

Formal accounts of external sandhi conditioning, specifically the mechanisms that trigger this sensitivity to phonological environment, have also often focused on /t,d/-deletion. A number of explanations of the ‘following segment effect’ have been proposed, with the goal of capturing not just the consistent observation that deletion is more likely pre-consonantally than pre-vocalically, but also the variability of deletion pre-pausally. It has been argued (e.g., Guy, 1980) that the effect stems from the possibility of phrase-level resyllabification, where word-final pre-vocalic /t,d/ variably attaches as an onset to the following word and thus avoids deletion; however, this has been disputed on the grounds that the phonetic realization of word-final pre-vocalic /t,d/ when present on the surface is not comparable to that of a canonical word-initial /t,d/ even though the former is argued to be in onset position (Labov, 1997).

Alternative explanations make reference to the Obligatory Contour Principle (J. J. McCarthy, 1986; Yip, 1988) by highlighting differences in feature similarity between the /t,d/ and a following consonant, liquid, glide, or vowel. Crucially, the inter-dialectal variation in how /t,d/-deletion behaves pre-pausally stems from the fact that the pre-pausal environment by its very nature does not fit into the above hierarchy and is therefore “susceptible to differing analyses by different speakers or dialects” (Guy, 1980, p. 27).

Coetzee (2004) offers yet another proposal, instead relying on licensing by cue (Steriade, 1997) and how these perceptual cues for identifying the place and manner of articulation of stops (namely, their release and also the formant transitions into a following vowel) are realized before consonants, vowels, and pauses. Such an account simply has to stipulate inter-dialectal differences in the ranking of constraints, or alternatively in the phonetic realization of pre-pausal consonants, to capture the difference between varieties in how pre-pausal /t,d/ behaves. Whatever the nature of this sensitivity to the phonological environment, there is ample evidence to suggest that the effect of a following pause on /t,d/-deletion is open to differing analyses between speech communities. We find similar inter-dialectal variation in /s/-weakening across varieties of Spanish; this rule is yet another example of a coda-targeting lenition process, in this case debuccalization from /s/ to [h], where the effect of a following pause is not universal. In standard varieties of Argentinian Spanish, /s/-weakening is blocked pre-pausally (see Kaisse, 1996), contrasting with Caribbean varieties where weakening shows no such sensitivity to pause (see Harris, 1983).

These processes are uncontroversially leniting and therefore any comparison with (ng), which as discussed earlier could conceivably be a case of synchronic [ɡ]-insertion, should be taken with some degree of caution. However, it remains the case that there are clear parallels between these three processes with respect to their phrase-level conditioning: The varying segments ([ɡ], [t]/[d], and [s]) are favoured before vowel-initial words, disfavoured before consonant-initial words, and show unusual and inconsistent behaviour pre-pausally. In the case of /t,d/-deletion and /s/-weakening, this registers itself as differences in pausal effects between different varieties, and in the case of (ng) as diachronic instability, as will be illustrated in Section 4.1.

2.3. Pre-boundary lengthening

One often overlooked aspect of how pauses influence variable linguistic phenomena is the way in which they affect suprasegmental features, specifically phonetic duration. It has been observed cross-linguistically that segments in pre-boundary position are longer in duration than those not adjacent to a prosodic boundary. Many reports focus on Indo-European languages (see Lehiste, Olive, & Streeter, 1976 on English; Lindblom, 1968 on Swedish; Delattre, 1966 on French, Spanish, and German), but Hockey and Fagyal (1998) also report it for Hungarian of the Finno-Ugric family, despite this language having phonemic length distinctions.

It is generally considered that pre-boundary lengthening is triggered directly by finality in a prosodic constituent, with the magnitude of lengthening correlated with the size of the constituent in the phonological hierarchy (Gussenhoven & Rietveld, 1992; Wightman, Shattuck-Hufnagel, Ostendorf, & Price, 1992). However, this has recently been disputed by Feldscher and Durvasula (2017), who instead propose that lengthening is triggered directly by pause. There is also evidence indicating language-specific implementations of lengthening, possibly influenced by the role of duration in other areas of the grammar, which suggests that the magnitude of pre-boundary lengthening is sensitive to factors other than just the prosodic hierarchy (Cho, 2016; Turk, 2012). The exact typology of constituents within this hierarchy is also subject to debate, but there is widespread agreement on the ‘major’ categories above the word-level, as well as their relative ordering: The Utterance (U) is higher than the Intonational Phrase (IP), which in turn is higher than the Phonological Phrase (PPH) (Gussenhoven, 2002; Selkirk, 1978).

Given that stronger boundaries elicit longer pauses and greater segmental lengthening, the collinearity between these three factors raises questions regarding the nature of these reported ‘pre-pausal’ effects. What if the effects of a following pause sometimes reflect something more granular, i.e., sensitivity to duration? This was explored by Sproat and Fujimura (1993) in their study of /l/-darkening; they argue that contrary to earlier claims, /l/-darkening is gradient in nature and triggered by a purely durational mechanism in which the darkness of the /l/ is positively correlated with the duration of the rime. Although more recently it has been shown that this is an oversimplification, with ultrasound tongue imaging revealing both a categorical and gradient process of darkening (see Turton, 2014, 2017), it is nevertheless the case that the gradient process of darkening is correlated with duration.

As with the parallels drawn between (ng) and /t,d/-deletion in the preceding section, I do not mean to suggest that (ng) and /l/-darkening are comparable variables; this is particularly important in the case of /l/ given that it is not only uncontroversially leniting but also of a non-deleting type. However, regardless of the similarities and differences between these processes, it is possible that the same durational mechanism applies in the case of (ng), even if it is interpreted as insertion. Given that such an insertion process only requires a slight change in gestural timing, where the velum is raised before rather than after cessation of the oral gesture, it would hardly be surprising to find it showing sensitivity to the preceding nasal duration.

2.4. Research questions

In light of the current knowledge summarized in the previous section, this study aims to accomplish two things: to provide evidence of a hitherto-unreported change in progress towards increasing [ɡ]-presence among young speakers, restricted to pre-pausal contexts, and then to resolve the collinearity between various boundary phenomena by investigating three related prosodic factors that potentially condition this change.

Solving this collinearity issue has wide implications for a number of theoretical approaches that make predictions with respect to what should condition such an effect. A classical Prosodic Phonology approach to external sandhi processes (Nespor & Vogel, 1986) predicts that the conditioning environment is defined by the categories of the prosodic hierarchy itself, e.g., when final in the intonational phrase (IP) or utterance (U). If this were the case, we should find a stark contrast in [ɡ]-presence between domain-medial and domain-final positions, and a result in which this dichotomy provides the best fit to the observed variation would lend support to this theory.

On the other hand, there are theories of lenition (e.g., Lavoie, 2001) that highlight the importance of durational factors over the abstract categories proposed by Prosodic Phonology. These accounts claim that the primary phonetic manifestation of weakening is shorter segmental duration; as such, they would predict that it is phonetic duration that directly influences [ŋɡ] variation, such that the probability of [ɡ]-presence is correlated with the duration of the syllable coda or rime.

Finally, recent years have seen a rise in theories that emphasize the psycholinguistic processing of language, such as the Production Planning Hypothesis (Wagner, 2012); under these accounts, the most important factor in conditioning external sandhi processes is the temporal relationship between target (in this case, word-final [ŋɡ]), and trigger (the following consonant-initial word), thus motivating the inclusion of intervening pause duration in this analysis. It may be the case that this plays the biggest role in conditioning this case of external sandhi, whereby the presence (and possibly duration) of a pause following the (ng) token has a direct impact upon the probability that the [ɡ] is present on the surface, independent of prosodic position or phonetic duration.

It is important to consider these three factors separately at both the conceptual and empirical level, despite the strong collinear relationship between them. Although it has been shown that pause is the most important acoustic cue to the perception of intonational phrasing (see Mo, 2010; Swerts, 1997; Yang, Shen, Li, & Yang, 2014; Zhang, 2012), it is not mandatory to mark a boundary with pause (Krivokapić & Byrd, 2012), which results in cases where this collinearity breaks down. In showing how the effects of pause interact with prosodic position in Japanese high-vowel devoicing, Kilbourn-Ceron (2017) highlights the importance of teasing apart such factors for variables that show apparent ‘pre-pausal’ behaviour, and there is evidence from Argentinian Spanish that (a) IP boundaries can be produced without pause and (b) pauses can occur within IPs (Kaisse, 1996).

This paper seeks to uncover the relative contributions made by these three boundary phenomena (prosodic boundary strength, segmental duration, and pause presence/duration) in boosting [ɡ]-presence in northern British English, and by doing so will contribute to our knowledge of how external sandhi processes operate in pre-boundary position.

3. Methodology

This study takes a two-pronged approach in answering the research questions posed in the preceding section, and it does so by drawing upon two complementary sources of data: a collection of sociolinguistic interviews, and a follow-up elicitation task with a similar population sample. These two methods of data collection have complementary strengths and weaknesses, and an analysis that makes use of both is therefore better equipped to provide an accurate description of the variation. The interviews contain naturalistic language that is often the subject of variationist analysis; these will be used to provide empirical evidence of a change in progress among North Western subjects. The elicitations, on the other hand, allow for careful control over the environments in which the dependent variable appears, which is a crucial component of this investigation into the behaviour of (ng) at various linguistic boundaries; these will be used to provide insight into factors that condition the aforementioned change.

All recordings were made using a Sony PCM-M10 recorder and a lavalier microphone attached to the participant, saved at a 44.1 kHz sampling rate in uncompressed WAV format.

3.1. Sociolinguistic interviews

The naturalistic component of the data consists of 32 sociolinguistic interviews, most of which took place during a two-year period from 2015 to 2017. In the following sub-sections I provide detail on the demographics of the participants and the interview process itself.

3.1.1. Participants

The population sample consists of 32 speakers (17 female; 15 male), all of whom were born in the North West of England with a large majority born and raised in Greater Manchester. Their dates of birth range from 1907 to 1998, with two interviews conducted in 1971 included to provide extra time depth to the apparent time analysis of (ng). Socioeconomic status was controlled for by only interviewing upper working class speakers, with this classification based broadly on occupation following the methods of Baranowski (2017, p. 303) in Manchester.

3.1.2. Task

The sociolinguistic interviews were conducted one-on-one with the participants. Following guidance from Tagliamonte (2006, pp. 37–49), the interviews typically lasted for approximately one hour and took the recommended form of ‘hierarchically-structured’ topical modules such as childhood, work and family life, the local community, travel etc. (Labov, 1984, p. 33); these topics consisted of open-ended questions, many of which were designed to elicit narratives of personal experience, said to provide the clearest access to a speaker’s vernacular and minimize the effects of the observer’s paradox (Labov, 2010). In total, these interviews yielded 1526 tokens of (ng).

These interviews were conducted as part of a wider variationist investigation of (ng), and as such they were also coded for a number of linguistic factors such as the immediate preceding/following phonological context, word frequency, speech rate, and part of speech, among others. Tokens were coded as being ‘pre-pausal’ when final in an ELAN breath group; broadly speaking these tokens are followed by a period of silence lasting approximately 100 ms or longer.
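The duration-based description of ‘pre-pausal’ tokens can be approximated as a simple threshold rule. The following is a minimal sketch only: the actual coding was based on ELAN breath groups, and the function name, the argument names, and the treatment of the 100 ms figure as a hard cut-off are all assumptions made for illustration.

```python
from typing import Optional

# Assumption: the approximate 100 ms figure from the text, used here as a hard threshold.
PAUSE_THRESHOLD_S = 0.100


def is_pre_pausal(token_end: float,
                  next_onset: Optional[float],
                  threshold: float = PAUSE_THRESHOLD_S) -> bool:
    """Return True when the silence following a token meets the threshold.

    token_end  -- offset of the (ng) word, in seconds
    next_onset -- onset of the next word in seconds, or None if no speech follows
    """
    if next_onset is None:
        # Nothing follows the token, so it is treated as pre-pausal.
        return True
    return (next_onset - token_end) >= threshold
```

Under this sketch, a token ending at 1.20 s and followed by speech at 1.45 s would count as pre-pausal, whereas one followed by speech at 1.23 s would not.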

3.2. Elicitations

While the conversational data provides a reliable insight into how (ng) varies in naturalistic speech, an elicitation task is required to tease apart the various collinear boundary phenomena that potentially condition this variation. For this elicitation task, subjects were asked to read out a list of sentences as naturally as possible and at their own pace. Each sentence contained exactly one token of (ng) before a particular linguistic boundary, detailed in Section 3.2.2, and they were presented one at a time on a laptop screen. In the following sub-sections I provide further information about the sample population and the design of elicitation stimuli.

3.2.1. Participants

The elicitation task was conducted with 30 speakers from the North West of England, many of whom were also subjects of the sociolinguistic interviews detailed in Section 3.1. These 30 speakers form a balanced population sample with respect to age and sex (see Table 1), and although many of the informants were born and raised in Manchester, the sample contains a number of speakers from other regions in the North West such as Blackburn, Widnes, Wigan, and Bolton. Efforts were made to ensure that all subjects who took part in this elicitation task were not only born and raised in the North West of England but also that they had at least one parent who was also a native British English speaker from the same region.

Table 1

The age and sex distribution of informants for the elicitation task. Cells include the average age of each group, with N denoting number of subjects. ‘Young’ speakers are aged between 18 and 30; ‘old’ speakers are aged between 52 and 85.

         Male              Female
Young    24 yrs (N = 8)    24 yrs (N = 8)
Old      60 yrs (N = 6)    60 yrs (N = 8)

3.2.2. Stimuli

Five linguistic boundaries of different perceived ‘strengths’ are under consideration here, based primarily on the stimuli chosen by Sproat and Fujimura (1993) in their comparable study of coda-targeting /l/-darkening in English.2 The aim of these carefully-controlled environments, which vary in their inherent ‘strengths,’ is to elicit different magnitudes of pre-boundary segmental lengthening. Although later work suggests that the magnitude and implementation of pre-boundary lengthening is not universal and not solely a function of prosodic or syntactic boundary strength (Cho, 2016; Feldscher & Durvasula, 2017; Turk, 2012), the results from Sproat and Fujimura (1993) nevertheless justify the adoption of similar stimuli here. It has already been shown that these elicitations result in a range of segmental durations, which will allow for an investigation into how well this phonetic property correlates with [ɡ]-presence. These boundaries, which are either syntactic or prosodic in nature, are detailed and exemplified below in increasing order of the strength of the juncture:

  1. NP-internal boundary: immediately followed by the head of an NP

    e.g., She was given [the wrong amount]NP

  2. VP-internal boundary: immediately followed by the direct object in a double object construction

    e.g., She gave [the ring]IO [a quick polish]DO

  3. VP boundary: immediately followed by an NP/VP juncture

    e.g., [The sting]NP became painful

  4. Intonational phrase boundary: final in the intonational phrase

    e.g., [“It’s a traditional thing,”]IP Patricia said

  5. Utterance boundary: final in the utterance

    e.g., [The drink was surprisingly strong.]U

Because the words under consideration contain two sonorous segments upon which the effects of lengthening can be registered (see Turk & Sawusch, 1997; Turk & Shattuck-Hufnagel, 2007; Turk & White, 1999), pre-boundary lengthening was operationalized as ‘sonorant duration,’ encompassing both the vowel and nasal portion of the (ng) word. The phonetically-gradient relationship between this durational measure and the five boundary contexts included in this study is illustrated in Figure 1, highlighting the success of the chosen stimuli in eliciting various magnitudes of lengthening.
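The ‘sonorant duration’ measure described above amounts to summing the durations of the vowel and nasal intervals of the (ng) word. A minimal sketch follows, assuming a hypothetical Interval record with start and end times in seconds; the names are illustrative and do not reflect the segmentation tools actually used.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """A labelled stretch of the signal (hypothetical structure, times in seconds)."""
    label: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def sonorant_duration(vowel: Interval, nasal: Interval) -> float:
    """Combined duration of the stressed vowel and the following nasal."""
    return vowel.duration + nasal.duration


# Illustrative values only, not measurements from the study.
example = sonorant_duration(Interval("ɪ", 0.50, 0.62), Interval("ŋ", 0.62, 0.75))
```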

Figure 1

The impact of boundary strength on the sonorant duration of (ng) tokens.

Each boundary context was represented eight times in the sentence list, equally distributed by phonological context such that each boundary × preceding segment × following segment combination was represented by two example sentences. The segment immediately following the (ng) cluster was either a consonant or a vowel,3 with the following word also having non-initial stress, and the segment immediately preceding the (ng) cluster was either a low vowel or a high vowel. Controlling for vowel height is necessary because it presents a confound for our quantification of pre-boundary lengthening effects. Given that pre-boundary lengthening applies not just to the nasal segment of the (ng) word but also to the preceding stressed vowel, we want to minimize the possibility of any other factors influencing the durational properties of these segments. The well-established correlation between vowel height and duration (see Lehiste, 1970; Solé & Ohala, 2010; Tauberer & Evanini, 2009) is one such confound. Word token frequency is another potential confounding factor, and one that is less easily overcome given the small set of lemmas that actually contain a variable (ng) cluster. The impact of token frequency on phonetic implementation has been subject to extensive study, and one such surface manifestation of token frequency is registered in segmental duration, whereby less frequent words are often longer than words that are frequent and more predictable in discourse (see Aylett & Turk, 2004; Jurafsky, Bell, Gregory, & Raymond, 2001). To minimize the impact of this confounding factor, efforts were made to avoid highly infrequent (ng) lemmas; with just one exception, all lemmas used in the stimuli range between 4.25–5.94 on the logarithmic 1–7 Zipf-scale (van Heuven, Mandera, Keuleers, & Brysbaert, 2014).

In total, 40 sentences were elicited per participant (5 boundaries × 2 preceding segments × 2 following segments × 2 repetitions), yielding a total of 1,200 tokens; these sentences are given in full in the Appendix.
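The factorial structure of the sentence list can be made explicit with a short enumeration. The following is only an illustrative sketch: the level labels are shorthand for the contexts described above, and the participant count of 30 is an assumption inferred from the reported totals (1,200 tokens / 40 sentences) rather than a figure stated in this section.

```python
from itertools import product

# Shorthand labels for the design factors described in the text.
BOUNDARIES = ["NP-internal", "VP-internal", "VP", "IP", "utterance"]
PRECEDING = ["low vowel", "high vowel"]
FOLLOWING = ["consonant", "vowel"]
REPETITIONS = 2

# Assumption: inferred from 1,200 tokens / 40 sentences, not stated directly here.
N_PARTICIPANTS = 30

# One dictionary per sentence in the list (fully crossed design).
design = [
    {"boundary": b, "preceding": p, "following": f, "item": r}
    for b, p, f, r in product(
        BOUNDARIES, PRECEDING, FOLLOWING, range(1, REPETITIONS + 1)
    )
]

sentences_per_participant = len(design)
sentences_per_boundary = sum(1 for cell in design if cell["boundary"] == "IP")
total_tokens = sentences_per_participant * N_PARTICIPANTS

print(sentences_per_participant, sentences_per_boundary, total_tokens)
```

Under these assumptions, the enumeration reproduces the eight sentences per boundary context and the 40 sentences per participant reported above.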

3.3. Data annotation

The recordings were all transcribed orthographically using ELAN and force-aligned with the FAVE suite (Rosenfelder, Fruehwald, Evanini, & Yuan, 2011) to facilitate a more efficient analysis. Forced alignment is a major methodological innovation in contemporary variationist linguistics in which an audio file is time-aligned at the word- and phoneme-level with a corresponding orthographic transcription.

Although recent work has probed the ability of forced alignment to also automatically code for linguistic variation (e.g., Bailey, 2016a; Yuan & Liberman, 2011), a manual method of coding was employed here. Coding of the dependent variable was carried out using a combination of auditory analysis and visual inspection of the spectrogram in Praat (Boersma & Weenink, 2017). For ambiguous tokens where the presence/absence of [ɡ] was not clear (approximately 3% of the entire sample), a second round of coding was carried out independently by another phonetically-trained researcher, and any tokens for which there was disagreement were subject to further inspection. These cases were extremely rare, and there is in fact relatively little variation with respect to the phonetic realization of post-nasal [ɡ], which is almost always released and very often devoiced in phrase-final position. In light of this, a binary coding scheme was used based on the categorical presence/absence of a post-nasal stop. Prototypical examples of a [ɡ]-ful token and a [ɡ]-less token are given in Figure 2.

Figure 2

Example spectrograms and waveforms of song with [ɡ]-absence (left) and thing with [ɡ]-presence (right).

Although the stimuli were designed to control intonational phrasing, with boundaries 1–3 intended to elicit IP-medial tokens and boundaries 4–5 IP-final tokens, this was independently verified through intonational analysis. Pitch contours consisting of 64,644 dynamic pitch measurements were extracted, manually corrected, and smoothed using the mausmooth Praat script (Cangemi, 2015). The elicited sentences were then annotated by the author in the ToBI framework (Beckman, Hirschberg, & Shattuck-Hufnagel, 2005) for nuclear accent placement and presence of phrase accent and boundary tones, the latter providing a more reliable annotation of intonational phrasing. This manual annotation is of paramount importance for tokens where the phonetic cues to intonational phrasing are only partially present: Tokens of (ng) produced in the IP-medial context that are followed by a pause (and, conversely, tokens in the IP-final context that are not) are crucial to the analysis, and in these cases the manual annotations were compared with those of another researcher trained in phonetics to ensure the reliability of the coding.

4. Results

The results of this study will be presented in two complementary sub-sections: The first part of the analysis draws upon sociolinguistic interview data to provide evidence of change in apparent time (Section 4.1); in the second part of this analysis (Section 4.2), attention turns to the follow-up elicitation task with the goal of probing the precise factors that condition this innovation.

All logistic regression models reported in this section were fit using the glmer function in the lme4 R package (Bates, Maechler, Bolker, & Walker, 2015). All models include random intercepts of speaker and word.

4.1. Change in apparent time

Although there have already been reports that rates of [ɡ]-presence are increasing in a number of communities, summarized earlier in Section 2.1, the interaction between age (or date of birth) and phonological environment with respect to [ɡ]-presence has yet to be investigated. Figure 3 plots the pre-consonantal and pre-pausal rates of velar nasal plus by date of birth for this sample of speakers from the North West of England, where ‘pre-pausal’ refers to tokens followed by a period of silence lasting around 100 ms or longer. The results indicate that the increase in [ɡ]-presence over the 91-year time span covered by this sample of speakers is largely confined to the pre-pausal environment.

Figure 3

Change in apparent time of pre-pausal [ɡ]-presence, based on a sample of 32 speakers. Pre-consonantal rates given for comparison. Points reflect individual speaker means; lines reflect linear models fit to the two environments with shaded areas representing 95% confidence intervals.

It should be noted that there also seems to be a slight increase in [ɡ]-presence pre-consonantally, but this trend is much less dramatic and the correlation is not statistically significant (Spearman’s rs = 0.23, p = 0.21); the trend in pre-pausal environments, however, is strong and highly significant (rs = 0.70, p < 0.001).4 The favourable effect of a following pause on the probability of [ɡ]-presence is particularly evident for speakers born after 1975, many of whom show categorical use of the local form in this particular environment.
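For readers unfamiliar with the statistic, Spearman's rank correlation is simply a Pearson correlation computed over ranks. The sketch below illustrates the calculation on invented speaker-level values; the birth years and rates are hypothetical placeholders, not the data plotted in Figure 3.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical speaker means, for illustration only.
birth_years = [1925, 1940, 1958, 1971, 1983, 1996]
prepausal_rate = [0.20, 0.35, 0.30, 0.60, 0.85, 1.00]

rho = spearman(birth_years, prepausal_rate)
print(round(rho, 3))
```

Because the statistic depends only on rank order, it is robust to the ceiling effects visible among the youngest speakers, which is presumably one reason a rank-based measure is reported here.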

This apparent diachronic change affecting pre-pausal velar nasal plus finds statistical support from the results of mixed-effects logistic regression, where the interaction between phonological environment and date of birth is significant for the pre-pausal tokens but does not pass the threshold for significance pre-consonantally (see Table 2). Furthermore, conducting an ANOVA comparison between nested models confirms that adding an interaction with date of birth leads to a statistically-significant decrease in AIC (935.31, cf. 958.29; p < 0.001) and, therefore, a better fitting model.

Table 2

Mixed-effects logistic regression model for the interaction between following segment and date of birth; includes random intercepts of speaker and word. [ɡ]-presence as application value. Vowel as reference group.

Fixed effects                        Estimate    SE        z-value    Pr(>|z|)
(Intercept)                          1.2142      0.2551    4.760      <0.001 ***
Following segment
   pause                             1.1458      0.2800    4.092      <0.001 ***
   consonant                         –2.8466     0.2382    –11.951    <0.001 ***
Following segment × Date of birth
   pause:dob                         1.0632      0.2464    4.315      <0.001 ***
   consonant:dob                     0.1431      0.1716    0.834      0.404
Date of birth
   date of birth (scaled)            0.0738      0.1962    0.376      0.707
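To make the interaction in Table 2 more concrete, the fixed-effect estimates can be converted into predicted probabilities. The sketch below is illustrative only: it sets the random intercepts of speaker and word to zero, and it assumes that 'scaled' date of birth is a z-score (mean 0, standard deviation 1), which the table itself does not specify. The function names are hypothetical.

```python
import math

# Fixed-effect estimates copied from Table 2 (log-odds; vowel is the reference level).
COEF = {
    "intercept": 1.2142,
    "pause": 1.1458,
    "consonant": -2.8466,
    "dob": 0.0738,
    "pause:dob": 1.0632,
    "consonant:dob": 0.1431,
}


def inv_logit(x):
    """Map log-odds onto the probability scale."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_probability(following, dob_scaled):
    """Fixed-effects-only prediction; random intercepts are taken to be zero."""
    eta = COEF["intercept"] + COEF["dob"] * dob_scaled
    if following != "vowel":
        # Add the main effect of the environment and its slope adjustment.
        eta += COEF[following] + COEF[following + ":dob"] * dob_scaled
    return inv_logit(eta)


# Assumption: dob_scaled is a z-score, so -1 and +1 are one SD either side of the mean.
for z in (-1.0, 0.0, 1.0):
    row = {
        seg: round(predicted_probability(seg, z), 2)
        for seg in ("vowel", "pause", "consonant")
    }
    print(z, row)
```

On this reading, the predicted probability of [ɡ]-presence rises far more steeply with date of birth before a pause than before a consonant, in line with the significant pause:dob term and the non-significant consonant:dob term.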

4.2. Elicitation task

Although there are a number of benefits to analyzing the conversational data discussed in the previous section, most notably the fact that this is a naturalistic speech style and therefore more representative of the speakers’ vernaculars, it is not without fault. One particular limitation is that a dichotomy between whether or not a token of (ng) is followed by a pause actually conflates a number of prosodic environments and interactional situations; in reality, these pre-pausal tokens may encompass a wide range of contexts, e.g., turn-final, utterance-final, IP-final etc. Is there an absence of segmental material following the (ng) token because the speaker was interrupted, or was the pause just temporary, with the speaker then resuming their turn? Pauses may arise for a number of different reasons, whether they be cognitively or interactionally motivated (see Kendall, 2013 for an exploration of the factors that condition pause production).

To combat this shortcoming, the second part of this study’s analysis focuses on the follow-up elicitation task where the exact environments in which (ng) clusters appear can be carefully controlled using reading passage stimuli. In this way, efforts can be made to disentangle the collinearity between the three factors that on the surface appear to be boosting [ɡ]-presence: phonetic duration, prosodic boundary strength, and pause presence/duration.

Figure 4 illustrates an interaction between boundary strength and following segment with respect to rates of [ɡ]-presence. When the (ng) cluster is pre-vocalic, rates of [ɡ]-presence remain high irrespective of the type of boundary; however, in pre-consonantal position the variation clearly shows sensitivity to boundary strength in a much more striking manner, such that the rate of [ɡ]-presence in the weakest boundary context is as low as 4%.

Figure 4

Rate of [ɡ]-presence by boundary strength and following segment.

The relative lack of variation pre-vocalically is no great surprise, and is likely due to the fact that there are two competing forces promoting the presence of [ɡ] in this environment: one that favours [ɡ]-presence before stronger boundaries, as we can see with the pre-consonantal tokens, but also one that favours [ɡ]-presence in weaker boundary contexts; crucially, this latter effect is confined to the pre-vocalic environment. If we assume that variation in (ng) is derived from a coda-targeting deletion rule, and that the promoting effect of following vowels on the probability of [ɡ]-presence stems from phrase-level resyllabification of the [ɡ] into onset position, it logically follows that this effect is more likely when the juncture between the words is weaker.5 Because weaker boundaries favour resyllabification, they consequently also favour [ɡ]-presence, but crucially this applies only in pre-vocalic environments. These two antagonistic effects cancel each other out pre-vocalically, where rates of [ɡ]-presence are high across the board, whereas in pre-consonantal environments only the former, more general effect is present. Given then that (ng) only shows sensitivity to boundary strength in pre-consonantal environments, the subsequent analysis will focus on this subset of the data, discarding the pre-vocalic tokens that are largely invariable.

In the pre-consonantal environment, we do clearly see a monotonic increase in [ɡ]-presence correlated with the strength of the following juncture. However, the rates of [ɡ] do not increase in a gradual manner parallel to the phonetically-gradient relationship between boundary strength and segmental duration; instead, we see a stark contrast between boundaries 1–3 and 4–5, suggesting that the meaningful contrast is between IP-medial and IP-final position. However, sensitivity to IP boundaries alone would not account for the contrasting behaviour of (ng) clusters between boundaries 4 and 5; in the former (utterance-medial IP boundary) we see 53% [ɡ]-presence, whereas in the latter (utterance-final IP boundary) we find rates almost at ceiling level (96%).

Pause presence/duration provides a possible explanation for this contrast, in addition to showing the strongest correlation with the presence of post-nasal [ɡ]. The use of [ŋɡ] is more variable at the utterance-medial IP boundary (i.e., boundary 4) because here the prosodic phrasing and, in particular, presence of pause is also variable (cf. the utterance-final IP boundary which is always pre-pausal). This is shown in Figure 5, which illustrates the relationship between pause duration, pre-boundary lengthening, IP position, and the realization of (ng).

Figure 5

The relationship between sonorant duration, following pause duration, and [ɡ]-presence for pre-consonantal (ng) tokens. Tokens with no period of silence before the following word are excluded. Ellipses represent 95% confidence intervals for tokens with and without surface [ɡ]-presence.

What is perhaps most interesting to note from Figure 5 is that there is much clearer separation along the x-axis than along the y-axis with respect to [ɡ]-presence. In other words, following pause duration is a much stronger predictor than sonorant duration, with a cut-off point around 4.6 (∼100 ms) on the x-axis, where any pause longer than this is enough to result in [ɡ]-presence. Reassuringly, this is the same value as the cut-off point used to identify pre-pausal tokens in the conversational data, when establishing the change in progress as described in Section 4.1.
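The correspondence between the value of 4.6 on the x-axis and a pause of roughly 100 ms follows if pause duration is expressed as the natural logarithm of its duration in milliseconds. That unit and base are assumptions inferred from the figures quoted above, and the helper name below is hypothetical; the sketch simply back-transforms the cut-off and applies it as a classifier.

```python
import math

# Cut-off read off the x-axis of Figure 5 (assumed: natural log of milliseconds).
LOG_CUTOFF = 4.6


def is_prepausal(pause_ms, log_cutoff=LOG_CUTOFF):
    """Classify a token as pre-pausal if its following silence exceeds the cut-off.

    Tokens with no silence (0 ms) cannot be log-transformed and count as non-pausal.
    """
    return pause_ms > 0 and math.log(pause_ms) > log_cutoff


# Back-transform the cut-off to milliseconds.
cutoff_ms = math.exp(LOG_CUTOFF)
print(round(cutoff_ms, 1))

# Invented pause durations (ms), for illustration only.
for pause_ms in (0, 40, 250):
    print(pause_ms, is_prepausal(pause_ms))
```

Note that tokens with no following silence have no defined logarithm, which is consistent with their exclusion from Figure 5.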

To investigate the effects of pause and IP position independently, there need to exist tokens of (ng) where the pausal cue to major prosodic boundaries is absent on the surface. Figure 5 highlights an overlap between the IP-medial and IP-final tokens with respect to the following pause duration, suggesting that this is the case. In total, 65 of 120 tokens in the IP-final context surface without a pause. However, it is entirely possible that the intonational phrasing these stimuli were intended to elicit was not actually produced, and that these cases of IP-final tokens without pause are in fact medial in the IP. To address this, we need independent evidence, specifically based on the pitch contour, of the presence of IP boundaries. Intonational analysis, combining visual inspection of the pitch contours with manual ToBI annotation, reveals that none of these 65 tokens show convincing evidence of a prosodic boundary after the (ng) word.

However, the ability to tease apart these two collinear effects does not rest solely on the presence of such tokens, where IP boundaries are not marked by pause. If there are cases of IP-medial tokens that are followed by pause, and crucially exhibit [ɡ]-presence, this would provide strong evidence that the variation is conditioned most strongly by pause as opposed to the presence of a prosodic boundary. Twenty-five of the 360 tokens (∼7%) elicited in the IP-medial context are produced in such a way. Based on the intonational analysis, 14 of these 25 (56%) are genuinely in IP-medial position with no evidence of pitch reset or boundary tone, i.e., there is a brief juncture before resumption of the same pitch movement. An example of this is given in Figure 6a. It is also important to note that in this example, the hiatus in the pitch contour does not reflect devoicing of the /ɡ, b/ sequence in Spring began but rather a genuine period of silence, as shown in the waveform and spectrogram. A counter-example is presented alongside this in Figure 6b, where the pause is clearly a phonetic cue to an IP boundary tonally marked with a fall-rise phrase accent and boundary tone combination. Crucially, post-nasal [ɡ] is present in all of these 14 genuine cases where (ng) occurs before an IP-medial pause.

Figure 6

Pitch contours and ToBI annotation for two pre-pausal tokens in the IP-medial context: one genuinely IP-medial in (a) and one produced with phrase-final intonation in (b).

Mixed-effects logistic regression lends further support to the idea that the presence and duration of pause is the primary conditioning factor of (ng). Three individual models were initially fit, including a main predictor of either: (1) sonorant duration, (2) IP position (based on what was actually produced, rather than what was intended by the stimulus), or (3) following pause duration, in addition to random intercepts of speaker and word. These models were compared for ‘goodness of fit’ based on their AIC values to determine which predictor explains most of the variation in (ng), where lower values correspond to a better model. These comparisons, summarized in Table 3, indicate that sonorant duration explains the least amount of variation (AIC: 562), and that IP position fares a little better (359). Pause duration (273) is by far the strongest predictor.

Table 3

AIC comparison between models, all of which include random intercepts of speaker and word. All additions to the base models lead to significant increases in model fit by ANOVA comparison (p < 0.001), with the exception of the model containing pause and sonorant duration (p = 0.07).

Predictor                                       Sole predictor    With sonorant duration    With pause duration
Sonorant duration (continuous; scaled)          562.30            NA                        271.72
Position in IP (medial vs. final)               358.80            348.50                    267.44
Pause duration (continuous; log-transformed)    272.70            271.72                    NA

Models were then fit with a combination of these predictors to investigate the possibility of additive effects, which could be the case if (ng) is conditioned by multiple phonetic cues to boundary strength. ANOVA comparisons between nested models were conducted to quantify whether or not the increase in the amount of variation explained by these additional predictors offsets the cost of a more complex model. In doing this, it became apparent that the strong predictive power of pause duration does not mean that the other collinear variables play no role; adding IP position to a model with pause duration leads to a decrease in AIC that, although small, is statistically significant (267.44, cf. 272.69; p = 0.007). The fact that IP position explains a significant portion of the remaining variation suggests that (ng) is not only sensitive to pause but also to the prosodic phrasing. That is, although the probability of [ɡ]-presence is most strongly influenced by pause, it is also boosted when final in an intonational phrase. This best-fitting model is given in full in Table 4.
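The relationship between the AIC values and the reported p-value can be illustrated with a short calculation. Because AIC = 2k − 2 log L, a drop in AIC between nested models that differ by one parameter corresponds to a likelihood-ratio statistic of ΔAIC + 2. The sketch below assumes that the ANOVA comparison is a likelihood-ratio chi-square test and that adding IP position costs exactly one parameter; both are assumptions, and the function names are hypothetical.

```python
import math

# AIC values copied from Table 3.
AIC = {
    "sonorant": 562.30,
    "ip": 358.80,
    "pause": 272.70,
    "ip + sonorant": 348.50,
    "pause + sonorant": 271.72,
    "pause + ip": 267.44,
}

# Lower AIC corresponds to a better trade-off between fit and complexity.
best_model = min(AIC, key=AIC.get)


def lr_statistic(aic_simple, aic_complex, extra_params=1):
    """Recover the likelihood-ratio statistic from two AIC values.

    AIC = 2k - 2*logLik, so the deviance drop equals the AIC drop
    plus twice the number of added parameters.
    """
    return (aic_simple - aic_complex) + 2 * extra_params


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# AIC values quoted in the text for pause alone vs. pause + IP position.
lr = lr_statistic(272.69, 267.44)
p_value = chi2_sf_df1(lr)

print(best_model, round(lr, 2), round(p_value, 3))
```

Under these assumptions the recovered p-value rounds to 0.007, matching the figure reported for the addition of IP position.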

Table 4

Best-fitting logistic regression model; [ɡ]-presence as application value, with random intercepts of speaker and word.

Fixed effects      Estimate     SE        z-value    Pr(>|z|)
(Intercept)        –10.5752     1.7062    –6.198     <0.001 ***
IP position        1.8094       0.6845    2.644      0.008 **
Pause duration     1.9990       0.3574    5.593      <0.001 ***
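The estimates in Table 4 are on the log-odds scale, and it may help to restate them as odds ratios. The sketch below recomputes the z-values from the estimates and standard errors and adds Wald 95% confidence intervals; the intervals are not reported in the table and are offered only as an illustration, under the usual normal approximation. The function name is hypothetical.

```python
import math

# (estimate, standard error) pairs copied from Table 4, on the log-odds scale.
TABLE_4 = {
    "(Intercept)": (-10.5752, 1.7062),
    "IP position": (1.8094, 0.6845),
    "Pause duration": (1.9990, 0.3574),
}


def wald_summary(estimate, se, z_crit=1.96):
    """Return the z-value, odds ratio, and Wald 95% interval for one coefficient."""
    return {
        "z": estimate / se,
        "odds_ratio": math.exp(estimate),
        "ci_low": math.exp(estimate - z_crit * se),
        "ci_high": math.exp(estimate + z_crit * se),
    }


summary = {name: wald_summary(est, se) for name, (est, se) in TABLE_4.items()}

for name, row in summary.items():
    # For the intercept, exp(estimate) is the baseline odds rather than an odds ratio.
    print(name, {key: round(value, 3) for key, value in row.items()})
```

On this scale, the odds ratio for pause duration applies per unit increase in log-transformed pause duration, so it is not directly comparable in magnitude to the binary IP position contrast.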

5. Discussion

I would now like to address two separate aspects of the variation discussed here: first, the mechanisms of this innovation in velar nasal plus, along with their implications for our understanding of how external sandhi processes are conditioned in pre-boundary environments; and second, the possible motivations driving this change.

5.1. Mechanisms of innovation

Having successfully ‘disentangled’ the collinearity between what on the surface appeared to be effects of nasal duration (with increasing [ɡ]-presence after longer nasals), prosodic position (with more [ɡ]-presence IP-finally), and following pause (with higher rates of [ɡ]-presence pre-pausally), the results indicate that this innovation in (ng) is conditioned most strongly by the presence/duration of a following pause. This would of course suggest that the apparent relationship between post-nasal [ɡ]-presence and sonorant duration is indirect and stems only from the fact that segmental duration is increased pre-pausally. There is limited evidence to suggest that (ng) is also directly sensitive to prosodic boundary strength; IP position explains a small amount of variation independently of pause, but comparisons between these two predictors suggest that this effect is much smaller in magnitude.

The important role of pause is perhaps best visualized abstractly, as in Figure 7. There is naturally a great deal of overlap between pre-pausal tokens and IP-final tokens, given that the presence of a following pause is one of the major phonetic cues to prosodic boundaries; these tokens that are both IP-final and pre-pausal exhibit post-nasal [ɡ] almost without fail. The non-overlapping portion of Figure 7 reflects the existence of tokens that are pre-pausal but actually medial in the IP; the fact that in these cases [ɡ] is still ever-present provides strong evidence to suggest that only pause is necessary for [ɡ] to surface in these environments, regardless of the prosodic structure.

Figure 7

Visualization of the distribution of pauses and IP boundaries in this study’s dataset, and the effect they have upon [ɡ]-presence.

That this innovation seems to be conditioned most strongly by the presence/absence of pause, rather than by segmental duration or prosodic boundary strength, is rather interesting in light of previous studies that have also attempted to tease apart these factors for other external sandhi processes, e.g., /s/-debuccalization in Spanish. Kaisse (1996) shows how in the Buenos Aires variety of Argentinian Spanish, word-final coda /s/ does not weaken to [h] when the following segment is ‘temporally distant,’ i.e., /s/ is saved from weakening when in pre-pausal position. Much like the argument presented here for (ng) variation, Kaisse claims that this blocking of debuccalization is triggered on the temporal domain, and is independent of prosodic position; this claim is based on the fact that IP-final tokens in fast speech, where the speaker does not pause, still undergo weakening, and that IP-medial tokens where the speaker pauses before resuming with the same intonational contour do not undergo weakening. Comparisons are also drawn with final /r/-devoicing in Turkish, which shows similar behaviour in that devoicing only occurs IP-finally if the IP boundary is marked by a pause (Kaisse, 1990).

More recently, it has been shown that other processes exhibit rather more complex behaviour in pre-boundary environments. Kilbourn-Ceron (2017) investigates the conditioning of high vowel devoicing (HVD) in Japanese and addresses the same collinearity issue highlighted in this paper; the results indicate that all three boundary phenomena play a joint role in conditioning HVD, with an interaction between prosodic position and pause presence such that pauses inhibit HVD phrase-medially but promote it phrase-finally. These results paint a much more complex picture relative to earlier claims that suggest the pre-boundary effect is triggered in utterance-final position (Kondo, 1997).

No other putative pre-pausal effects have, to this author’s knowledge, been investigated from this perspective; however, this interplay between prosody, pause, and segmental duration does raise questions about the nature of similar effects that have been reported for other external sandhi processes (most notably /t,d/-deletion, as discussed in Section 2.2), which could form a fruitful avenue of further research.

5.2. Motivations of innovation

While the quantitative analysis discussed in this paper has made it possible to determine the mechanisms of this innovation, the motivations behind pre-pausal [ɡ]-presence have thus far been neglected.

This relatively recent pre-pausal innovation could have been triggered by language-internal factors, specifically by the very nature of how synchronic ‘following segment effects’ are stored and processed in speakers’ grammars. As discussed in Section 2.2, the effect of following consonants in promoting /t,d/-deletion, and of following vowels in inhibiting it, has been analyzed under the Obligatory Contour Principle (Guy, 1980); the same framework can apply here with (ng) variation. Under this analysis, the effect stems from the avoidance of similar adjacent segments, with following consonants sharing more features with the post-nasal [ɡ] than following vowels. This also accounts for the intermediate effect of following liquids on /t,d/-deletion, which share more features with the preceding /t,d/ than a following vowel but fewer than a following consonant. Crucially, pauses by their very nature do not fit into this typology and could therefore be left open to interpretation with respect to their effect on probabilistic external sandhi processes such as these. The possible consequences of this ‘instability’ could be registered synchronically through inter-dialectal differences (e.g., how the behaviour of pre-pausal /t,d/-deletion differs between speech communities), but also diachronically, as reported here for (ng) in Section 4.1, with changes in pre-pausal behaviour over successive generations of speakers.

We can also turn to an entirely different process, that of voiceless stop ejectivization, in search of possible explanations. Ejectivization of the English voiceless plosive set /p, t, k/ has been attested by many scholars (Fabricius, 2000; Gordeeva & Scobbie, 2013; Ogden, 2009) and has been said to be increasing over time for /k/, for which it is most frequent (O. McCarthy & Stuart-Smith, 2013). In the same paper, O. McCarthy and Stuart-Smith explore the factors conditioning /k/-ejectivization in speakers of Glasgow English, and find that it is favoured not only phrase-finally but also when preceded by a nasal consonant. That is, the words most susceptible to ejectivization are sink, rank, hunk etc.; these phrase-final /ŋk/ clusters that are frequently ejectivized are essentially the voiceless counterparts to the (ng) clusters that so frequently exhibit post-nasal stop presence in phrase-final position, e.g., sing, rang, hung. These conditioning factors would need to be independently attested in the North West of England, but assuming ejectivization shows comparable behaviour for the speakers recorded here, these two processes could conceivably be seen as part of the same wider phenomenon: a boundary-marking ‘velar fortition rule,’ with parallel changes involved in strengthening voiceless velar nasal + stop clusters through ejectivization and voiced velar nasal + stop clusters through presence of [ɡ].

This innovation could also be driven by external factors; that is, it could be socially-motivated. Utterance/phrase-final position is highly salient,6 and as such we may expect the influence of social evaluation to be registered most strongly in this environment. This explanation presupposes two things, however: that (ng) variation is sufficiently above the level of awareness such that it is subject to social evaluation, and if so, that the presence of [ɡ] carries local prestige in these north western communities. If this is indeed the case, perhaps this diachronic change in production, with increasing rates of [ɡ]-presence in highly salient pre-pausal and phrase-final environments, actually reflects a perceptual shift within these communities. Younger speakers may well be attaching local prestige to this post-nasal [ɡ], and actively using this vernacular feature to project a northern identity and align themselves with this dialect region (similar to the use of centralized diphthongs by Martha’s Vineyard residents in Labov, 1963, or the realization of GOAT vowels as [ɵː] by speakers of Tyneside English, in Watt, 2002). The results from Newbrook’s 1999 perception task, discussed briefly in Section 2.1, potentially reflect such a change in evaluation. However, a more recent matched-guise experiment reveals the absence of a community-wide norm with respect to the evaluation of [ɡ]-presence, suggesting that this is not a case of evaluation-driven change (Bailey, to appear).

6. Conclusion

The goal of this study was to investigate the mechanisms of a recent innovation in post-nasal [ɡ]-presence, and in doing so explore the collinear relationship between prosodic boundary strength, pause, and segmental duration in conditioning external sandhi processes. In teasing apart these three factors, all of which on the surface appear to affect the probability of [ɡ]-presence, this study has shown that their relative contributions in conditioning (ng) variation are far from equal. The presence and duration of a following pause provides the strongest explanation of probabilistic [ɡ]-presence; the additive effect of IP position overlaid on this is much weaker, despite theoretical frameworks that foreground the importance of prosodic categories in conditioning external sandhi (e.g., Nespor & Vogel, 1986). The apparent relationship between segmental duration and (ng) variation, such that [ɡ]-presence is more likely after longer nasals, arises only because duration is itself correlated with prosodic boundary strength and pause through the process of pre-boundary lengthening. That is, these results suggest that (ng) shows no direct sensitivity to segmental duration.

The results of this study add to a growing body of knowledge about how probabilistic lenition processes behave in pre-boundary position, and in doing so raise questions regarding the nature of similar effects that have previously been attributed to one of these collinear factors without due consideration of the others.

The process that gives rise to (ng) variation, whether that be a synchronic deletion or insertion rule, bears a striking resemblance to /t,d/-deletion in a number of ways, most notably the strong effect of following segment which sees the word-final consonant cluster licensed pre-vocalically but not pre-consonantally; however, where /t,d/-deletion shows inter-dialectal variation with respect to its behaviour pre-pausally, [ŋɡ] clusters instead show inter-generational variation, with younger speakers reanalyzing this pre-pausal environment as one that favours use of the local form with [ɡ]-presence.

This shibboleth of north western dialects, the variable presence of a feature that has been lost in almost all other varieties of English spoken throughout the British Isles, is yet another example of the oft-discussed linguistic conservatism of the north of England. However, in light of the results reported here, this feature is clearly less stable than previously thought; although this variation in (ng) began some time in the Early Modern English period, it still exhibits interesting behaviour today, and even appears to be undergoing a revitalization in this community. Far more than a mere relic of traditional northern dialects, this variation in (ng) clusters offers valuable insight into the diachronic trajectory of phonological processes (Bermúdez-Otero & Trousdale, 2012) and, as revealed in this study, how external sandhi processes are conditioned in pre-boundary environments.

Additional File

The additional file for this article can be found as follows:


Sentence list stimulus for experiment. DOI: https://doi.org/10.5334/labphon.115.s1


  1. Whilst post-nasal [ɡ] also occurs in unstressed -ing clusters, leading to surface three-way variation between [ɪn]∼[ɪŋ]∼[ɪŋɡ], in this paper the focus is solely on the stressed (ng) clusters that are invariable in non-northern varieties. [^]
  2. Word-medial tokens were also collected, but had to be discarded from the analysis due to a confound of epenthesis; the (ng) tokens before consonant-initial suffixes (e.g., gangster, youngster) resulted in nasal + sibilant clusters that are known to trigger excrescent stop insertion (Fourakis & Port, 1986; Warner, 2002). This is of course a productive and widely-attested phenomenon in English, sometimes referred to as the prints-prince merger, where gestural timing during the nasal + sibilant transition can lead to insertion of a stop that is homorganic with the preceding nasal, resulting in surface forms such as [ˈpɹɪnts] ‘prince,’ [ˈhæmp.stə] ‘hamster,’ and [ˈjʊŋk.stə] ‘youngster.’ Owing to the difficulty in determining whether a post-nasal stop in this environment arises through this process rather than being a genuine case of velar nasal plus, these examples had to be excluded. [^]
  3. For the word-final (ng) tokens, the following consonant was almost always a non-lingual obstruent to prevent possible confounds of assimilation or phrase-level resyllabification of the word-final [ɡ]. The only exception is one IP-boundary elicitation, where the (ng) token is followed by the non-lingual sonorant /m/.
  4. Although not plotted in Figure 3, there is also no evidence for a change in progress involving the pre-vocalic environment (rs = 0.04, p = 0.73).
  5. The likelihood of phrase-level resyllabification being the source of this phonological environment effect is further increased by the independent observation that resyllabification across word boundaries is more likely when the following word, to which the [ɡ] is attaching, has non-initial stress (Bermúdez-Otero & Trousdale, 2012, p. 697); recall that in this study, all word-final tokens of (ng) are elicited before words with non-initial stress.
  6. For psycholinguistic evidence of the salience of utterance-final position, see Sundara, Demuth, and Kuhl (2011) (visual-fixation task) and Dube, Kung, Peter, Brock, and Demuth (2016) (EEG experiment).


Acknowledgements

This work would not have been possible without the generous funding of the Economic and Social Research Council (NWDTC studentship, grant number ES/J500094/1), or without the speakers who were equally generous in giving up their time for this research. I am also indebted to Ricardo Bermúdez-Otero, Maciej Baranowski, and Laurel MacKenzie for their guidance and insightful feedback, to the audiences at FWAV4, NWAV45, and the 25th Manchester Phonology Meeting for their questions, and finally to the LabPhon editors and reviewers, whose comments were instrumental in shaping the final version of this paper. All remaining errors are entirely my own.

Competing Interests

The author has no competing interests to declare.

References


Asprey, E. 2007. Black Country English and the Black Country identity (Doctoral dissertation, The University of Leeds).

Asprey, E. 2015. The West Midlands. In: Hickey, R. (ed.), Researching Northern English, 393–416. Amsterdam: John Benjamins. DOI:  http://doi.org/10.1075/veaw.g55.17asp

Aylett, M., & Turk, A. E. 2004. The smooth signal redundancy hypothesis: A functional explanation for relationships between redundancy, prosodic prominence, and duration in spontaneous speech. Language and Speech, 47(1), 31–56. DOI:  http://doi.org/10.1177/00238309040470010201

Bailey, G. 2015. Social and internal constraints on (ing) in northern Englishes (Master’s dissertation, The University of Manchester).

Bailey, G. 2016a. Automatic detection of sociolinguistic variation using forced alignment. University of Pennsylvania Working Papers in Linguistics: Selected Papers from NWAV 44, 22(2), 10–20.

Bailey, G. 2016b. Velar nasal plus in the north of (ing)land. Paper presented at New Ways of Analyzing Variation 45, Simon Fraser University. 4th November 2016.

Bailey, G. To appear. Emerging from below the social radar: Incipient evaluation in the North West of England. Journal of Sociolinguistics. DOI:  http://doi.org/10.1111/josl.12307

Baranowski, M. 2017. Class matters: The sociolinguistics of GOOSE and GOAT in Manchester English. Language Variation and Change, 29(3), 301–339. DOI:  http://doi.org/10.1017/S0954394517000217

Baranowski, M., & Turton, D. 2015. T/D-deletion in British English revisited: Evidence for the long-lost morphological effect. Paper presented at New Ways of Analyzing Variation 44, University of Toronto. 23rd October 2015.

Bates, D., Maechler, M., Bolker, B., & Walker, S. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. DOI:  http://doi.org/10.18637/jss.v067.i01

Bayley, R. 1994. Consonant cluster reduction in Tejano English. Language Variation and Change, 6(3), 303–326. DOI:  http://doi.org/10.1017/S0954394500001708

Beckman, M. E., Hirschberg, J., & Shattuck-Hufnagel, S. 2005. The original ToBI system and the evolution of the ToBI framework. In: Jun, S.-A. (ed.), Prosodic typology: The phonology of intonation and phrasing, 9–54. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/acprof:oso/9780199249633.003.0002

Bermúdez-Otero, R. 2011. Cyclicity. In: van Oostendorp, M., Ewen, C., Hume, E., & Rice, K. (eds.), The Blackwell Companion to Phonology, 2019–2048. Malden, MA: Wiley-Blackwell. DOI:  http://doi.org/10.1002/9781444335262.wbctp0085

Bermúdez-Otero, R., & Trousdale, G. 2012. Cycles and continua: On unidirectionality and gradualness in language change. In: Nevalainen, T., & Traugott, E. C. (eds.), The Oxford Handbook of the History of English, 691–720. New York: Oxford University Press.

Boersma, P., & Weenink, D. 2017. Praat: Doing phonetics by computer. Retrieved from: http://www.praat.org/.

Cangemi, F. 2015. mausmooth. Retrieved from: http://phonetik.phil-fak.uni-koeln.de/fcangemi.html.

Chinn, C., & Thorne, S. 2001. Proper Brummie: A dictionary of Birmingham Words and Phrases. Studley, Warwickshire: Brewin Books.

Cho, T. 2016. Prosodic boundary strengthening in the phonetics-prosody interface. Language and Linguistics Compass, 10(3), 120–141. DOI:  http://doi.org/10.1111/lnc3.12178

Coetzee, A. W. 2004. What it means to be a loser: Non-optimal candidates in Optimality Theory (Doctoral dissertation, University of Massachusetts).

Delattre, P. 1966. A comparison of syllable length conditioning among languages. International Review of Applied Linguistics in Language Teaching, 4(3), 183–198. DOI:  http://doi.org/10.1515/iral.1966.4.1-4.183

Dube, S., Kung, C., Peter, V., Brock, J., & Demuth, K. 2016. Effects of type of agreement violation and utterance position on the auditory processing of subject-verb agreement: An ERP study. Frontiers in Psychology, 7(1276), 1–18. DOI:  http://doi.org/10.3389/fpsyg.2016.01276

Fabricius, A. 2000. T-glottalling: Between stigma and prestige. A sociolinguistic study of modern RP (Doctoral dissertation, Copenhagen Business School).

Fasold, R. W. 1972. Tense marking in Black English: A linguistic and social analysis. Arlington: Center for Applied Linguistics.

Feldscher, C., & Durvasula, K. 2017. Prosodic domain boundaries do not trigger final lengthening. Poster presented at the 25th Manchester Phonology Meeting, University of Manchester. 27th May 2017.

Fourakis, M., & Port, R. 1986. Stop epenthesis in English. Journal of Phonetics, 14(2), 197–221.

Garrett, A., & Blevins, J. 2009. Analogical morphophonology. In: Hanson, K., & Inkelas, S. (eds.), The nature of the word: Essays in honor of Paul Kiparsky, 527–545. Cambridge, MA: MIT Press.

Gordeeva, O. B., & Scobbie, J. M. 2013. A phonetically versatile contrast: Pulmonic and glottalic voicelessness in Scottish English obstruents and voice quality. Journal of the International Phonetic Association, 43(3), 249–271. DOI:  http://doi.org/10.1017/S0025100313000200

Gussenhoven, C. 2002. Phonology of intonation. Glot International, 6(9/10), 271–284.

Gussenhoven, C., & Rietveld, A. C. M. 1992. Intonation contours, prosodic structure, and preboundary lengthening. Journal of Phonetics, 20(3), 283–303.

Guy, G. R. 1980. Variation in the group and the individual: The case of final stop deletion. In: Labov, W. (ed.), Locating language in time and space, 1–36. New York: Academic Press.

Guy, G. R. 1991. Explanation in variable phonology: An exponential model of morphological constraints. Language Variation and Change, 3(1), 1–22. DOI:  http://doi.org/10.1017/S0954394500000429

Harris, J. W. 1983. Syllable structure and stress in Spanish: A nonlinear analysis. Cambridge, MA: MIT Press.

Hazen, K. 2011. Flying high above the social radar: Coronal stop deletion in modern Appalachia. Language Variation and Change, 23(1), 105–137. DOI:  http://doi.org/10.1017/S0954394510000220

Heath, C. 1980. The pronunciation of English in Cannock, Staffordshire. Oxford: Blackwell.

Hockey, B. A., & Fagyal, Z. 1998. Pre-boundary lengthening: Universal or language-specific? The case of Hungarian. University of Pennsylvania Working Papers in Linguistics: Proceedings of the 22nd Annual Penn Linguistics Colloquium, 5(1), 71–82.

Hughes, A., Trudgill, P., & Watt, D. 2012. English Accents and Dialects. London: Routledge.

Jurafsky, D., Bell, A., Gregory, M., & Raymond, W. D. 2001. Probabilistic relations between words: Evidence from reduction in lexical production. In: Bybee, J., & Hopper, P. (eds.), Frequency and the emergence of linguistic structure, 229–254. Amsterdam: John Benjamins. DOI:  http://doi.org/10.1075/tsl.45.13jur

Kaisse, E. M. 1990. Toward a typology of postlexical rules. In: Inkelas, S., & Zec, D. (eds.), The phonology-syntax connection, 127–143. Chicago: University of Chicago Press.

Kaisse, E. M. 1996. The prosodic environment of s-weakening in Argentinian Spanish. In: Zagona, K. (ed.), Selected papers from the 25th Linguistic Symposium on Romance Languages, 123–134. Amsterdam: John Benjamins. DOI:  http://doi.org/10.1075/cilt.133.10kai

Kendall, T. 2013. Speech Rate, Pause, and Sociolinguistic Variation. New York: Palgrave Macmillan. DOI:  http://doi.org/10.1057/9781137291448

Kilbourn-Ceron, O. 2017. Speech production planning affects variation in external sandhi (Doctoral dissertation, McGill University).

Knowles, G. 1973. Scouse: The urban dialect of Liverpool (Doctoral dissertation, University of Leeds).

Kondo, M. 1997. Mechanisms of vowel devoicing in Japanese (Doctoral dissertation, University of Edinburgh).

Krivokapić, J., & Byrd, D. 2012. Prosodic boundary strength: An articulatory and perceptual study. Journal of Phonetics, 40(3), 430–442. DOI:  http://doi.org/10.1016/j.wocn.2012.02.011

Labov, W. 1963. The social motivation of a sound change. Word, 19(3), 273–309. DOI:  http://doi.org/10.1080/00437956.1963.11659799

Labov, W. 1984. Field methods of the project on linguistic change and variation. In: Baugh, J., & Sherzer, J. (eds.), Language in use: Readings in sociolinguistics, 28–66. Englewood Cliffs: Prentice Hall.

Labov, W. 1997. Resyllabification. In: Hinskens, F. L., van Hout, R., & Wetzels, L. (eds.), Variation, change, and phonological theory, 145–179. Amsterdam: John Benjamins. DOI:  http://doi.org/10.1075/cilt.146.08lab

Labov, W. 2010. Oral narratives of personal experience. In: Hogan, P. C. (ed.), Cambridge encyclopedia of the language sciences, 546–548. Cambridge: Cambridge University Press.

Lavoie, L. 2001. Consonant strength: Phonological patterns and phonetic manifestations. New York: Garland. DOI:  http://doi.org/10.4324/9780203826423

Lehiste, I. 1970. Suprasegmentals. Cambridge, MA: MIT Press.

Lehiste, I., Olive, J. P., & Streeter, L. A. 1976. Role of duration in disambiguating syntactically ambiguous sentences. The Journal of the Acoustical Society of America, 60(5), 1199–1202. DOI:  http://doi.org/10.1121/1.381180

Lindblom, B. 1968. Temporal organization of syllable production. Speech Transmission Laboratory: Quarterly Progress and Status Report, 9(2–3), 1–5.

MacKenzie, L., Bailey, G., & Turton, D. 2017. Our dialects: Mapping variation in English in the UK. Retrieved from: http://projects.alc.manchester.ac.uk/ukdialectmaps/.

Mathisen, A. G. 1999. Sandwell, West Midlands: Ambiguous perspectives on gender patterns and models of change. In: Foulkes, P., & Docherty, G. (eds.), Urban voices: Accent studies in the British Isles, 107–123. London: Arnold.

McCarthy, J. J. 1986. OCP effects: Gemination and antigemination. Linguistic Inquiry, 17(2), 207–263.

McCarthy, O., & Stuart-Smith, J. 2013. Ejectives in Scottish English: A social perspective. Journal of the International Phonetic Association, 43(3), 273–298. DOI:  http://doi.org/10.1017/S0025100313000212

Mo, Y. 2010. Prosody production and perception with conversational speech (Doctoral dissertation, University of Illinois at Urbana-Champaign).

Nespor, M., & Vogel, I. 1986. Prosodic Phonology. Dordrecht: Foris.

Newbrook, M. 1999. West Wirral: Norms, self reports and usage. In: Foulkes, P., & Docherty, G. (eds.), Urban voices: Accent studies in the British Isles, 90–106. London: Arnold.

Ogden, R. 2009. An Introduction to English Phonetics. Edinburgh: Edinburgh University Press.

Orton, H., Sanderson, S., & Widdowson, J. 1978. The Linguistic Atlas of England. London: Croom Helm.

Rosenfelder, I., Fruehwald, J., Evanini, K., & Yuan, J. 2011. FAVE (Forced Alignment and Vowel Extraction) program suite. Retrieved from: http://fave.ling.upenn.edu.

Santa Ana, O. 1996. Sonority and syllable structure in Chicano English. Language Variation and Change, 8(1), 63–89. DOI:  http://doi.org/10.1017/S0954394500001071

Schleef, E., Flynn, N., & Ramsammy, M. 2015. Production and perception of (ing) in Manchester English. In: Torgersen, E., Hårstad, S., Mæhlum, B., & Røyneland, U. (eds.), Selected papers from the Seventh International Conference on Language Variation in Europe (ICLaVE 7), 197–210. Amsterdam: John Benjamins. DOI:  http://doi.org/10.1075/silv.17.15sch

Selkirk, E. O. 1978. On prosodic structure and its relation to syntactic structure. In: Fretheim, T. (ed.), Nordic prosody II, 111–140. Trondheim: TAPIR.

Solé, M.-J., & Ohala, J. J. 2010. What is and what is not under the control of the speaker: Intrinsic vowel duration. In: Fougeron, C., Kühnert, B., D’Imperio, M., & Vallée, N. (eds.), Papers in Laboratory Phonology 10, 607–655. Berlin: Mouton de Gruyter.

Sproat, R., & Fujimura, O. 1993. Allophonic variation in American English /l/ and its implications for phonetic implementation. Journal of Phonetics, 21(3), 291–311.

Steriade, D. 1997. Phonetics in phonology: The case of laryngeal neutralization. Unpublished manuscript, University of California.

Sundara, M., Demuth, K., & Kuhl, P. K. 2011. Sentence-position effects on children’s perception and production of English third person singular –s. Journal of Speech Language and Hearing Research, 54(1), 55–71. DOI:  http://doi.org/10.1044/1092-4388(2010/10-0056)

Swerts, M. 1997. Prosodic features at discourse boundaries of different strength. The Journal of the Acoustical Society of America, 101(1), 514–521. DOI:  http://doi.org/10.1121/1.418114

Tagliamonte, S. 2006. Analysing Sociolinguistic Variation. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511801624

Tagliamonte, S., & Temple, R. 2005. New perspectives on an ol’ variable: (t,d) in British English. Language Variation and Change, 17(3), 281–302. DOI:  http://doi.org/10.1017/S0954394505050118

Tamminga, M. 2016. Persistence in phonological and morphological variation. Language Variation and Change, 28(3), 335–356. DOI:  http://doi.org/10.1017/S0954394516000119

Tamminga, M. 2018. Modulation of the following segment effect on English coronal stop deletion by syntactic boundaries. Glossa: A Journal of General Linguistics, 3(1), 1–27. DOI:  http://doi.org/10.5334/gjgl.489

Tanner, J., Sonderegger, M., & Wagner, M. 2017. Production planning and coronal stop deletion in spontaneous speech. Laboratory Phonology: Journal of the Association for Laboratory Phonology, 8(1), 1–39. DOI:  http://doi.org/10.5334/labphon.96

Tauberer, J., & Evanini, K. 2009. Intrinsic vowel duration and the post-vocalic voicing effect: Some evidence from dialects of North American English. In: Proceedings of Interspeech 2009, 2211–2214.

Thorne, S. 2003. Birmingham English: A sociolinguistic study (Doctoral dissertation, The University of Birmingham).

Trudgill, P. 1999. The Dialects of England. Oxford: Blackwell.

Turk, A. E. 2012. The temporal implementation of prosodic structure. In: Cohn, A., Fougeron, C., Huffman, M., & Renwick, M. (eds.), The Oxford handbook of Laboratory Phonology, 242–253. Oxford: Oxford University Press.

Turk, A. E., & Sawusch, J. R. 1997. The domain of accentual lengthening in American English. Journal of Phonetics, 25(1), 25–41. DOI:  http://doi.org/10.1006/jpho.1996.0032

Turk, A. E., & Shattuck-Hufnagel, S. 2007. Multiple targets of phrase-final lengthening in American English words. Journal of Phonetics, 35(4), 445–472. DOI:  http://doi.org/10.1016/j.wocn.2006.12.001

Turk, A. E., & White, L. 1999. Structural influences on accentual lengthening in English. Journal of Phonetics, 27(2), 171–206. DOI:  http://doi.org/10.1006/jpho.1999.0093

Turton, D. 2014. Variation in English /l/: Synchronic reflections of the life cycle of phonological processes (Doctoral dissertation, The University of Manchester).

Turton, D. 2017. Categorical or gradient? An ultrasound investigation of /l/-darkening and vocalization in varieties of English. Laboratory Phonology: Journal of the Association for Laboratory Phonology, 8(1), 1–31. DOI:  http://doi.org/10.5334/labphon.35

Upton, C., Sanderson, S., & Widdowson, J. 1987. Word Maps: A Dialect Atlas of England. London: Croom Helm.

van Heuven, W. J. B., Mandera, P., Keuleers, E., & Brysbaert, M. 2014. SUBTLEX-UK: A new and improved word frequency database for British English. Quarterly Journal of Experimental Psychology, 67(6), 1176–1190. DOI:  http://doi.org/10.1080/17470218.2013.850521

Wagner, M. 2012. Locality in phonology and production planning. McGill Working Papers in Linguistics, 22(1), 1–18.

Wakelin, M. 1984. Rural dialects in England. In: Trudgill, P. (ed.), Language in the British Isles, 70–93. Cambridge: Cambridge University Press.

Warner, N. 2002. The phonology of epenthetic stops: Implications for the phonetics-phonology interface in optimality theory. Linguistics, 40(1), 1–27. DOI:  http://doi.org/10.1515/ling.2002.004

Watt, D. 2002. ‘I don’t speak with a Geordie accent, I speak, like, the Northern accent’: Contact-induced levelling in the Tyneside vowel system. Journal of Sociolinguistics, 6(1), 44–63. DOI:  http://doi.org/10.1111/1467-9481.00176

Watts, E. 2005. Mobility-induced dialect contact: A sociolinguistic investigation of speech variation in Wilmslow, Cheshire (Doctoral dissertation, University of Essex).

Wells, J. C. 1982. Accents of English vol. 2: The British Isles. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511611759

Wightman, C. W., Shattuck-Hufnagel, S., Ostendorf, M., & Price, P. J. 1992. Segmental durations in the vicinity of prosodic phrase boundaries. The Journal of the Acoustical Society of America, 91(3), 1707–1717. DOI:  http://doi.org/10.1121/1.402450

Yang, X., Shen, X., Li, W., & Yang, Y. 2014. How listeners weight acoustic cues to intonational phrase boundaries. PLoS ONE, 9(7), 1–9. DOI:  http://doi.org/10.1371/journal.pone.0102166

Yip, M. 1988. The obligatory contour principle and phonological rules: A loss of identity. Linguistic Inquiry, 19(1), 65–100.

Yuan, J., & Liberman, M. 2011. Automatic detection of ‘g-dropping’ in American English using forced alignment. In: Proceedings of 2011 IEEE International Conference of Acoustics, Speech and Signal Processing, 490–493. DOI:  http://doi.org/10.1109/ASRU.2011.6163980

Zhang, X. 2012. A comparison of cue-weighting in the perception of prosodic phrase boundaries in English and Chinese (Doctoral dissertation, University of Michigan).