1 Introduction

Numerous sources report that word-final consonants in Spanish undergo resyllabification when followed by a vowel in the next word (e.g., Bermúdez-Otero, 2011; Colina, 1997, 2006, 2009a, 2009b; Face, 2002; Harris, 1969, 1983; Hualde, 1991; Kaisse, 1999; Macpherson, 1975; Solé, 2010, inter alia). As a result of resyllabification, word-final pre-vocalic consonants are said to surface as onsets, as illustrated in (1).

    (1) Resyllabification of word-final consonants in Spanish. Example from Harris (1983).
        los otros
        [lo.so.tros]
        ‘the others’

Similar observations concerning resyllabification across word boundaries have been made for other Romance languages, such as Italian (Krämer, 2009) and French (Delattre, 1965; although see Section 4 for more discussion on French), while evidence from metrics suggests that the resyllabification rule may have already been operative in Latin (Ryan, 2013). Out of these cases, Spanish has featured particularly strongly in theoretical literature, because Spanish has a number of segmental processes which target word-final consonants and which interact with resyllabification in complex ways. Derived onsets (we shall use this term as a shorthand for word-final pre-vocalic consonants) sometimes pattern phonologically with canonical onsets, undergoing the same segmental processes, but in other cases they behave phonologically like canonical codas.

An example of a process where derived onsets pattern with canonical onsets is /s/-aspiration as it applies in some Spanish dialects. For instance, in Chinato Spanish, word-final /s/ undergoes weakening before a phrase boundary or before a following consonant, but not before a following vowel (Hualde, 1991a; Kaisse, 1999).1 On the other hand, in Caribbean Spanish and Rio Negro Argentinian Spanish, /s/-aspiration does target derived onset /s/, which means that derived onsets pattern with canonical codas rather than canonical onsets. The difference in the distribution of /s/-aspiration is exemplified in (2).

    (2) /s/-aspiration (Hualde, 1991a; Kaisse, 1999)
        1. dieses      [die.seh]      ‘tens’
        2. dos palas   [doh.pa.lah]   ‘two shovels’
        3. dos alas    [do.sa.lah]    ‘two wings’
        4. dieses      [die.seh]      ‘tens’
        5. dos palas   [doh.pa.lah]   ‘two shovels’
        6. dos alas    [do.ha.lah]    ‘two wings’

In addition to /s/-aspiration, other segmental processes also target word-final codas and derived onsets in various dialects of Spanish. Table 1 summarizes the relevant processes, with examples of dialects where they occur.

Process            Change                                Dialect                             Source

Phonological onsets pattern with canonical onsets
/s/-aspiration     /s/→[h]/[Ø]                           Chinato Spanish                     Hualde (1991b)
                                                         Buenos Aires Argentinian Spanish    Kaisse (1996)

Phonological onsets pattern with canonical codas
/s/-aspiration     /s/→[h]/[Ø]                           Caribbean Spanish                   Kaisse (1999)
                                                         Rio Negro Argentinian Spanish       Colina (1997); Colina (2002)
                                                         Chilean Spanish                     Broś (2012)
/s/-voicing        /s/→[z]                               Quito Spanish                       Robinson (1979); Lipski (1989); Strycharczuk et al. (2013)
/n/-velarization   /n/→[ŋ]                               Galician Spanish                    Harris (1983)
                                                         Ecuadorian Spanish                  Ramsammy (2013)
liquid gliding     /l/→[j] (depending on stress)         Cibaeño Spanish                     Harris (1983)

Table 1

Phonological behaviour of derived onsets with respect to segmental processes.

Since the domain of application for segmental processes in Spanish dialects shows variation with respect to derived onsets, it appears motivated to distinguish derived onsets at the level of description, as we have done in the present discussion. However, it is not obvious how to accommodate this distinction in phonology. A body of theoretical work on Spanish assumes that derived onsets and canonical onsets share a common phonological representation (e.g., Bermúdez-Otero, 2011; Colina, 1997, 2009a; Face, 2002; Kaisse, 1999; Ramsammy, 2013; Strycharczuk et al., 2013). Once that assumption is made, the burden of distinguishing between the behaviour of canonical and derived onsets falls onto computation. Thus, the behaviour of Spanish derived onsets has been at the heart of theoretical debate concerning how resyllabification and segmental processes may interact to produce distinct outputs for canonical onsets and derived onsets. Colina (2009a) analyzes Quito Spanish /s/-voicing as involving an identity relationship between different members of the same paradigm (Output-output correspondence; Benua, 1997). Bermúdez-Otero (2011) argues that the phenomenon requires an analysis where different phonological operations may apply within different domains determined by morphology. An earlier account proposing a solution involving multiple levels is by Kaisse (1999), who analyzes Spanish /s/-aspiration as being lexical or post-lexical in different dialects of Spanish. The representation of derived onsets and its relevance for output-oriented theories of phonology, such as Optimality Theory (Prince & Smolensky 2004 [1993]), is also considered in works by Colina (2002), Broś (2012), Ramsammy (2013), and Strycharczuk et al. (2013).

In contrast to the works cited above, some linguists have questioned whether word-final consonants undergo full resyllabification in all Spanish dialects. Lipski (1999) proposes that, in some dialects, resyllabification may create ambisyllabic consonants, which are simultaneously onsets and codas. Robinson (2012) suggests that resyllabification may be suspended for specific dialects or even individual segments. Based on reports from Quito Spanish informants, Robinson argues that word-final /n/ and /s/ do not undergo resyllabification across word boundaries in this dialect, but other consonants, such as /l/, do. The debate concerning syllabification of word-final pre-vocalic /s/ in Ecuadorian Spanish is also considered in Bradley and Delforge (2006), while Bradley (2005) proposes a formal account which prevents resyllabification of pre-vocalic /s/ in Ecuadorian Spanish.

In summary, word-final pre-vocalic consonants in Spanish have received three different types of analysis in the theoretical literature: (1) as always undergoing complete resyllabification, (2) as alternating between complete and partial resyllabification, or (3) as alternating between complete resyllabification and none at all, depending on segment and dialect. It is, however, worth stressing that these different interpretations only pertain to cases where derived onsets pattern with canonical codas (see Table 1). Where there is no evidence of distinct phonological behaviour on the part of derived onsets, most accounts espouse the view that there is resyllabification. This approach, however, mostly builds on speaker intuitions and theoretical considerations, without much independent empirical support. In contrast, existing phonetic findings problematize assumptions concerning complete resyllabification even in apparently transparent cases.

In its strong formulation, a complete resyllabification analysis would predict derived onsets to be phonetically indistinguishable from word-initial onsets. Contrary to this prediction, however, existing data on the phonetic realization of derived onsets in Spanish show that derived onsets differ phonetically from canonical ones. Hualde and Prieto (2014) analyze the duration of pre-vocalic /s/ in different prosodic positions (VsV, V#sV, Vs#V) in Madrid Spanish, and find that word-final pre-vocalic /s/ is shorter than /s/ in word-initial or word-medial onsets. According to Hualde and Prieto, these subphonemic differences point towards the direction of sound change observed in many dialects, where derived onsets (Vs#V) pattern distinctly from canonical onsets (VsV or V#sV). The results also problematize the complete resyllabification hypothesis, and they raise empirical questions concerning the realization of /s/ in derived onsets (Vs#V) compared to canonical codas (Vs#C). If word-final /s/ showed similar duration regardless of whether the following segment was a consonant or a vowel, this would call into question whether resyllabification applies at all. If, however, word-final pre-vocalic /s/ still differs in duration from word-final pre-consonantal /s/, this could suggest that a form of resyllabification applies.

Given the central role of resyllabification in the phonological literature on Spanish, and considering the findings by Hualde and Prieto (2014), the current study sets out to provide further empirical insights on Spanish resyllabification. We examine the acoustic properties of derived onsets in Spanish, in order to address the question concerning their syllabic affiliation. In our investigation, we focus on word-final /s/ in central and northern varieties of Peninsular Spanish. We chose /s/, as acoustic segmentation of /s/ is more reliable than segmentation of other consonants which commonly appear in word-final coda positions in Spanish (/l/, /r/, or /n/). In addition, focusing on /s/ allows us to compare our results to those by Hualde and Prieto (2014), using a different type of speech materials. Hualde and Prieto’s data come from a spontaneous speech corpus (based on a map task), and therefore the findings concerning /s/ duration are subject to potential confounds related to prosody and speech rate. Consequently, the argument concerning resyllabification would be stronger if the findings were replicated in a more controlled context. In our study, we compare the acoustic realisation of derived onset /s/ (Vs#V) with canonical coda environments (VsC, Vs#C) and canonical onset environments (VsV, V#sV). As a point of departure, we take the strong resyllabification hypothesis, where word-final pre-vocalic /s/ becomes structurally identical to word-initial onset /s/. Assuming that only phonological (but not morphological or lexical) information is available to phonetics, the strong resyllabification hypothesis would predict that derived onsets pattern phonetically with canonical onsets. On the other hand, if there is no resyllabification, we would expect derived onsets to pattern with canonical codas. In our analysis, we also consider the role of word prosody on duration, analyzing /s/ in word-medial /s/ onsets and codas, compared to /s/ at word boundaries. 
Furthermore, we compare the duration of singleton /s/ in different prosodic positions to the duration of fake geminate /s/, where a word-final /s/ is followed by a word-initial /s/, with a view to exploring a broad spectrum of possible /s/ duration.

2 Materials and method

2.1 Stimuli

The test items used in our experiment contained /s/ in word-final and word-medial codas, /s/ in word-initial and word-medial onsets, as well as derived onset /s/ and fake geminate /s/, in a controlled segmental and prosodic environment. The full test items for each condition are listed in Table 2. For the coda context, we only included /s/ followed by a voiceless obstruent. This is because preceding a voiced consonant, /s/ may undergo voicing (Romero, 1999), which in turn may shorten the fricative. Effects of voice assimilation on the duration of the undergoer have been noted for a number of languages (Jansen, 2004), including the Quito dialect of Spanish (Strycharczuk et al., 2013). For the intervocalic environments, we kept the vowels surrounding /s/ constant. We also avoided stress on the vowel immediately preceding /s/, or in the syllable into which /s/ could be potentially resyllabified. This was done to avoid potential confounds on duration induced by stress assignment. We kept the same stress template for all the stimuli. The test items were embedded in a fixed carrier phrase: Ahora digo … ‘I now say …’.


Condition            Template   Item              Gloss

Word-initial onset   V1#sV2     cruce sagrado     ‘sacred crossing’
                                fraude sabido     ‘known fraud’
                                base salada       ‘salty base’

Word-final coda      V1s#T      viajes pagados    ‘paid trips’
                                jefes casados     ‘married bosses’
                                meses pasados     ‘past months’

Derived onset        V1s#V2     redes atadas      ‘tied nets’
                                peces asados      ‘fried fish’
                                reses adultas     ‘adult cattle’

Word-medial onset    V1sV2      gran pesadilla    ‘big nightmare’
                                no desalientes    ‘don’t get discouraged’
                                haz quesadillas   ‘make quesadillas’

Word-medial coda     V1sT       diez estatutos    ‘ten statutes’
                                tres espaguetis   ‘three spaghetti’
                                seis españoles    ‘six Spaniards’

Fake geminate        V1s#sV2    cruces sagrados   ‘sacred crossings’
                                fraudes sabidos   ‘known frauds’
                                bases saladas     ‘salty bases’

Table 2

Test items.
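The design in Table 2 can be written down as a small data structure. The following is a minimal sketch, not the materials script used in the study: the dictionary layout, the condition labels used as keys, and the helper name build_sentences are assumptions made for illustration. Only the items and the carrier phrase come from the text.

```python
# Illustrative encoding of the stimuli in Table 2.
# The structure and names below are assumptions, not the authors' script.
STIMULI = {
    "V#sV": ["cruce sagrado", "fraude sabido", "base salada"],          # word-initial onset
    "Vs#T": ["viajes pagados", "jefes casados", "meses pasados"],       # word-final coda
    "Vs#V": ["redes atadas", "peces asados", "reses adultas"],          # derived onset
    "VsV": ["gran pesadilla", "no desalientes", "haz quesadillas"],     # word-medial onset
    "VsT": ["diez estatutos", "tres espaguetis", "seis españoles"],     # word-medial coda
    "Vs#sV": ["cruces sagrados", "fraudes sabidos", "bases saladas"],   # fake geminate
}

CARRIER = "Ahora digo {item}"  # fixed carrier phrase from Section 2.1


def build_sentences(stimuli):
    """Return (condition, full sentence) pairs for every test item."""
    return [
        (condition, CARRIER.format(item=item))
        for condition, items in stimuli.items()
        for item in items
    ]


sentences = build_sentences(STIMULI)
print(len(sentences))  # 6 conditions x 3 items
```

Keeping the condition label next to each sentence makes it straightforward to attach the label to every recorded token later on.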

2.2 Participants

The participants were 11 native speakers of Peninsular Spanish (10 female, 1 male). We recruited speakers who were born and grew up in the northern or central part of Spain. Some of the speakers came from bilingual parts of Spain (Asturias, Galicia, Catalonia, Valencia, and the Balearic Islands), but we only included speakers who had grown up in monolingual Spanish households (see Section 2.5 and the Appendix for further discussion on this point). All of the speakers were living abroad at the time of the recording. Table 3 summarizes the speaker data. Participation was voluntary, and the speakers did not receive any remuneration. They were all naïve about the purpose of the recording.

Participant code Sex Age Province of origin

2A F 36 Madrid
2B F 32 Lugo
2C F 37 Madrid
2D F 26 Asturias
2E F 35 Barcelona
2F M 29 Albacete
2G F 26 Castellón
2H F 47 Madrid
2I F 33 Balearic Islands
2J F 27 Madrid
2K F 36 Madrid

Table 3

Participant details.

2.3 Procedure

The recordings were made in a laboratory setting, in a sound-attenuated room at two different sites. Speakers 2A, 2B, 2C, 2D, 2E, and 2F were recorded using Adobe Audition CS6 software, version 5.0.2, and a Roland Quad Capture UA-55 Audio Interface. The microphone was a Sennheiser MKH416T. Speakers 2G, 2H, 2I, 2J and 2K were recorded on an Apple iMac, using Digidesign Pro Tools LE8 software and a Digidesign DIGI003 recording interface. The microphone was a Neumann U89i. For all the recordings, the speakers were positioned ca. 30 cm away from the microphone. The participants read four repetitions of the experimental material out loud. The test items were semi-randomized in blocks for each speaker (excluding immediate repetitions in neighbouring blocks) and presented on a computer screen, one at a time. The experiment was self-timed. The speakers were instructed to speak as naturally as possible. They were also encouraged to correct themselves if they made a mistake, by repeating the entire sentence. Altogether, 792 tokens were recorded, excluding instances when participants corrected themselves (18 items × 4 repetitions × 11 participants). The audio data were sampled at 44,100 Hz with a 16-bit depth.
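The block randomization and the token count can be sketched as follows. This is an illustration under stated assumptions, not the presentation script used in the experiment: we assume that "excluding immediate repetitions in neighbouring blocks" means that the last item of one block may not be the first item of the next, and the function name block_orders and the placeholder item labels are hypothetical.

```python
import random


def block_orders(items, n_blocks, rng):
    """Shuffle the items once per block.

    Assumed constraint: a block may not start with the item that
    ended the previous block (no immediate repetition across blocks).
    """
    orders = []
    for _ in range(n_blocks):
        order = list(items)
        rng.shuffle(order)
        # Reshuffle until the block boundary does not repeat an item.
        while orders and order[0] == orders[-1][-1]:
            rng.shuffle(order)
        orders.append(order)
    return orders


# Placeholder labels standing in for the 18 test items.
items = [f"item{i:02d}" for i in range(1, 19)]
orders = block_orders(items, n_blocks=4, rng=random.Random(1))

# Token count reported in the text: 18 items x 4 repetitions x 11 participants.
n_tokens = len(items) * 4 * 11
print(n_tokens)
```

Passing an explicit random.Random instance keeps each speaker's order reproducible if a different seed is used per speaker.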

2.4 Segmentation and measurements

The data were analyzed using Praat version 5.3.59 (Boersma & Weenink, 2009) on a 5 ms Gaussian window. The acoustic signal was segmented using EasyAlign for Spanish (Goldman, 2011). The boundaries for /s/ and its surrounding segments (preceding vowel, following vowel/consonant) were further inspected and adjusted manually by the first author to comply with the following segmentation criteria. We defined the onset of /s/ as the onset of frication visible in the region of 3–5 kHz (and higher). This typically coincided with the offset of the formant structure for the preceding vowel, and proved to be a consistent segmentation criterion. Defining the offset of /s/ was more problematic, since we often found that high-intensity frication was followed by a short period of transition characterized by very low-intensity frication before the acoustic landmarks of the following segment (formants for vowels, antiformants in case of nasals, voicing) became visible. We could not identify a consistent criterion which would allow us to determine precisely the offset of such weak frication before consonants, especially preceding stop closure. This was additionally complicated by the fact that, in many cases, complete closure was not made for the following stop, and so weak acoustic energy could be seen at the offset of the fricative and in the following stop with no clear transition. The observations of incomplete stop closure in our data are consistent with previous reports concerning lenition in Spanish (Hualde et al., 2011; Torreira & Ernestus, 2011). Due to this issue, we chose to place the offset of the fricative at the offset of high-intensity frication. Example segmentations are illustrated in Figure 1. The onset of the vowel preceding /s/ (V1) was placed at the onset of visible formant structure. In a number of instances, the obstruent preceding the vowel was lenited, and formant structure was visible during the obstruent. 
In such cases, we relied on intensity transitions to identify the vowel onset. We used the same criteria to identify the offset of the vowel following /s/.

Figure 1 

Example segmentations.

In 16 cases, a speaker paused after word-final /s/; the pause could be accompanied by glottalization when the following word began with a vowel. We excluded all such cases, which left us with 776 tokens for further analysis.

Based on the segmentation described above, the following measurements were made.

  1. Duration measurements: duration of /s/, V1, and V2 (for intervocalic fricatives).
  2. Spectral characteristics of /s/: First four spectral moments (centre of gravity, standard deviation, skewness, and kurtosis) were measured for /s/ based on time-averaged Discrete Fourier Transforms (DiCanio, 2013).
  3. Spectral characteristics of V1 and V2: F1 and F2 of the surrounding vowels were measured, using the Burg algorithm in Praat, at 5 equidistant intervals throughout the 50-ms window preceding and following the fricative.
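The four spectral moments listed above can be illustrated with a short computation. This is a minimal sketch and not the measurement script cited in the text: it assumes that a time-averaged power spectrum is already available as parallel lists of frequencies and power values, and it reports excess kurtosis (subtracting 3), which is one common convention and an assumption here. The function name spectral_moments and the toy spectrum are hypothetical.

```python
import math


def spectral_moments(freqs_hz, power):
    """Return centre of gravity, standard deviation, skewness and
    excess kurtosis of a power spectrum (power-weighted moments)."""
    total = sum(power)
    cog = sum(f * p for f, p in zip(freqs_hz, power)) / total

    def central(k):
        # k-th central moment of frequency, weighted by power.
        return sum(((f - cog) ** k) * p for f, p in zip(freqs_hz, power)) / total

    sd = math.sqrt(central(2))
    skewness = central(3) / sd ** 3
    kurtosis = central(4) / sd ** 4 - 3.0  # excess kurtosis (assumed convention)
    return cog, sd, skewness, kurtosis


# Toy symmetric spectrum (made-up values, for illustration only).
freqs = [4000.0, 5000.0, 6000.0]
power = [1.0, 2.0, 1.0]
cog, sd, skew, kurt = spectral_moments(freqs, power)
```

For a spectrum that is symmetric around its centre of gravity, as in the toy example, the skewness is zero; energy concentrated at lower frequencies with a tail towards higher frequencies gives positive skewness.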

2.5 Statistical analysis

Statistical analysis was performed using R (R Development Core Team, 2005) version 3.0.1. We analyzed the individual measurements using linear mixed-effects regression models in the nlme package (Pinheiro et al., 2014). For each variable, we fitted a linear mixed-effects regression model with Condition (Vs#V, V#sV, Vs#sV, Vs#T, VsT, VsV) as a fixed effect. The intercept was set to the Vs#V condition in all cases, in order to make comparisons between the derived onset environment and other conditions. The random effects structure was kept maximal, following Barr et al. (2013). This structure included random intercepts for Speaker and Item as well as a random slope for Condition within Speaker. In our modelling procedure, we further considered whether the results were different for speakers who came from monolingual parts of Spain, compared to speakers who came from bilingual regions. For every phonetic parameter, we added Region as an additional fixed predictor, as well as an interaction between Region and Condition. The effect of Region and its interaction with Condition was not significant in any case, and therefore we do not include them in the models reported in Section 3 below. In the Appendix, we report the results of a model of /s/-duration (the key phonetic parameter in our study) which includes the (non-significant) interaction.
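The model structure described above can be written out as an equation. The notation below is an assumption introduced for illustration and does not appear in the original text: y is the measured variable for speaker s and item i, the reference level is Vs#V, and x_{k,i} is an indicator that equals 1 when item i belongs to the k-th of the five non-reference conditions.

```latex
% beta_0 : mean of the reference condition (Vs#V)
% beta_k : difference between condition k and Vs#V
% u_{0s} : by-speaker random intercept
% u_{ks} : by-speaker random slope for condition k
% w_{0i} : by-item random intercept
y_{si} = \beta_0 + u_{0s} + w_{0i}
       + \sum_{k=1}^{5} \left( \beta_k + u_{ks} \right) x_{k,i}
       + \varepsilon_{si}
```

Under this coding, each fixed-effect coefficient reported for a condition is read as a difference from the derived onset environment, which is why the intercept is set to Vs#V.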

3 Results

3.1 /s/ duration

The duration of /s/ was systematically affected by condition (F=45.10, df=182, p<0.001). The differences between derived onsets and other environments are summarized in Table 4, and the overall results are plotted in Figure 2. The average duration of derived onset /s/ was 11.84 ms greater than that of /s/ in a word-final coda, but 11.68 ms less than that of /s/ in a word-initial onset. Word-medial codas had the shortest duration, 17.24 ms shorter than derived onsets. Fake geminates had the longest /s/ duration, 36.29 ms longer than derived onsets. The model results also show a small (5.67 ms) but nevertheless significant duration difference between word-medial onsets and derived onsets, with word-medial onsets being longer. We further performed pairwise comparisons between all the individual conditions, by changing the intercept for Condition and re-running the model. As summarized in Table 5, all pairwise comparisons were significant except the difference in /s/ duration between word-medial codas and word-final codas.

Predictor Level β SE t p

(Intercept) 76.23 3.31 23.04 <0.001
Condition Vs#T –11.84 3.43 –3.45 <0.001
Condition VsT –17.24 3.18 –5.42 <0.001
Condition VsV 5.67 2.02 2.80 <0.01
Condition V#sV 11.68 2.20 5.31 <0.001
Condition Vs#sV 36.29 3.78 9.59 <0.001

Table 4

Summary of fixed effects in a linear mixed-effects regression model predicting /s/ duration (in ms) with Condition as a fixed predictor, random intercepts for Speaker and Item, and a random slope for Condition within Speaker.
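Because the intercept in Table 4 corresponds to the Vs#V condition, the population-level estimate for every other condition is the intercept plus that condition's coefficient. The short sketch below works through this arithmetic using the values in Table 4. The variable names are hypothetical, and the results are fixed-effect estimates only, ignoring the random effects.

```python
# Fixed-effect estimates from Table 4 (in ms); Vs#V is the reference level.
INTERCEPT = 76.23
BETAS = {
    "Vs#T": -11.84,   # word-final coda
    "VsT": -17.24,    # word-medial coda
    "VsV": 5.67,      # word-medial onset
    "V#sV": 11.68,    # word-initial onset
    "Vs#sV": 36.29,   # fake geminate
}

# Estimated mean /s/ duration per condition: intercept + coefficient.
fitted = {"Vs#V": INTERCEPT}
fitted.update({cond: round(INTERCEPT + beta, 2) for cond, beta in BETAS.items()})

# Conditions ordered from shortest to longest estimated /s/ duration.
ranking = sorted(fitted, key=fitted.get)

for cond in ranking:
    print(f"{cond:6s} {fitted[cond]:7.2f} ms")
```

The resulting order, from word-medial codas through word-final codas, derived onsets, word-medial onsets and word-initial onsets to fake geminates, follows directly from the signs and sizes of the coefficients.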

Figure 2 

Fitted values of /s/ duration depending on the context.

         VsT    Vs#T   Vs#sV   V#sV   VsV

Vs#V     ***    ***    ***     ***    **
VsV      ***    ***    ***     **
V#sV     ***    ***    ***
Vs#sV    ***    ***
Vs#T     n.s.

Table 5

Summary of significance thresholds for pairwise comparisons of /s/ duration in different prosodic conditions.

*** p<0.001, ** p<0.01, * p<0.05.

The inclusion of a random slope for Condition within Speaker significantly improved the model fit, compared to a model with random intercepts only (p<0.001). This result indicates individual variation with respect to how Condition affected /s/ duration. The variation is illustrated in Figure 3, which shows fitted values for /s/ duration, depending on Condition and Speaker. In order to improve plot legibility, we removed the fake geminate condition, which is not crucial to distinguishing between the experimental hypotheses. For most speakers, we find the general duration trend: word-medial codas < word-final codas < derived onsets < word-medial onsets < word-initial onsets. Speaker 2J was a clear exception, showing very little variation in /s/ duration, depending on condition. For speakers 2A and 2H there was very little difference in duration between word-final and word-medial codas, whereas for speaker 2I, word-medial codas were longer than word-final codas. Whilst we can generalize that derived onsets were longer than codas, there were some exceptions. For speakers 2D and 2E, derived onsets had greater duration compared to word-medial codas, but not word-final codas. When it comes to the duration of derived onsets vs. canonical onset environments, some speakers (e.g., 2C, 2D, 2E, 2G) showed a relatively clear trend consistent with the generalization made for the population, i.e., increase in duration in the direction derived onset < word-medial onsets < word-initial onsets. Other speakers (e.g., 2A, 2K) showed relatively little difference between derived onsets and word-medial onsets. Finally, there were also speakers (e.g., 2B, 2I, 2J) for whom word-medial onsets patterned with word-initial onsets. A difference between derived onsets and word-initial onsets was present for all speakers.

Figure 3 

Plot of fitted values for /s/ duration (in ms) depending on condition and speaker.

Finally, we also considered the distribution of /s/ duration in derived onsets. This is important in the context of assessing the amount of prosodic variation at word junctures in our data. Even though we excluded the clear cases of disfluencies and hesitations, as manifested by pauses or initial vowel glottalization (see Section 2.4), we might still find categorical variation in the data, where /s/ sometimes patterns with codas and sometimes with onsets. However, in such a case, we would expect a bimodal distribution of /s/ in the derived onset condition, as well as increased standard error, compared to word-medial conditions. Yet we find no evidence of either. As shown in Figure 4, derived onset /s/ followed a unimodal distribution. Table 4 shows that all experimental conditions involved comparable standard error levels. We also inspected the normality of distribution for our /s/-duration data pooled together, and found that the distribution was right-skewed. This kind of deviation from normality may produce a Type I error in a model. In order to assess this for our data, we normalized the dependent variable, by extracting its square root, and re-fitted the model. No major differences were found between the models with a non-normalized and normalized dependent variable (none affecting the significance thresholds). Considering that the effect sizes are more directly interpretable when the dependent variable is not normalized, the non-normalized version is the model we report.

Figure 4 

Distribution of /s/ duration (in ms) in the derived onset environment.
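The normality check described above for the pooled /s/-duration data can be illustrated with a small computation of sample skewness before and after a square-root transformation. This is a sketch under stated assumptions: the duration values are made up for illustration and are not data from the study, and the helper name skewness is hypothetical.

```python
import math


def skewness(values):
    """Moment-based sample skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up, right-skewed durations in ms (NOT data from the study).
durations_ms = [52, 58, 61, 63, 66, 70, 74, 81, 95, 130]

raw_skew = skewness(durations_ms)
sqrt_skew = skewness([math.sqrt(d) for d in durations_ms])

# A positive value indicates a longer right tail; the square-root
# transformation compresses that tail and lowers the skewness here.
print(raw_skew > 0, sqrt_skew < raw_skew)
```

Refitting the model on the transformed variable and comparing significance thresholds, as described in the text, then shows whether the skew affects the conclusions.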

3.2 V1 duration

Condition was found to have a significant main effect on V1 duration (F=14.83, df=182, p<0.001). Vowels preceding derived onsets were on average 7.25 ms longer than vowels preceding word-final codas. V1 duration was also greater in the context of derived onsets compared to vowels preceding word-medial codas (β=–17.28) and word-medial onsets (β=–11.43). In contrast, vowels preceding fake geminate /s/ were 5.64 ms longer compared to derived onsets. No significant difference was found between the durations of vowels preceding derived onsets and word-initial onsets. The fixed effects for the V1 duration model are summarized in Table 6. No significant difference in model fit was found between the model with a maximal random effect structure and a model with random intercepts only, indicating no significant inter-speaker differences.

Predictor Level β SE t p

(Intercept) 68.26 4.18 16.34 <0.001
Condition Vs#T –7.25 2.06 –3.51 <0.001
Condition VsT –17.28 2.92 –5.92 <0.001
Condition VsV –11.43 2.98 –3.83 <0.001
Condition V#sV 0.09 2.06 –0.04 0.97
Condition Vs#sV 5.64 2.13 2.65 <0.01

Table 6

Summary of fixed effects in a linear mixed-effects regression model predicting V1-duration (in ms) with Condition as a fixed predictor, random intercepts for Speaker and Item, and a random slope for Condition within Speaker.

We further tested whether the remaining prosodic positions differed significantly from each other with respect to V1 duration. The comparisons are summarized in Table 7.

VsT Vs#T Vs#sV V#sV VsV

Vs#V *** *** *** **
VsV ** *** ***
V#sV *** *** *
Vs#sV *** ***
Vs#T ***

Table 7

Summary of significance thresholds for pairwise comparisons of V1 duration in different prosodic conditions.

3.3 Other variables

Using the procedure described in Section 2.5, we did not find a significant main effect of Condition on V2 duration (measured for conditions involving intervocalic /s/ only). Neither was there a significant effect of Condition on formant frequencies (F1 or F2) of V1 or V2. Similarly, the first four spectral moments (centre of gravity, standard deviation, skewness, kurtosis) were not systematically affected by Condition.

4 Discussion

4.1 Main findings

The prosodic characteristics (syllabic affiliation and position within words) systematically affected two acoustic cues: /s/ duration and V1 duration. For /s/ duration in word-initial onsets, we observe lengthening compared to word-medial onsets. This observation is in line with earlier findings for other languages, which also show segmental lengthening adjacent to relatively stronger prosodic boundaries (Byrd, 2000; Byrd & Saltzman, 2003; Fougeron & Keating, 1997; Turk & Shattuck-Hufnagel, 2000), although note that Hualde and Prieto (2014) found no significant lengthening in word-initial /s/ compared to word-medial /s/ in Spanish. Independently of the boundary lengthening effects, we also observe a difference based on syllabic affiliation: onset /s/ had generally greater duration than coda /s/, as previously observed, e.g., in French (Fougeron, 2007; Fougeron et al., 2003). Vowel duration also shows some boundary lengthening effects, as it is greater preceding word-initial onsets or word-final codas than in word-medial positions. At the same time, vowel duration is smaller preceding coda /s/ than preceding onset /s/ word-medially, and we see a similar difference in V1 duration between onset and coda /s/ at word juncture. Thus, we can generalize that vowel duration is greater in open syllables compared to closed syllables, which, once again, is a well-established cross-linguistic generalisation (Maddieson, 1997).

Having considered how segmental environment and boundary conditions affect duration, let us now turn to the question of how derived onsets compare to the baseline contexts. For /s/ duration, we find a significant difference between derived onsets and all other environments, consistent with the findings on intervocalic /s/ by Hualde and Prieto (2014). Contrary to the predictions of a strong resyllabification hypothesis, we find that derived onset /s/ is not as long as word-initial onset /s/. It is also important to note that while derived onsets are systematically shorter than canonical onsets, they are longer compared to canonical codas. This suggests that timing of word-final /s/ is indeed affected by the presence of a vowel following the word-boundary.

In contrast to /s/ duration, the results for V1 duration are consistent with the predictions of the resyllabification hypothesis. V1 duration was the same preceding word-initial onsets and derived onsets, and greater in both than in the word-final coda context. This suggests that derived onset /s/, unlike word-final coda /s/, does not condition the shortening of a preceding vowel.

We also note a degree of inter-speaker variation in our data with respect to /s/ duration, as individual speakers did not always show the same trends as we find for the population. Speaker 2J showed very little systematic variation depending on condition. For speakers 2D and 2E, derived onsets patterned with word-final codas with respect to /s/ duration. The strong resyllabification hypothesis is not confirmed for any of the speakers, as we always find increased /s/ duration in word-initial onsets compared to derived onsets. Word-medial onsets, on the other hand, had similar /s/ duration compared to derived onsets for some speakers: 2A and 2K. Importantly, there was individual variation with respect to word-final and word-initial lengthening, which suggests that both of these phenomena may be considered strong trends, but not absolute generalizations for our speaker population.

4.2 Phonological representation and the phonetic duration effects

As a point of departure in our study, we took the hypothesis that resyllabification would render derived onsets and word-initial onsets phonetically indistinguishable. That prediction is clearly not borne out by our data, as the two groups show consistent differences in duration. Let us now consider weaker versions of the resyllabification hypothesis, where derived onsets are still analyzed as phonological onsets, but their unique properties follow from other aspects of the representation.

One possible approach is to admit partial resyllabification which gives rise to intermediate phonetic realizations. This can be expressed through an ambisyllabic representation, schematized in (3). Representations like this have featured in numerous cases where segmental processes seem to be conditioned by syllable position, yet they do not unambiguously target onsets or codas (Gussenhoven, 1986; Hammond, 1999; Kahn, 1976, inter alia).

    1. (3)
    1. An ambisyllabic representation of word-final pre-vocalic /s/

In the ambisyllabic approach, word-final pre-vocalic /s/ is affiliated not with one, but with two neighbouring syllables. By virtue of this representation, derived onsets are distinguished from canonical onsets and from canonical codas, and thus they may be phonetically different from either category. At the same time, derived onsets share parts of representation with codas and onsets, and so they may undergo the same phonological processes that target either of those categories.

However, while ambisyllabicity allows us to capture the fact that there is a three-way distinction between derived onsets, canonical onsets, and canonical codas, it is less clear whether it also makes any specific predictions about the phonetic realization of ambisyllabic segments. In the phonological literature, ambisyllabic representations have been discussed in the context of intermediate phonetic realizations (Gick, 2003), but they have also been considered as a way of representing geminates. The link between geminates and ambisyllabicity is supported by some psycholinguistic evidence. For instance, in syllable boundary elicitation tasks, consonants spelled with a double letter are more likely to be classified as ambisyllabic (Derwing, 1992; Treiman & Danis, 1988; Treiman & Zukowski, 1990). Ambisyllabicity has also been argued for geminates based on their phonetic and phonological properties. Ridouane (2010) shows that geminates in Tashlhiyt Berber have considerably increased duration compared to singleton consonants, and in addition, lexical geminates (as opposed to derived geminates) are enhanced by other acoustic parameters, such as vowel duration, RMS amplitude, and systematic presence of stop release. Furthermore, Ridouane argues that geminates in Tashlhiyt Berber pattern with singleton consonants with respect to some phonological processes (e.g., palatalization), but they show cluster-like properties in poetic versification. Ridouane (2010) resolves this ‘geminate ambiguity’ by proposing an ambisyllabic representation, where a geminate consonant is a single segment, which is simultaneously linked to a coda and to an onset position. Arvaniti (2001) argues for a similar ambisyllabic representation of geminates in Cypriot Greek, also showing that geminates combine the properties of singleton segments and clusters. Specifically, geminates have increased phonetic duration compared to singletons. At the same time, however, geminates do not pattern with clusters as palatalization targets. For Galician, Colina and Díaz-Campos (2006) use ambisyllabicity to represent derived geminates, which show intermediate duration between singleton onsets and lexical geminates.

Based on the current data, fake geminates in Spanish behave phonetically much like geminates in Tashlhiyt Berber and Cypriot Greek, showing increased segmental duration compared to singleton word-initial onsets. However, they also form a distinct category from derived onsets (which are shorter than fake geminates). Therefore, if the ambisyllabic representation is used to capture the durational properties of fake geminates, derived onsets must be represented differently. The alternative would be to analyze derived onsets as ambisyllabic and to represent fake geminates as clusters. This solution, however, raises the question of which more universal phonetic predictions follow from ambisyllabic representations. In the case of Tashlhiyt Berber and Cypriot Greek, we see arguments for analyzing geminates as single segments, as well as evidence of increased phonetic duration. An ambisyllabic representation can account for both of these observations, as geminates may share the properties of singleton onsets and codas on the phonological level, while phonetically, the two prosodic positions (onset and coda) may both contribute to the duration, making geminates relatively longer. If, on the other hand, an ambisyllabic structure is used to represent derived onsets, the phonetic prediction seems to be that ambisyllabicity yields intermediate segmental duration (between the durations of a canonical onset and a canonical coda). In a cross-linguistic perspective, these two diverging predictions cannot be upheld simultaneously. It is, of course, possible that ambisyllabicity may have different durational correlates, depending on the language. However, if ambisyllabicity and segmental duration are not inherently linked, positing an ambisyllabic representation for derived onset /s/ in Spanish cannot explain the observed duration facts. Ambisyllabicity may capture the fact that derived onset /s/ is different from word-initial /s/, but not the fact that derived onset /s/ is comparatively shorter.

Another option for capturing the unique duration properties of derived onsets becomes possible if we assume that syllable boundaries may be misaligned with higher-level boundaries. A similar kind of representation is proposed by Colina (1997, 2006) to account for cases of /s/-aspiration in derived onsets (see Section 1 for example data on /s/-aspiration). Colina models resyllabification not as a process of secondary syllable structure assignment, but as a case of conflict between syllable structure (driven by constraints against onsetless syllables) and morphological alignment (requiring morpheme contiguity). Under this approach, the representation is enriched to include not just prosodic, but also morphological boundaries ([lo.s|o.tros]). Extending this proposal to our case directly would require that morphological information (morpheme boundary) be accessible to the phonetic component. This is because in the case of Peninsular Spanish /s/, syllable structure does not condition categorical allomorphy, but rather it has an effect on phonetic /s/-duration. Whether or not such interactions occur is a contentious question (see Section 4.3 below). However, let us consider a similar case of misalignment, where syllable boundaries are misaligned with prosodic word boundaries. Unlike morphological boundaries, prosodic boundaries are unambiguously visible to phonetic processes, as shown, for instance, by word-initial prosodic strengthening. A representation of prosodic misalignment is schematized in (4). Under this representation, derived onsets are distinguished from word-initial or word-medial onsets by their position within the word, not within the syllable, and the distinctions in phonetic duration follow directly from that.

    1. (4)
    1. Resyllabification with prosodic misalignment

The representation in (4) makes a connection with the lexical identity of the resyllabified consonant without stipulating that phonetics can interact with morphology or the lexicon directly. It can also account for the durational difference between word-initial onsets and derived onsets: word-initial onsets are subject to prosodic strengthening, whose phonetic correlate is increased duration (Fougeron & Keating, 1997; Keating et al., 2003), whereas derived onsets are not prosodic word-initial, and therefore do not undergo similar strengthening. However, the proposal runs into problems when we consider the distinction between derived onsets and word-medial onsets. If the distinction between derived onsets and word-initial onsets is only a matter of prosodic strengthening, we would expect derived onsets and word-medial onsets to pattern together. This, however, is not the case, as we find a small (5.67 ms), but nonetheless systematic difference between derived onsets and word-medial onsets. Similarly, Hualde and Prieto (2014) report that word-medial /s/ is longer than derived onset /s/ in their data. Note that word-finality does not help to account for this difference either, because word-final position is a site of lengthening relative to word-medial position, as shown, for instance, by the lengthening of word-final codas compared to word-medial codas in our data.

4.3 The relevance of syllable structure

As different versions of the resyllabification hypothesis predict category effects which are not closely matched by our data, we must revisit the question of whether apparent resyllabification effects really involve changes in syllable structure. Similar concerns are raised by Scobbie and Pouplier (2010), who argue against an ambisyllabic analysis of word-final pre-vocalic /l/ in English. Scobbie and Pouplier (2010) note that the gestural realisation of word-final /l/ is influenced much more strongly by the segmental context than by the ostensible syllabic position of the /l/ itself (onset/coda). For instance, /l/ followed by /h/ may be more onset-like than /l/ followed by /b/. This, and other aspects of gradient variation in the realisation of word-final /l/, can be captured in a gestural model, where sandhi effects emerge from different levels of overlap between individual gestures. However, once the gestural properties of neighbouring sounds are taken into account, the role of the syllable becomes redundant. The presence of a tongue tip gesture in pre-vocalic /l/ follows from the gestural organisation of /l/+V sequences. Therefore, if a gestural account is needed independently (to accommodate the difference between /l#h/ and /l#b/), a syllable-level analysis is superfluous.

Translating this discussion back to our Spanish data, the ‘resyllabification effect’ (lengthening of /s/ in Vs#V compared to Vs#C) may not be a matter of language users parsing word-final pre-vocalic consonants as onsets, but may follow from differences in timing between consonant-vowel and consonant-consonant sequences. This kind of timing difference at word junctures is similar to what we see word-internally in Spanish, where /s/ is longer in VsV sequences than in VsT sequences. What coarticulation alone does not explain is the durational difference between word-initial, word-medial, and word-final pre-vocalic /s/. However, neither is this difference explained by a resyllabification account, where a pre-vocalic /s/ is always an onset. Instead, in order to explain the durational differences in pre-vocalic /s/, we must consider higher prosodic levels than the syllable, specifically prosodic effects at word junctures.

Segmental duration is commonly affected by word-level prosodic effects, such as initial strengthening (Fougeron & Keating, 1997), or final lengthening. These durational effects provide additional boundary cues, facilitating lexical access. In contrast, resyllabification potentially impedes lexical access, as predicted for instance by the Possible Word Constraint (Norris et al., 1997), since it creates a misalignment between a hypothesized syllable boundary and a word boundary. However, these potential issues with lexical access are not reflected in experimental findings. In a series of priming experiments, Gaskell, Spinelli, and Meunier (2002) find that resyllabification does not impede word recognition in French. On the contrary, word recognition seems to be slightly facilitated where resyllabification occurs. To account for this finding, Gaskell et al. (2002) hypothesize that listeners recognize resyllabified consonants as such, by paying attention to sub-phonemic cues which distinguish resyllabified consonants from word-initial ones. At the same time, the results from word recognition experiments question the central role of a syllable as a processing unit. While there is evidence that syllable structure plays a role in speech processing (e.g., McQueen, 1998), its role may be limited in some languages, such as French, where evidence from enchaînement (see Section 4.4) points instead towards specific word-initial representations and associated processing mechanisms.

Taken together, word boundary effects and effects of coarticulation yield a continuum of consonantal duration, which we also observe in our data (see Figure 2). The place of Vs#V sequences is intermediate in this continuum, reflecting a certain tension between gestural and lexical organisation. While the affiliation of /s/ with the following vowel at the gestural level increases segmental duration (in Vs#V compared to Vs#C), the increase in duration is limited by the affiliation of /s/ and the preceding word at the lexical/prosodic level. The tension between gestural and lexical organisation also provides a perspective on the behaviour of word-final pre-vocalic consonants in sound change. In Section 1, we drew a distinction between phonologically transparent diachronic developments in Spanish dialects, where derived onsets pattern with canonical onsets, and apparently opaque changes, where derived onsets pattern with canonical codas. However, we can also see these two trends in sound change as consistent with partitioning the durational continuum according to the requirements of either gestural cohesion or lexical membership of word-final consonants. In the former case, word-final pre-vocalic consonants pattern with other pre-vocalic consonants. In the latter case, word-final consonants behave uniformly, regardless of the following context. This interpretation involves no loss of generalisation compared to a syllable-based analysis, as either analysis also needs to include word-level effects in order to handle the unique durational properties of Vs#V sequences in Peninsular Spanish. A gestural re-analysis of the Spanish facts represents a more linear approach as opposed to a hierarchical syllabic one. The possibility of such re-analysis has often been pointed out in discussions concerning empirical evidence for the syllable: virtually any syllable-based generalisation can also be expressed in non-hierarchical linear terms (see Côté, 2012, for a recent review). Of course, just because the syllable is ostensibly redundant does not necessarily entail that it does not play a role in language processing. Indeed, Côté (2012) cites a number of studies which make a compelling empirical case for the syllable. For instance, some articulatory models, e.g., Gafos and Goldstein (2012), include the syllable as an intermediate level between gestural representation and higher-level prosody to account for the asymmetries in the timing properties of onset and coda gestures. We must, however, note that while our current findings on Spanish do not constitute evidence against a syllable-based analysis, neither are they evidence in favour of such an account. This is important, because the role of the syllable has long been taken for granted in accounting for the realization of word-final pre-vocalic consonants in Spanish. A resyllabification analysis would be supported if there were a categorical onset-coda split in the duration data, or if the strong resyllabification hypothesis could be confirmed. The gradient complexity we find, on the other hand, suggests that the syllable may only have a peripheral role in explaining the durational properties of Spanish consonants, if it plays any role at all.

A question raised by a reviewer concerns the gestural account, and specifically how gestural coordination can be modelled from the formal point of view. We see two possible solutions. If we assume that gestural constraints interact with morphological or lexical boundaries directly, as suggested for instance by Hualde and Prieto (2014), we need a model that assumes no modular distinction between phonology and phonetics. An analysis which forfeits modularity in modelling Spanish syllable structure is proposed by Bradley (2006), who argues for an interaction of phonetic constraints on gestural phasing and higher-level constraints on alignment between syllable edges and morphological edges (note, however, that resyllabification still features in Bradley’s model). Another possibility is one where lexical or morphological word edges translate into prosodic boundaries, and these prosodic boundaries further interact with phonetic constraints on gestural phasing. Such an analysis involves a fairly powerful view of the language-specific phonetic component, while reducing the role of phonology, but it does not include direct interactions between phonetics and morphology or the lexicon. Whether or not such interactions occur is an empirical question, but not one that has been settled (Cho, 2001; Hanique & Ernestus, 2012; Hay, 2004; Lammert et al., 2014; Mousikou et al., 2015; Schuppler et al., 2012; Song et al., 2013; Sugahara & Turk, 2009). Thus, decisions on how to model the effects of word boundaries on gestural coordination must await further developments in this area.

4.4 Resyllabification in a cross-linguistic perspective and the Spanish opacity debate

The durational difference between derived onsets and word-initial onsets in Peninsular Spanish, however we choose to represent it in phonology, is of crucial relevance to the debate on phonological opacity in Spanish dialects. As previously discussed in Section 1, segmental processes affecting word-final /s/ in Spanish may only affect /s/ in canonical coda position, as is the case with Chinato Spanish /s/-aspiration, or they may also affect derived onsets, as we find, e.g., in /s/-aspiration in Rio Negro Argentinian Spanish. This variation in phonological behaviour of derived onsets has largely been interpreted in the phonological literature as follows. Cases where derived onsets pattern differently from codas have been taken to support the generalization that there is phonological resyllabification in Spanish: word-final pre-vocalic consonants are assumed to surface as onsets in the output of phonology, and thus they are not affected by segmental processes which target codas (e.g., Harris, 1983, on emphatic trilling). Once the resyllabification hypothesis is assumed, however, an independent explanation is required to account for cases where derived onsets pattern segmentally with canonical codas. Hence, much of the theoretical debate on the segmental phenomena in Spanish has focused on how phonological computation may derive opaque surface generalizations. The results from our study, however, cast serious doubt on one of the key assumptions on which the aforementioned theoretical debate rests, namely that derived onsets form a single phonological category with canonical onsets. As we have focused on Peninsular Spanish in our investigation, we cannot draw any conclusions about the status of resyllabification in other dialects, such as Rio Negro Argentinian Spanish or Quito Spanish, both of which have featured in the resyllabification and opacity discussion. Nevertheless, our findings warrant caution before complete resyllabification is assumed. Standard Peninsular Spanish appears to be a fairly straightforward case, where there are salient phonetic cues to resyllabification, such that linguists and native speakers have intuitions that resyllabification applies, and there are no segmental processes which target derived onsets and canonical codas to the exclusion of canonical onsets. Yet, speakers of this variety produce subtle phonetic distinctions which single out derived onsets as a unique prosodic category. The patterning of /s/ duration suggests that structural differences between canonical onsets and derived onsets are accessible to learners at a level of linguistic representation that connects directly to phonetic realization.

The intermediate patterning of derived onsets we find in Spanish bears many similarities to what has been reported for derived onsets in French and English. In French, word-final consonants have long been said to undergo resyllabification before a following vowel, in a phenomenon known as enchaînement. However, a number of experimental studies have shown that word-final pre-vocalic consonants are systematically shorter than word-initial consonants in a similar segmental context (Fougeron et al., 2003; Fougeron, 2007), similarly to what we find for Spanish. In addition to consonant duration effects, Fougeron et al. (2003) report increased vowel duration preceding canonical onsets, compared to derived onsets (enchaînement consonants), and Fougeron (2007) notes spectral differences in the surrounding vowels depending on the prosodic condition. Furthermore, psycholinguistic research points to specific processing properties of enchaînement consonants (see Section 4.3).

A case for the special status of derived onsets has also been made for English, based on their gestural properties. Gick (2003) presents electromagnetic data which compare the articulatory realizations of word-final pre-consonantal and pre-vocalic liquids and glides in English (e.g., how hotter and how otter) with canonical onsets (e.g., ha wadder). The results show that the degree of consonantal constriction in derived onsets is intermediate between canonical codas and canonical onsets. In addition, Gick finds some tendency for different timing patterns in derived onsets compared to word-initial onsets. Similarly, unique behaviour of derived onsets emerges from an electropalatographic study of English liquids by Scobbie and Pouplier (2010). According to Scobbie and Pouplier (2010), derived onset /l/ patterns with onset /l/ rather than with coda /l/ as far as alveolar contact is concerned. At the same time, however, derived onset /l/ shows dorsal retraction, which is not typical of canonical onsets. In addition, the gestural magnitude and timing in derived onset /l/ is not affected by the occurrence (or not) of initial vowel glottalisation, contrary to predictions of the resyllabification hypothesis.

Accumulating evidence from different languages suggests that complete resyllabification may be a lot less common than previously assumed. The only empirically well-supported case of complete resyllabification that we have been able to trace is Korean, where derived onsets (C#V) are indistinguishable from canonical onsets (#CV) based on a number of articulatory measures examined by Cho et al. (2014). In other cases, intermediate effects in the realization of derived onsets proliferate. Thus, in a cross-linguistic perspective, a parametric view of postlexical resyllabification becomes increasingly untenable.

5 Conclusion

In this study, we have shown that word-final pre-vocalic /s/ in Peninsular Spanish differs systematically in duration from /s/ in canonical onset or canonical coda positions. It remains to be seen whether our findings can be replicated for other speakers, and whether they also extend to other consonants, or to other dialects. Furthermore, the perceptibility of these small durational distinctions also invites further investigation. Nevertheless, already in their current form, our findings problematize a number of earlier theoretical treatments of phonological phenomena in Spanish. In particular, our results suggest that differences between canonical and derived onsets require a detailed representational account, rather than a computational solution geared towards deriving opaque generalisations. However, the appropriate representation may not necessarily be ambisyllabic, as proposed hitherto for cases of apparent partial resyllabification. We have argued that ambisyllabicity fails to make clear predictions concerning the phonetic realisation of ambisyllabic consonants. Instead, a full account of apparent resyllabification phenomena, in Spanish and in other languages, may require combined insights from gestural organization of speech and from lexical constraints on prosodic strengthening effects.

Competing Interests

The authors declare that they have no competing interests.