Stiffness and articulatory overlap in Moroccan Arabic consonant clusters

Kevin D. Roon (1, 2), Philip Hoole (3), Chakir Zeroual (4), Shihao Du (5), and Adamantios I. Gafos (2, 5)
1 Program in Speech-Language-Hearing Sciences, CUNY Graduate Center, New York, US
2 Haskins Laboratories, New Haven, Connecticut, US
3 Institut für Phonetik und Sprachverarbeitung, Ludwig-Maximilians-Universität München, München, DE
4 Université Mohammed Premier, Oujda, MA
5 Department Linguistik und Kognitionswissenschaften, Universität Potsdam, Potsdam, DE
Corresponding author: Kevin D. Roon (kroon@gc.cuny.edu)


Introduction
In many languages, two consonants in a cluster will agree in place, with the first consonant changing to match the place of the second (see, e.g., Jun, 2004, and references therein). It has been claimed (Jun, 1995, 2004; Monahan, 1993) that certain places are more likely to be triggers of place assimilation than others. Jun (2004) discusses the fact that coronals are the least likely to be triggers of place assimilation and velars are the most likely, providing an example from Korean:
(1) Korean place assimilation (data originally from Jun, 1996)
a. /ip+ko/ → [ikko] 'wear and'
b. /ip+tolok/ → [iptolok], *[ittolok] 'wear + causative marker'
Jun (2004) proposes that, given a cluster containing two consonants VC1C2, C1 is more likely to assimilate to the place of C2 when there is more overlap between the articulations of the consonants than when there is less. The proposal by Jun (2004) is a phonetically based approach to phonological assimilation, motivated by the effects of overlap on
These two components of the proposal made by Jun (2004) have not been tested experimentally. The first component presents a clear testable hypothesis:
H-J1: Articulations of tongue tip consonants should be faster than articulations of lower lip and tongue body consonants, and articulations of lower lip consonants should be faster than those of tongue body consonants, all things being equal.
Assuming H-J1 is supported, then the amount of overlap in a heterorganic cluster should be determinable by the combination of articulators in a cluster (cluster type). The expected overlap by cluster type is illustrated in Figure 2.
The middle column of Figure 2 shows the expected overlap when C2 is always a lower lip consonant (corresponding to Figure 1A-B). More overlap is expected when C1 is a tongue tip consonant than when C1 is also a lower lip consonant, since the tongue tip is putatively more rapid than the lower lip. More overlap is expected when C1 is a lower lip consonant than when C1 is a tongue body consonant, since the lower lip is putatively more rapid than the tongue body. The middle row of Figure 2 shows the expected overlap when C1 is always a lower lip consonant (corresponding to Figure 1C-D). More overlap is expected when C2 is a tongue body consonant than when C2 is also a lower lip consonant, since the lower lip is putatively more rapid than the tongue body. More overlap is expected when C2 is a lower lip consonant than when C2 is a tongue tip consonant, since the tongue tip is putatively more rapid than the lower lip. By the same reasoning, tongue tip-tongue body clusters should have more overlap than lower lip-tongue body clusters and tongue tip-lower lip clusters. Tongue body-tongue tip clusters should have less overlap than tongue body-lower lip clusters and lower lip-tongue tip clusters. No prediction is made about the overlap between lower lip-tongue body clusters and tongue tip-lower lip clusters, or between tongue body-lower lip clusters and lower lip-tongue tip clusters. Jun's proposal is therefore tantamount to the following hypothesis.
H-J2: The amount of overlap in a heterorganic cluster is predicted to decrease by cluster type in the following way: TT-TB > {LL-TB, TT-LL} > {TB-LL, LL-TT} > TB-TT, where no prediction is made between pairs within curly brackets.
It is possible that H-J1 is supported empirically while H-J2 is not. This would undermine Jun's proposal that the typological patterns of place assimilation are attributable to inherent properties of the articulators involved in producing a cluster. If empirical evidence does not support H-J1, then H-J2 is likely moot, since H-J2 is derived on the assumption that H-J1 is true. Abandoning H-J2 on those grounds, however, would also mean rejecting Jun's insight that the relative rapidity of the articulations involved in producing a cluster has a significant influence on overlap, merely because of the assumption that the rapidity of an articulation is solely a function of the articulator involved. It is possible to propose an alternative to H-J2 that maintains that the amount of overlap in a cluster will be a function of the relative rapidity of the articulations involved, without any reference to articulator.
H-3: The amount of overlap in a cluster increases as the rapidity of C1 increases compared to C2.
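The way H-J2 follows from H-J1 can be sketched in a few lines of Python. This is only an illustration: the integer ranks are assumptions standing in for the ordinal claim of H-J1 (tongue tip > lower lip > tongue body), and the function names are hypothetical. Cluster types that receive the same score are the ones for which H-J2 makes no prediction; the tie is a by-product of using ordinal ranks, not a claim that overlap is equal.

```python
from itertools import permutations

# Ordinal rapidity ranks implied by H-J1 (assumed values, not measured velocities).
RAPIDITY_RANK = {"TT": 3, "LL": 2, "TB": 1}


def predicted_overlap_score(c1, c2):
    """Higher score = more predicted overlap (C1 more rapid relative to C2)."""
    return RAPIDITY_RANK[c1] - RAPIDITY_RANK[c2]


def hj2_ordering():
    """Group the six heterorganic cluster types by predicted overlap, highest first."""
    groups = {}
    for c1, c2 in permutations(RAPIDITY_RANK, 2):
        score = predicted_overlap_score(c1, c2)
        groups.setdefault(score, []).append(f"{c1}-{c2}")
    return [sorted(groups[score]) for score in sorted(groups, reverse=True)]


if __name__ == "__main__":
    # Prints the ordering with tied (unranked) types inside curly brackets.
    print(" > ".join("{" + ", ".join(group) + "}" for group in hj2_ordering()))
```

Replacing the ranks with token-level measurements of rapidity for C1 and C2 gives the articulator-free logic of H-3.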
The goal of the present study was to test these hypotheses, since no work has been done to test experimentally the proposals by Jun (2004). In order to test these hypotheses, however, other known influences on cluster overlap needed to be taken into consideration, several issues of quantification needed to be addressed, and a suitable language for testing these predictions had to be identified. We address each of these topics below.

Test language: Moroccan Arabic
The present study used articulatory data from speakers of Moroccan Arabic (MA) gathered using 3D electromagnetic articulography ('EMA'; Hoole, Zierdt, & Geng, 2003). MA is a highly appropriate language for studying timing in consonant clusters since it permits a complex set of consonant combinations. Looking only at two-consonant stop-stop sequences, labial, coronal, and dorsal places of articulation are attested in all six combinations word-initially, word-medially, and word-finally, except for word-final coronal-labial combinations, as shown in Table 1. Several manners of coronal consonants (liquids, a rhotic tap, stops, and fricatives) are also widely found as members of two-consonant clusters. The freedom of clustering permitted in MA therefore offers a particularly apt case study for testing hypotheses regarding overlap, since it is possible within-speaker to independently control for and test the effects of several different factors affecting overlap.

Quantification
There are two aspects of articulation that are crucial to the proposal made by Jun (2004): the 'rapidity' of an articulation (which is a function of the 'inherent velocity' of the primary articulator), and overlap. However, no quantification of either of these terms is made explicit by Jun (2004), and there are multiple ways to quantify both terms. We motivate two different quantifications for each term that we used to refine and test the present hypotheses.
In order to discuss these quantifications, we first discuss how articulator movements themselves were identified and quantified. The articulator movement measured for each consonant was indexed by the EMA receiver corresponding to the consonant's primary oral articulator (lower lip: [b]; tongue tip: [d, t]; tongue body: [g, k]). Following Gafos (2002, p. 296), we assumed that the oral gesture of a consonant drives the timing relations among consonants.
Each gesture can be characterized by certain temporal landmarks, shown in Figure 3. The Onset of movement is the point in time where the articulator starts moving toward its required constriction, Target is the point in time when that constriction is achieved, and Release is the point in time when the articulator moves away from the constriction. The Plateau of the articulation is the interval between Target and Release when the articulator is maintaining the required constriction. These landmarks were calculated using MATLAB-based code for analyzing EMA data ('MVIEW'), developed at Haskins Laboratories by Mark Tiede. Onsets were identified algorithmically as the point at which the EMA sensor exceeded 20% of the maximum tangential velocity of the articulator during the closing phase, Targets were identified as the point at which the EMA sensor subsequently went below 20% of the same maximum tangential velocity, and Releases were identified as the point at which the EMA sensor exceeded 20% of the following maximum tangential velocity (comparable to the methods used by, e.g., Chitoran, Goldstein, & Byrd, 2002; Gafos, Hoole, Roon, & Zeroual, 2010; Oliveira & Teixeira, 2007, and many others). Further technical details about the acquisition and processing of the EMA data are included in the Methods section below.
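The 20% velocity criterion can be sketched as follows. This is not the MVIEW implementation: it is a minimal Python illustration that assumes the sample indices of the closing and opening velocity peaks are already known, and the function names `tangential_speed` and `find_landmarks` are hypothetical.

```python
import math


def tangential_speed(positions, fs=200.0):
    """Tangential speed between successive 3D samples (position units per second)."""
    return [math.dist(a, b) * fs for a, b in zip(positions, positions[1:])]


def find_landmarks(speed, closing_peak, opening_peak, threshold=0.20):
    """Return sample indices (onset, target, release) for one gesture.

    speed        -- tangential speed of the sensor, one value per sample
    closing_peak -- index of the peak speed of the closing movement (assumed known)
    opening_peak -- index of the following (release) peak speed (assumed known)
    """
    close_thr = threshold * speed[closing_peak]
    open_thr = threshold * speed[opening_peak]

    # Onset: earliest sample in the run leading up to the closing peak
    # whose speed exceeds 20% of that peak.
    onset = closing_peak
    while onset > 0 and speed[onset - 1] > close_thr:
        onset -= 1

    # Target: first sample after the closing peak that falls below
    # 20% of the same peak.
    target = closing_peak
    while target < opening_peak and speed[target] >= close_thr:
        target += 1

    # Release: first sample after the target that exceeds 20% of the
    # following peak.
    release = target
    while release < opening_peak and speed[release] <= open_thr:
        release += 1

    return onset, target, release
```

At a 200 Hz sampling rate each index step corresponds to 5 ms, so the Plateau duration in milliseconds is `(release - target) * 5`.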

Rapidity
Any targeted movement can be characterized by three basic kinematic properties: movement duration T, movement amplitude A, and peak velocity v̂ (Nelson, 1983). One way to calculate the rapidity of a given articulatory movement is to use the peak velocity v̂ of a fleshpoint on the relevant articulator as it moves to effect the constriction associated with the consonant. This measure, when applied to appropriate datasets that bring out the comparisons shown in Figure 1, would seem to offer the most direct test of the claims on overlap of Jun (2004). We therefore evaluated H-J1 using the peak velocity of the closing movement associated with each consonant as the index of the articulation's rapidity (H-J1 PV).
However, the use of peak velocity v̂ as an index of articulator rapidity is potentially problematic given two empirically well-documented relations between v̂ and the other two kinematic properties of movement duration T and movement amplitude A. The first relation is that a movement's peak velocity v̂ covaries with its amplitude A (Kent & Moll, 1976; Kuehn & Moll, 1976). This relation has been reported for a large variety of consonant-vowel and vowel-consonant sequences involving movements of tongue body, tongue tip, lips, and jaw (Ostry, Keller, & Parush, 1983) and has been described as an overall linear correlation (Kuberski & Gafos, 2019), with A-v̂ slopes steeper for faster than for slower speech rates and with decreasing strength of covariation as A increases (Vatikiotis-Bateson & Kelso, 1990, 1993). This relationship between A and v̂ is also predicted by the DIVA model of Guenther (1995). The other relation ties together all three kinematic properties: The ratio of peak velocity to amplitude, v̂/A, varies inversely with movement duration T (Kuberski & Gafos, 2019; Munhall, Ostry, & Parush, 1985; see also Sorensen & Gafos, 2016, for modeling) across manipulations of stress, vowel, and consonant identity (Fuchs, Perrier, & Hartinger, 2011). This ratio, which we denote as k′ (see below for further discussion), has been used as an index of articulatory stiffness (Kelso, Vatikiotis-Bateson, Saltzman, & Kay, 1985; Munhall et al., 1985; and others), a control parameter proposed in the motor-control literature that modulates the time-space behavior of an articulator in various ways (Cooke, 1980).
Stiffness is incorporated formally in both Articulatory Phonology (AP, Browman & Goldstein, 1986, et seq.) and the Task Dynamic Model of speech production (TDM, Browman & Goldstein, 1990; Munhall et al., 1985; Saltzman & Munhall, 1989), and is therefore relevant to theories of phonological representation and its phonetic implementation. Various researchers have investigated how stiffness may express different prosodic phenomena, including final lengthening (Edwards, Beckman, & Fletcher, 1991), timing across prosodic boundaries (Byrd & Saltzman, 1998), and intonation (Beckman & Edwards, 1992). An explanation of the task-dynamical system used in AP and the specification of gestures within the system is provided in Saltzman and Munhall (1989) and Browman and Goldstein (1990). Perhaps most relevant to the issues addressed in the present study is that Browman and Goldstein (1989, p. 229) raised the possibility that different articulators may have different inherent stiffness.
The TDM provides an invariant means of describing a dynamical system, parameterized by the variables of rest position (x0), stiffness (k), and damping (b). In this model, instantaneous velocity (ẋ) is obtained by the TDM, along with acceleration and position, from sensory-motor feedback in order to compute what articulator movements are necessary to achieve the task. Velocity is thus factored into calculating what articulations are required, but it is not manipulated directly in the system. Damping for non-laryngeal gestures in the TDM is set to a constant critical value. The variable for stiffness k, on the other hand, can be manipulated in the model. Manipulating stiffness with everything else held constant has the effect that gestures with higher stiffness values result in "shorter duration movements and greater peak velocity/movement amplitude ratios". Measured stiffness was therefore calculated for the closing movement of each of the articulations in each consonant as in (2):
(2) Measured stiffness: k′ = v̂/A
We also evaluated H-J1 using the measured stiffness of the closing movement associated with each consonant as another index of the articulation's rapidity. Given these two quantifications of rapidity, H-3 can be refined in two different formulations:
H-3 PV: The amount of overlap in a cluster increases as the peak velocity of C1 increases compared to the peak velocity of C2.
H-3 stiffness: The amount of overlap in a cluster increases as the measured stiffness of C1 increases compared to the measured stiffness of C2.
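The link between the model parameter k and the measured ratio k′ can be illustrated with a minimal Python sketch of a critically damped, unit-mass spring, whose closed-form trajectory is x(t) = x_target + (x_start - x_target)(1 + ωt)e^(-ωt) with ω = √k. The unit mass, the start and target positions, and the function names are assumptions made for illustration; this is not the TDM implementation. For this system, peak velocity divided by amplitude works out to ω/e, so quadrupling k doubles the measured ratio.

```python
import math


def closing_trajectory(k, x_start=1.0, x_target=0.0, duration=10.0, dt=0.001):
    """Sampled positions of a critically damped unit-mass spring with stiffness k.

    Critical damping for unit mass means b = 2 * sqrt(k).
    """
    omega = math.sqrt(k)
    n_samples = int(duration / dt) + 1
    distance = x_start - x_target
    return [
        x_target + distance * (1 + omega * i * dt) * math.exp(-omega * i * dt)
        for i in range(n_samples)
    ]


def measured_stiffness(positions, dt=0.001):
    """k' = peak velocity / movement amplitude, read off the sampled trajectory."""
    speeds = [abs(b - a) / dt for a, b in zip(positions, positions[1:])]
    amplitude = abs(positions[-1] - positions[0])
    return max(speeds) / amplitude


if __name__ == "__main__":
    for k in (1.0, 4.0):
        k_prime = measured_stiffness(closing_trajectory(k))
        print(f"k = {k}: measured k' = {k_prime:.3f}")
```

Because amplitude cancels out of the ratio, k′ separates how quickly a movement unfolds from how far the articulator travels, which is the motivation for using it alongside raw peak velocity.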

Overlap
Two indices of overlap were calculated, based on the landmarks in articulator trajectories shown in Figure 3. One measure of overlap used in the present study ('Relative Overlap') was calculated as the difference between the time points of C2 Onset and C1 Target, normalized by the duration of the C1 Plateau, as shown in Figure 3 and defined in (3). This index of overlap was chosen for the same reasons outlined in Jun (2004) and Chitoran et al. (2002): It provides a reasonable means of measuring the potential effect of C2 movement on the acoustic outcomes of C1. This index is the same as the relative measure used by Gafos et al. (2010) and comparable to the one used by Chitoran et al. (2002) in that it quantifies overlap as the proportion of the C1 Plateau that is coextensive with the C2 gesture. It is possible that the velocity and/or stiffness of an articulator movement may affect the length of its Plateau. Since the Relative Overlap measure described here has the Plateau of C1 as the denominator, any effects of velocity and/or stiffness on overlap using this measure may reflect differences in C1 Plateau duration rather than differences in overlap (all other things being equal, the faster C1 gestures will yield higher overlap measures). To address this, the simple lag time between the onsets of both gestures in milliseconds ('Onset Lag') was also used as a second measure of overlap, with the onset time of C2 subtracted from the onset time of C1 so that greater lag values indicate more overlap (or, putting this another way, given that values are typically negative, the more negative the values, the less overlap), calculated as in (4).
(4) Onset Lag = C1 Onset − C2 Onset
We evaluated hypotheses H-J2, H-3 PV, and H-3 stiffness with overlap first indexed as Relative Overlap (3) and then indexed as Onset Lag (4), since there is no agreed-upon norm for indexing overlap, and Jun (2004) does not make this explicit.
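The two overlap indices can be sketched in Python from the gesture landmarks. Onset Lag follows the prose definition directly (C1 onset minus C2 onset). For Relative Overlap, the sketch takes the prose literally as (C2 onset minus C1 target) divided by the C1 Plateau duration; the sign convention is an assumption, since the defining equation is not reproduced here. The `Gesture` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gesture:
    """Landmark times of one consonant gesture, in milliseconds."""
    onset: float
    target: float
    release: float

    @property
    def plateau(self):
        return self.release - self.target


def onset_lag(c1, c2):
    """C1 onset minus C2 onset: typically negative, closer to zero = more overlap."""
    return c1.onset - c2.onset


def relative_overlap(c1, c2):
    """(C2 onset - C1 target) / C1 plateau duration.

    ASSUMED sign convention: 0 means C2 starts moving at the C1 target,
    1 means it starts at the C1 release, so smaller values indicate an
    earlier C2 start. The source may define the sign differently.
    """
    if c1.plateau <= 0:
        raise ValueError("C1 plateau duration must be positive")
    return (c2.onset - c1.target) / c1.plateau


if __name__ == "__main__":
    c1 = Gesture(onset=0.0, target=50.0, release=100.0)
    c2 = Gesture(onset=60.0, target=110.0, release=160.0)
    print(onset_lag(c1, c2), relative_overlap(c1, c2))
```

Because the second index divides by the C1 Plateau, a shorter plateau changes its value even when the raw timing of C2 is unchanged, which is why a lag in milliseconds is useful as a cross-check.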

Influences on overlap
Since the primary goal of the present study was to investigate the potential influence of relative rapidity of articulation on overlap, other known influences on overlap in clusters had to be taken into account. We summarize below some known influences on overlap in general, as well as some specific findings from Gafos et al. (2010), who reported on cluster overlap from a different subset of the data from two of the speakers analyzed in the present study.
Word-initial clusters are typically less overlapped than word-medial, intervocalic clusters. This effect has been observed for English (Byrd, 1996; Hardcastle, 1985), Tsou (Wright, 1996), and Georgian (Chitoran et al., 2002). There are two perception-based motivations for this effect given by Chitoran et al. (2002). First, word-initial clusters are less overlapped because they may also be utterance-initial and therefore lack a preceding vowel to provide cues to help in the identification of the first consonant in the cluster (see Redford & Diehl, 1999, for more discussion). Second, lexical access relies heavily on word-initial phonetic detail (Marslen-Wilson, 1987). Gafos et al. (2010) compared overlap in stop-stop clusters for two MA speakers. Word position effects were found for Speaker 1 in that word-initial clusters were significantly less overlapped than word-medial and word-final clusters. The overlap in word-medial clusters was not significantly different from word-final clusters. Speaker 2 showed no word position effects.
Previous studies (Byrd, 1992, 1996; Chitoran et al., 2002; Hardcastle & Roach, 1979; Son, 2008; Surprenant & Goldstein, 1998; Zsiga, 1994) have shown that front-to-back clusters (that is, clusters where the place of articulation of C1 is anterior to that of C2, e.g., [tk], [pt]) tend to show more overlap on average than back-to-front clusters (e.g., [kt], [tp]). The proposed explanation for this effect is perceptually based: Starting the C2 closure of a back-to-front cluster too soon will hide the acoustic information needed for the listener to identify C1, whereas this is not true for front-to-back clusters. Results from other studies (Chitoran & Goldstein, 2006; Kühnert, Hoole, & Mooshammer, 2006; Zeroual, Hoole, Gafos, & Esling, 2014) have called into question the generality and/or the perceptual motivation of the place order effect. To illustrate, Gafos et al. (2010) analyzed only stop-stop clusters within each word position and speaker, and found that Speaker 2 showed significant place order effects consistent with findings in other studies: Front-to-back stop-stop clusters were significantly more overlapped than back-to-front clusters word-initially and word-medially. There was no significant difference word-finally. The overlap patterns for Speaker 1 were significantly different both from what had been reported in other studies and from Speaker 2. In particular, contrary to the expected place order effect, Speaker 1 had significantly more overlap for back-to-front clusters, both as a main effect (across all word positions) and word-initially. Gafos et al. (2010) concluded that this was related to the fact that Speaker 1 had less cluster overlap than Speaker 2.
Such results along with questions about the validity of an unqualified or universal place order effect in previous studies across languages and speakers led them to propose that place order effects are in fact relativized: If there are two environments (say, two different speakers or two different phonetic contexts) in which place order effects may arise and they arise in only one, the place order effect will be found in the environment where there is more overlap. The reasoning behind the hypothesis is that it is only in the more overlapped environment where recoverability is at stake and thus the effect should emerge; in a less overlapped environment, recoverability of acoustic information is less threatened and therefore the place order effects on overlap need not come into play.
Lastly, the amount of overlap in clusters may also be to some degree idiosyncratic to individual speakers. Gafos et al. (2010) showed that there was a significant difference between speakers in the amount of overlap in stop-stop clusters, with Speaker 1 having less overlapped stop-stop clusters than Speaker 2. This held overall and within each word position.
The above influences on overlap were therefore taken into consideration in evaluating the hypotheses tested in the present study, both in the selection of stimuli and in the design of the statistical models (see the Methods section for further details).
Verifying the validity of H-J2, H-3 PV, or H-3 stiffness would be useful in at least two ways. First, these hypotheses make predictions about overlap in cases where other hypotheses in the literature about overlap make no predictions. Consider, for example, word-medial [ɡd] and [ɡb] clusters: Neither the word position nor the place order hypothesis makes any prediction about the expected amount of articulatory overlap, since the clusters are in the same word position and are both back-to-front. However, H-J2 predicts more overlap for [ɡb] than for [ɡd]. The two formulations of H-3 would also make predictions about overlap in these clusters, based not on the articulator but rather on the relative peak velocity or relative stiffness of the closing movements of C1 and C2.
Second, overlap is a continuum regardless of whether measure (3) or (4) is used, and there are multiple influences on the amount of overlap that is observed in any given cluster. The hypotheses tested here may account for cases where observed articulatory overlap runs counter to these other (word position and place order) hypotheses. For example, if [bd] clusters are observed to have less overlap than [db] clusters, this is not expected based on the place order hypothesis, but it could be accounted for by one of the present hypotheses. This could potentially explain the nature of the distributions of overlap more adequately, rather than unpredicted cases being viewed as exceptions or arising from noise.

Speakers
Six native speakers of the Oujda dialect of Moroccan Arabic (spoken in Northeast Morocco, near the Algerian-Moroccan border) were recorded. The speakers (one female, five male) ranged in age from 25 to 38. All speakers provided written informed consent to take part in the experiment and were paid for their participation. The experimental procedures were approved by the Ethics Committee of Ludwig-Maximilians-Universität.

3D Electromagnetic Articulography
The movement of speech articulators was tracked with 3D EMA at 200 Hz sampling rate. Concurrent audio recordings sampled at 24 kHz were made. Recordings were made at the EMA lab in the Institut für Phonetik und Sprachliche Kommunikation (now the Institut für Phonetik und Sprachverarbeitung), Ludwig-Maximilians-Universität München, Germany. EMA receivers included for analysis in the current study were attached to the speaker's lower lip, tongue tip, and tongue body. EMA receivers were also affixed at the nasion, right and left mastoid, and upper incisor to allow for head correction of the movement of the sensors used in analyses. The data were processed following Hoole and Zierdt (2010).

Stimuli
There are two types of consonant sequences in MA: clusters with no inter-consonantal vowel (CC, e.g., [kbaʃ]) and sequences of two consonants where an optional schwa-like vocoid can appear between the two consonants (CˆC, e.g., [kˆbda]). The precise nature of this vocoid and the phonology that accounts for it is a matter of some debate (cf. Dell & Elmedlaoui, 2002; Gafos et al., 2010; Heath, 1987). CˆC sequence types were therefore not included in the present study: Only unambiguous CC clusters were analyzed. Another consideration in selecting stimuli for the present experiment is that Gafos et al. (2010) found that word-medial clusters were significantly more overlapped than word-initial clusters. The same study also found that the place order effect on overlap was more likely to be found in contexts where clusters are otherwise more overlapped than in those where they are less so. If the same is true for the effects of differences in peak velocity and/or measured stiffness of the consonants in a cluster, these effects should most likely be observed in word-medial clusters. Therefore, all stimuli were real words containing two-consonant, word-medial clusters, where both the preceding and following vowel were [a]. Restricting the stimuli to word-medial clusters allowed us to control for word position rather than increase the complexity of our analyses by including another predictor.
The data analyzed in the present study are a subset of articulatory recordings from three corpora that were collected for other experiments. The stimuli are shown in Table 2, broken down by corpus. The stimuli in corpus 1 were produced by Speaker 1. Those in corpus 2 were produced by Speakers 2 and 3. Those in corpus 3 were produced by Speakers 4, 5, and 6. Stimuli were presented on a computer screen in Arabic script, including diacritics indicating full vowels, within a carrier phrase. Articulatory and acoustic recordings were made as the speakers read these stimuli from the computer screen. The presentation order of all stimuli was pseudo-randomized for each speaker. All speakers produced each of their stimuli five times, such that there were always other stimuli intervening between any two productions of a given stimulus.

Results
A total of 566 productions were recorded. v̂ and k′ were calculated for both consonants in each token. Within each token, the differences between the v̂ and k′ values of C1 and C2 were calculated to determine the PV Difference and Stiffness Difference in each token, as in (5) and (6), meaning that positive values indicate that C1 had a greater v̂ or k′ value than that of C2, and negative values indicate that C1 had a smaller v̂ or k′ value than that of C2.
(5) PV Difference = v̂(C1) − v̂(C2)
(6) Stiffness Difference = k′(C1) − k′(C2)
The values for each token as output from MVIEW and calculated therefrom are available in the file indicated in Supplementary Material 1. Twenty productions (3.5% of the data) were removed due to having an outlier value (≥3 standard deviations away from the mean) of any of the following values: v̂ of C1, v̂ of C2, PV Difference, k′ of C1, k′ of C2, Stiffness Difference, Relative Overlap, or Onset Lag. These tokens were discarded on the assumption that these differences were due to measurement error, leaving 546 tokens for analyses. Each token was also classified by place order. Table 3 shows the number of tokens included for each speaker and place order.
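The token-level differences and the outlier criterion can be sketched in Python as follows. This is an illustration rather than the authors' script: the dictionary keys are hypothetical, and the sketch assumes that means and sample standard deviations are computed over the pooled data, a detail the text does not specify.

```python
import statistics


def add_differences(token):
    """Return a copy of the token with PV Difference and Stiffness Difference added.

    Positive values mean C1 had the greater peak velocity or measured stiffness.
    """
    token = dict(token)
    token["pv_diff"] = token["pv_c1"] - token["pv_c2"]
    token["stiff_diff"] = token["k_c1"] - token["k_c2"]
    return token


def drop_outliers(tokens, measures, n_sd=3.0):
    """Drop tokens with any measure at least n_sd standard deviations from its mean.

    Assumption: mean and sample SD are taken over all tokens pooled together.
    """
    stats = {}
    for measure in measures:
        values = [token[measure] for token in tokens]
        stats[measure] = (statistics.mean(values), statistics.stdev(values))

    def is_inlier(token):
        for measure in measures:
            mean, sd = stats[measure]
            # A measure with no variation cannot produce outliers.
            if sd > 0 and abs(token[measure] - mean) >= n_sd * sd:
                return False
        return True

    return [token for token in tokens if is_inlier(token)]
```

A typical call would first map `add_differences` over all tokens and then pass the full list of measure names, including the two overlap indices, to `drop_outliers`.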

Articulator-specific rapidity
First we tested H-J1, which posits that there should be inherent differences in the rapidity of articulations based on the primary oral articulator, with the tongue tip being fastest, lower lip being intermediary, and tongue body being slowest. Figure 4A shows the data relevant for H-J1 using peak velocity, and Figure 4B shows the data relevant for H-J1 using measured stiffness. The distributions are broken out by cluster position (C1 or C2) to take into consideration the potential effect of position on rapidity, and because the patterns of assimilation discussed by Jun (2004) pertain to cluster positions separately, as shown in Figure 1 (A-B versus C-D). The peak velocity of the closing movements showed very little difference by articulator when in C1 position, though numerically they did pattern as predicted by H-J1. Peak velocities were lower in general for C2 consonants, especially for lower lip movements, which were slower not just compared to lower lip movements in C1, but compared to the other two articulators in C2. This pattern is inconsistent with the prediction of H-J1. The numerical pattern for measured stiffness was consistent with the prediction of H-J1, and as with the peak velocities, stiffness values were lower in C2 position than in C1.
Two linear mixed-effects models (Baayen, Davidson, & Bates, 2008; Gelman & Hill, 2007) were fit to test whether the differences shown in Figure 4 were significant, using the lme4 package (Bates, 2005; Bates et al., 2020) for R (R Development Core Team, 2018). One model had peak velocity as its dependent variable and the other had measured stiffness. In both models, speaker and stimulus were modeled as random effects. The fixed effects in the models were articulator and cluster position (C1 or C2), as well as the interaction between the two. Each model included a random slope for articulator by speaker. Omnibus test statistics for the fixed effects (articulator, cluster position, and their interaction) were determined using a type III analysis of variance with Satterthwaite's method from the anova function of the stats package (R Development Core Team, 2018) in R. Post-hoc differences in articulator-cluster position combinations were assessed using estimated marginal means (Searle, Speed, & Milliken, 1980) using the emmeans package (Lenth, Singmann, Love, Buerkner, & Herve, 2019) for R. Degrees of freedom were calculated using the Kenward-Roger method. The R script used to fit these and all of the other mixed-effects models in the present study, as well as to generate all of the figures, is available in the file indicated in Supplementary Material 2.
Results of the ANOVA are shown in Table 4. There were no significant differences in peak velocity based on articulator, though cluster position and the interaction were significant. For measured stiffness, all three fixed effects were significant. Table 5 shows the results of the comparisons of all articulator-cluster position combinations, providing more insight into the results of the ANOVA shown in Table 4. Pairs with a Tukey-adjusted p value < 0.05 were deemed reliable and are shown in bold. For peak velocity, there were no reliable differences based on articulator in C1 position. In C2 position, the only reliable difference between articulators was that the lower lip had lower peak velocity than the tongue body. The peak velocity of the lower lip was significantly higher in C1 position than in C2, but there was no difference based on cluster position for the other two articulators. The only other significant difference was that the lower lip in C2 position had lower peak velocity than the tongue body in C1. There was no support for H-J1 based on peak velocity.
In C1 position, the tongue body had significantly lower measured stiffness values than either the tongue tip or lower lip. There were no significant differences between any articulators in C2 position. Similar to peak velocity, the lower lip had significantly lower stiffness values in C2 position than in C1. The tongue body also had lower measured stiffness than the other two articulators when it was in C2 position and the other articulators were in C1. The tongue body in C1 position also had significantly lower stiffness than the tongue tip in C2 position. There was therefore limited support for H-J1 based on the measured stiffness of the articulators: The tongue body had lower stiffness than the other two articulators, but only within C1 position or sometimes across cluster positions.
Interpreting the cross-position differences is complicated by the main effect of Cluster Position (with C2 having lower stiffness values). The significant effect of articulator in the ANOVA was therefore due to these differences between the tongue body and the other articulators, given that the predicted difference in measured stiffness between the tongue tip and lower lip was found neither within nor across cluster positions.

Overlap based on cluster type
In this section we test hypothesis H-J2. We have noted already that H-J2 is predicated on H-J1, and we have just shown that there is limited support for H-J1 in these data. Nevertheless, there is still a possibility that articulator-specific differences may be evident only when consonants are analyzed using measures that are within-token. Looking at within-token overlap within cluster types is the most direct test of hypothesis H-J2. Figure 5A and B show the Relative Overlap and Onset Lag, respectively, with the cluster types organized such that the amount of overlap should decrease from left to right, according to hypothesis H-J2. The patterns of overlap do not correspond to those expected based on H-J2, regardless of the index of overlap that was used. Most notably, the two tongue tip-initial cluster types have much less overlap than the two lower lip-initial cluster types, comparable to the amount of overlap with the two tongue body-initial cluster types, which were expected to have the least overlap. Apart from the tongue tip-initial clusters, overlap by cluster type patterns more or less as expected.
The reliability of the differences shown in Figure 5 was evaluated with two linear mixed effects models in which cluster type (with six levels) was the fixed predictor, one of Relative Overlap and the other of Onset Lag. Random intercepts were included for speaker and stimulus. Full pairwise comparisons of all cluster types would require statistical power beyond what is available in our data. Therefore, we set the cluster type tongue tip-tongue body as the reference level, since this cluster type was predicted to have the highest overlap according to H-J2 yet had numerically lower overlap than lower lip-initial clusters. Structuring the model this way allowed us to determine whether tongue tip-tongue body clusters had reliably more overlap than tongue tip-lower lip clusters (per H-J2), whether the greater overlap for the two lower lip-initial clusters was reliable (contra H-J2), and whether the tongue body-initial clusters had less overlap than the tongue tip-tongue body clusters (per H-J2). Place Order could not be included as a predictor because it is too highly correlated with cluster type.

Figure 5 (caption, beginning lost): [...] (see Figure 2) should be greatest for the leftmost type (tongue tip-tongue body) and decrease by type going rightwards, with tongue body-tongue tip clusters having the least overlap. Colors correspond to the primary oral articulator, with the background color indicating the C1 articulator and the color of the violin plot indicating the C2 articulator.

Table 6 shows that there was no reliable difference in overlap between tongue tip-tongue body and tongue tip-lower lip clusters (not supporting H-J2), and that the overlap for the two lower lip-initial clusters was reliably greater than for the tongue tip-tongue body clusters (contra H-J2), regardless of whether overlap was indexed by Relative Overlap or Onset Lag.
Both tongue body-initial clusters had less overlap than the tongue tip-tongue body clusters (per H-J2), but only when overlap was indexed by Onset Lag. There was no reliable difference when indexed by Relative Overlap.
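The motivation for using a reference level rather than full pairwise comparisons can be made concrete by counting the comparisons involved. The following is a minimal sketch; the abbreviations TT, LL, and TB (tongue tip, lower lip, tongue body) are our own shorthand introduced only for this illustration.

```python
from itertools import combinations, permutations

# Hypothetical abbreviations: TT = tongue tip, LL = lower lip, TB = tongue body.
articulators = ["TT", "LL", "TB"]

# Heterorganic cluster types: every ordered pair of two different articulators.
cluster_types = [f"{c1}-{c2}" for c1, c2 in permutations(articulators, 2)]

# Full pairwise design: every unordered pair of cluster types is compared.
all_pairs = list(combinations(cluster_types, 2))

# Reference-level coding: every other type is compared to tongue tip-tongue body only.
reference = "TT-TB"
reference_contrasts = [(reference, ct) for ct in cluster_types if ct != reference]

print("cluster types:", len(cluster_types))
print("full pairwise comparisons:", len(all_pairs))
print("comparisons against the reference level:", len(reference_contrasts))
```

With six cluster types, the reference-level design tests far fewer contrasts than the full pairwise design, which is why it places lower demands on statistical power.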

Effects of Peak Velocity Difference and Stiffness Difference
Lastly, we tested H-3 PV and H-3 stiffness, using both Relative Overlap and Onset Lag as indexes of overlap. The relationships between each difference measure (PV Difference and Stiffness Difference) and each index of overlap (Relative Overlap and Onset Lag) are shown in Figure 6. The black lines are linear regression lines across all cluster types. The colored lines are linear regression lines within cluster type.
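As an illustration of the two kinds of regression lines just described, the sketch below fits one least-squares line across all tokens and one line per cluster type. The token values and the short cluster labels are hypothetical and are not taken from the data of this study; the helper name `ols_line` is likewise our own.

```python
from collections import defaultdict

def ols_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical tokens: (cluster type, Stiffness Difference, overlap index).
tokens = [
    ("LL-TB", 1.0, 0.50), ("LL-TB", 2.0, 0.60), ("LL-TB", 3.0, 0.70),
    ("TT-TB", -1.0, 0.10), ("TT-TB", 0.0, 0.20), ("TT-TB", 1.0, 0.30),
]

# Line across all cluster types (the analogue of the black lines).
pooled = ols_line([t[1] for t in tokens], [t[2] for t in tokens])

# One line per cluster type (the analogue of the colored lines).
by_type = defaultdict(lambda: ([], []))
for cluster_type, diff, overlap in tokens:
    by_type[cluster_type][0].append(diff)
    by_type[cluster_type][1].append(overlap)
within = {ct: ols_line(xs, ys) for ct, (xs, ys) in by_type.items()}

print("pooled (slope, intercept):", pooled)
for ct, line in within.items():
    print(ct, "(slope, intercept):", line)
```

In this toy data set the within-type lines share a slope but differ in intercept, so the pooled slope differs from the within-type slope. This is one reason to inspect both kinds of lines.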
Four linear mixed effects models were fit to the data, one for each relationship shown in Figure 6. Speaker and stimulus were modeled as random effects. The fixed effects were Peak Velocity Difference or Stiffness Difference (both continuous), and Place Order (categorical). Cluster type was not included as a predictor, as the models would not converge if it was added. Since Gafos et al. (2010) found Place Order effects were speaker-dependent for a subset of the speakers included in the present study, speaker-specific random slopes for Place Order were also included in the model. The results of all four models are shown in Table 7.
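One conventional way to write the structure just described is given below for the model with Stiffness Difference as the continuous predictor. The notation is ours and is only a sketch: the indices (speaker i, stimulus j, token k), the symbol names, and the assumption of crossed random intercepts for speaker and stimulus are illustrative choices, not a statement of the exact specification used.

```latex
\begin{equation*}
\mathrm{Overlap}_{ijk}
  = \beta_0
  + \beta_1\,\mathrm{StiffnessDiff}_{ijk}
  + \beta_2\,\mathrm{PlaceOrder}_{ijk}
  + u_{0i}
  + u_{1i}\,\mathrm{PlaceOrder}_{ijk}
  + w_{0j}
  + \varepsilon_{ijk}
\end{equation*}
```

Here the beta terms are the fixed effects, u_{0i} and w_{0j} are the random intercepts for speaker and stimulus, u_{1i} is the speaker-specific random slope for Place Order, and epsilon is the residual. In this notation, Overlap stands for either Relative Overlap or Onset Lag, and the models for Peak Velocity Difference would replace StiffnessDiff with that predictor.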
Results of the statistical models confirm that Stiffness Difference was a significant predictor of overlap, regardless of whether Relative Overlap or Onset Lag was used to index overlap, with overlap increasing as Stiffness Difference increased. These results provide strong support for hypothesis H-3 stiffness. On the other hand, no support was found for hypothesis H-3 PV: Peak Velocity Difference was not a significant predictor of overlap, regardless of whether Relative Overlap or Onset Lag was used to index overlap. Place Order was not a significant predictor of overlap other than in the model that included Peak Velocity Difference as a predictor of Onset Lag (Figure 6B/Table 7-B), where it predicted greater overlap in front-to-back clusters, as expected.

Table 7 (caption, beginning lost): [...] Figure 6). Significant effects are bolded.

Jun (2004) proposed that typological patterns of regressive place assimilation can be explained by differences in overlap in different types of clusters. According to this proposal, when clusters are more overlapped, acoustic information as to the place of the first consonant is diminished and place assimilation is more likely. Differences in overlap arise, according to this proposal, based on the 'inherent velocities' of the primary oral articulators involved in making the constrictions needed for the two consonants in a C1C2 cluster: In short, when the articulator making the C1 constriction is more rapid than the articulator making the C2 constriction, there is more overlap than when the order of the rapidity of the articulators is the reverse. Jun (2004) assumes that the tongue tip has the highest inherent velocity, the tongue body has the lowest, and the lower lip is intermediate between the two lingual articulators. In the present study, we tested two hypotheses that are assumed in the proposal made by Jun (2004). The first was that articulators have 'inherent velocities.'
The second, which is predicated on the first being true, was that overlap in a cluster will be modulated systematically by the specific combinations of articulators involved in making the consonantal constrictions in the cluster. We also tested a third hypothesis, as an alternative to the second, that overlap is modulated by the relative rapidity of the articulations involved, but that this relative rapidity is not determined by the specific articulators. Support for the first hypothesis was limited, and discernible only when the rapidity of the articulators was quantified as measured stiffness. The only consistent, significant, articulator-related difference was that tongue body articulations had significantly lower measured stiffness (but not peak velocity) than lower lip and tongue tip articulations in C1 position and across cluster positions. This finding is consistent with the proposal by Jun (2004). However, there were no significant differences between the lower lip and tongue tip within or across cluster positions. This lack of difference is problematic for the proposal by Jun (2004), in which it is crucially assumed that tongue tip articulations are more rapid than those of both of the other articulators.

Given that there was only limited support for the first hypothesis, it was not surprising that overlap was not predicted correctly by the second hypothesis. The most relevant prediction of this hypothesis for the account of Jun (2004), namely that the two tongue tip-initial cluster types should show the most overlap, was undermined by significant results in the opposite direction. While the rest of the overlap patterns (i.e., those within and between lower lip-initial and tongue body-initial clusters) were more or less consistent with the second hypothesis, the exception of the tongue tip-initial clusters is deeply problematic for the account proposed by Jun (2004).
We found significant support for the third hypothesis we tested, i.e., the amount of overlap in a cluster was predicted significantly by the difference in rapidity of the specific articulations in a given token. This was true regardless of whether overlap was indexed by Relative Overlap or Onset Lag. However, this was only true when the rapidity of an articulation was quantified by stiffness. When the rapidity of an articulation was quantified by peak velocity, the difference in peak velocity did not predict either measure of overlap.
Accounting for the variation in overlap with Stiffness Difference (see Figure 6C-D), which has continuous values, can explain much more of the data than accounts that appeal to categorical predictors like cluster type (i.e., the specific combination of articulators involved). Firstly, although there were some significant differences in overlap based on cluster type (see Figure 5 and Table 6, in addition to the different intercepts by cluster type visible in Figure 6), the relationship between Stiffness Difference and both indexes of overlap held across cluster types (although the significance of this observation could not be tested statistically, as noted in Section 3.3). While an account based on cluster type could possibly be constructed to explain the significant differences found by cluster type, such an account would be unable to explain the fact that the relationship between Stiffness Difference and overlap holds for tokens within the same cluster type. Another aspect of the relationship between Stiffness Difference and overlap is that it captures the fact that it is the stiffness of C1 relative to that of C2 that is important, not the categorical distinction of whether C1 has higher measured stiffness than C2 or vice versa. This can be illustrated by looking at the lower lip-tongue body data in Figure 6C-D (represented by the highest regression lines in both sub-figures). The Stiffness Differences of all tokens for this cluster type were positive, meaning that C1 always had higher stiffness than C2, and the relationship between Stiffness Difference and overlap was the same as for cluster types whose tokens encompassed both positive and negative values. An analysis that relied on categorizing clusters based on stiffness 'order' (i.e., indicating whether C1 or C2 has the higher value of measured stiffness) would not be able to capture this fact.
Lastly, the fact that Peak Velocity Differences and Stiffness Differences within each cluster type spanned a wide range of negative and positive values further undermines the first hypothesis tested here. If the rapidity of a gesture were determined predominantly by its main oral articulator, then it would be expected that the difference in values within a given cluster type would be predominantly on one side or the other of the zero point for peak velocity and/or stiffness difference. However, only lower lip-tongue body clusters patterned this way, and only for Stiffness Difference. The other five cluster types did not pattern this way for either difference measure.
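The check described in this paragraph, whether the differences within a cluster type fall on one side of zero, can be sketched as follows. The values below are hypothetical and are not the data of this study. The sign convention (C1 minus C2, so that a positive value means C1 has the higher stiffness) follows the text, while the function name `share_positive` is our own.

```python
def share_positive(differences):
    """Fraction of tokens whose C1-minus-C2 difference is above zero."""
    return sum(d > 0 for d in differences) / len(differences)

# Hypothetical Stiffness Differences (C1 minus C2), grouped by cluster type.
stiffness_diff = {
    "lower lip-tongue body": [0.8, 1.5, 2.1, 0.3],
    "tongue tip-lower lip": [-1.2, 0.4, -0.1, 0.9],
}

for cluster_type, diffs in stiffness_diff.items():
    share = share_positive(diffs)
    # A share of 0.0 or 1.0 means every token lies on the same side of zero.
    one_sided = share in (0.0, 1.0)
    print(f"{cluster_type}: {share:.2f} positive, one-sided = {one_sided}")
```

If rapidity were set mainly by the articulator, most cluster types would be expected to look like the first entry; a share near one half, as in the second entry, is the pattern that argues against articulator-determined rapidity.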
The lack of a result for the Peak Velocity Difference may at first seem at odds with the significant result for Stiffness Difference, but these results are compatible with each other. Looking at Figure 1C-D, Jun's (2004) proposal predicts less overlap in C compared to D due to a 'more rapid' C2 articulator movement in C. However, these schematic drawings do not take articulator displacement into account. As discussed above, the peak velocity of an articulator is known to covary with the articulator's displacement (e.g., Guenther, 1995; Kent & Moll, 1976; Munhall et al., 1985). In other words, the peak velocity of the articulator increases as the distance that the articulator has to travel increases. If we think of the schemas in Figure 1C-D as representing articulator movements through space as well as time with the C2 movement in Figure 1C traveling less far than the one in Figure 1D, we expect that the articulator movement in Figure 1C should have a lower peak velocity than the one in Figure 1D. Viewed this way, the schemas in Figure 1C and D are then consistent with the results of this study since the clusters where C2 had higher peak velocity than C1 (Figure 1D) had more overlap than when C2 had lower velocity than C1 (Figure 1C). At the same time, a C2 gesture like the one shown in Figure 1C could have a higher measured stiffness value than the C2 gesture in Figure 1D and yet still have a lower peak velocity than the C2 gesture in Figure 1D. If the C2 gesture in Figure 1C does not have to go as far as the C2 gesture in Figure 1D to achieve its target, the measured stiffness values factor out the confounding influence of displacement.
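The dissociation between peak velocity and measured stiffness can be illustrated with a small worked example. It assumes that measured stiffness is indexed as peak velocity divided by displacement, a common operationalization that is not restated in this section. The function name, the units, and all numerical values are hypothetical and chosen only for illustration.

```python
def measured_stiffness(peak_velocity, displacement):
    """Assumed index of stiffness: peak velocity divided by displacement."""
    if displacement <= 0:
        raise ValueError("displacement must be positive")
    return peak_velocity / displacement

# Hypothetical C2 closing gestures (peak velocity in cm/s, displacement in cm).
c2_short = {"peak_velocity": 12.0, "displacement": 0.6}  # short movement, as in Figure 1C
c2_long = {"peak_velocity": 18.0, "displacement": 1.2}   # long movement, as in Figure 1D

k_short = measured_stiffness(**c2_short)
k_long = measured_stiffness(**c2_long)

print("short movement: peak velocity", c2_short["peak_velocity"], "stiffness", k_short)
print("long movement:  peak velocity", c2_long["peak_velocity"], "stiffness", k_long)
```

In this example the shorter movement has the lower peak velocity but the higher measured stiffness, so the two measures rank the same pair of gestures in opposite orders.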

Conclusion
The results from the present study have several implications for the proposal of Jun (2004).
One key insight of that proposal is that the amount of overlap in a consonant cluster is a function of the relative rapidity of the articulations of those two consonants. Our results provide strong support for this aspect of Jun's proposal, but only when rapidity was indexed by the more abstract dynamical parameter of stiffness of the closing gestures of the two articulations in the cluster. Overlap was not predicted in the same clusters when the difference in the peak velocities themselves was used. The peak velocities of the articulations seemed to be a function of both the stiffness settings of the gestures and the distance that the articulators needed to travel to achieve their targets. We conclude that the settings of the stiffness control parameter associated with each articulatory gesture in a cluster, not the peak velocity of each articulator movement, contribute to differences in overlap.
The other aspect of the proposal by Jun (2004) is that typological patterns of the triggers and targets of regressive place assimilation can be explained by inherent differences in rapidity across articulators. However, attributing differences in stiffness (or peak velocity) in a cluster based on the articulators involved seems implausible given the present results. We found limited support for systematic differences in stiffness values or peak velocities of articulations based on primary oral articulator. The amount of overlap in a cluster was also not predicted well by the specific combination of articulators in the cluster, and in fact patterned opposite to expectation in some crucial cases.
The present results do not change the typological generalizations noted by Jun (2004), but they do undermine the phonetic explanation that Jun (2004) proposes for them. The present results should be interpreted with some caution, since they come from one language. Nevertheless, the assumptions of Jun (2004) concerning both the 'inherent velocities' of articulators and the proposed effect that those inherent velocities have on overlap are cast by Jun (2004) as language-independent, which they would have to be in order to provide an adequate account of typological patterns. It is therefore reasonable to test those assumptions in any language. As we discussed in the Introduction, Moroccan Arabic is very useful in this regard given the rich combinations of consonants in this language. Differences in overlap may ultimately be part of an explanation for observed typological patterns of assimilation. Our results strongly suggest that further research with the goal of providing such an explanation should include the role of the dynamical control parameter of stiffness on overlap, since stiffness seems to stand at the right level of abstraction from the surface kinematics, which are highly context-dependent.

Additional Files
The additional files for this article can be found as follows:

Chakir Zeroual, as a native speaker of Moroccan Arabic, was the primary consultant for all questions concerning the language, especially in designing the stimuli for the experiment. He was Speaker 1. Shihao Du labeled the data that identified the consonantal gestures from the three speakers in corpus 3. Adamantios Gafos designed the stimuli for the experiments, assisted in the collection of the data, devised the logic underlying the third hypothesis tested here, and was extensively involved in the writing of the manuscript. This study evolved from extensive discussions between KDR and AIG.