Speakers do not just produce canonical forms. Rather, they can manipulate phonetic cues in a given utterance to increase or decrease perceptual distances between competing words or syllables (H&H model by Lindblom, 1990). Thus, speakers systematically vary between more and less distinct articulation within each utterance, phrase, or even within a word. If perceptual distances are increased, the associated articulatory resource costs also increase, leading to a more distinct production of segments, syllables, or words (Liberman & Mattingly, 1985; De Jong, 1995; Harrington, Fletcher, & Beckman, 2000; Cho, 2005; Baese-Berk & Goldrick, 2009; Farnetani & Recasens, 2010; Scarborough, 2013; Mücke & Grice, 2014; Nelson & Wedel, 2017). The articulatory low-cost behaviour of the speech system leads to an increase in overlap between articulatory gestures and therefore to a higher degree of coarticulation, which is related to hypoarticulated speech. In contrast, the high-cost behaviour of the articulatory speech system, i.e., hyperarticulated speech, leads to a decrease in coarticulatory overlap and therefore to a more distinct articulation which enhances distances in the perceptual space. Both strategies affect temporal and spatial properties of articulatory speech movements and the related acoustic output. Thus, we always deal with surface patterns which are affected by prosodic variation, segmental effects, or speaker-specific behaviour (inter alia Mücke, Hermes & Cho, 2017; Hermes, Mücke, & Auris, 2017; Gafos, Charlow, Shaw, & Hoole, 2014). However, the mediation between the linguistic and physical control systems should imply efficient timing patterns which either increase prosodic/paradigmatic contrast or decrease the costs of the physical control system. Patterns that increase the costs of the physiological system but at the same time do not contribute to the functions of the linguistic system cannot be directly framed within the H&H model.
From this perspective, these deviant patterns are inefficient and expected to occur in pathological speech, such as dysarthria (Duffy, 2013; Ziegler & Vogel, 2010).
This study applied a dynamical approach to investigate syllable coordination patterns in the speech of Essential Tremor (ET) patients treated with Deep Brain Stimulation (DBS). This dynamic approach (within the framework of Articulatory Phonology) allows us to account for possible gradient changes in phonetic surface structures. We analyzed syllable coordination patterns for nine ET patients with inactivated and activated stimulation (DBS-OFF and DBS-ON) of the ventral intermediate nucleus (VIM) and compared them to an age-matched, healthy control group. The recordings were carried out with an electromagnetic articulograph. In this study, we focused on the timing patterns among gestures in syllables with low and high complexity, CV and CCV.
In Articulatory Phonology (Saltzman, 1986; Saltzman & Kelso, 1987; Saltzman & Munhall, 1989; Browman & Goldstein, 1989, 1992), the basic units of speech production are dynamically defined articulatory gestures, which can be modelled as a constellation of invariant functional units of vocal tract constricting actions, such as the full closure of the tongue tip at the alveolar ridge to produce the speech sound /t/ (Saltzman & Munhall, 1989; Browman & Goldstein, 1988, 1992). The model integrates phonology and phonetics, i.e., low-dimensional and high-dimensional descriptions, in a unified system (Gafos & Benus, 2006). Within this model, variation in speech production can be modelled in terms of hyper- and hypoarticulated speech, constantly mediating between the demands of the physical control system and linguistic structure, e.g., prosodic head marking. Changing the values of a gesture’s parameter set changes the temporal and/or spatial properties of the physical, articulatory action and therefore the acoustic outcome (Saltzman, 1986; Browman & Goldstein, 1989, 1992; Saltzman & Kelso, 1987; Edwards, Beckman, & Fletcher, 1991). The model of task dynamics (Fowler, Rubin, Remez, & Turvey, 1980; Saltzman & Munhall, 1989) implies several parameter modifications such as (i) target, (ii) stiffness, and (iii) phasing that systematically induce variation (Hawkins, 1992; Mücke, 2018). Since gestures are goal-directed movements, each gesture has a target. Note that there are multiple ways of achieving a desired motor goal, also in terms of multiple parameter modifications (Patri, Diard, & Perrier, 2015; Cho, 2006). This is an important aspect, especially when studying compensatory articulation in pathological speech.
The following parameter changes are briefly outlined here: (i) A change in the underlying target involves changes in the peak velocity in proportion to the target value; an undershoot leads to smaller movements with lower peak velocities; (ii) Stiffness is an abstract control parameter related to the relative speed of the movement. Decreasing a gesture’s underlying stiffness leads to slower and longer movements; the target is achieved later; (iii) Phasing affects the overlap between two gestures and can lead to variation. When a consonantal gesture is timed earlier with respect to another gesture, the overlap between the gestures increases and the preceding consonantal gesture will be truncated (Saltzman & Kelso, 1987; Harrington, Fletcher & Roberts, 1995; Cho, 2006; Iskarous & Kavitskaya, 2010). Thus, the truncated gesture becomes shorter (especially the deceleration phase), leading to a target undershoot, while the peak velocity remains the same.
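The effects of target and stiffness in (i) and (ii) can be illustrated with a minimal sketch, assuming that a gesture behaves like a critically damped unit-mass spring that starts at rest at position 0 (a common simplification in task dynamics, not necessarily the exact parametrization used by the cited authors). Under this assumption the position is x(t) = T(1 − (1 + ωt)e^(−ωt)) with ω = √k, and the peak velocity is Tω/e. The function names, units, and parameter values below are hypothetical.

```python
import math

def position(t, target, stiffness):
    """Position of a critically damped unit-mass gesture starting at rest at 0."""
    omega = math.sqrt(stiffness)
    return target * (1.0 - (1.0 + omega * t) * math.exp(-omega * t))

def peak_velocity(target, stiffness):
    """Peak velocity T * omega / e, reached at t = 1 / omega."""
    return target * math.sqrt(stiffness) / math.e

def time_to_fraction(fraction, stiffness, t_max=10.0):
    """Time at which the gesture has covered `fraction` of its target (bisection)."""
    lo, hi = 0.0, t_max
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if position(mid, 1.0, stiffness) < fraction:
            lo = mid
        else:
            hi = mid
    return hi

# (i) Halving the target halves the peak velocity (arbitrary units).
pv_full = peak_velocity(target=1.0, stiffness=100.0)
pv_half = peak_velocity(target=0.5, stiffness=100.0)

# (ii) Lowering the stiffness makes the movement slower and longer.
t_stiff = time_to_fraction(0.95, stiffness=100.0)
t_lax = time_to_fraction(0.95, stiffness=25.0)
print(pv_full, pv_half, t_stiff, t_lax)
```

In this toy system, reducing the stiffness by a factor of four doubles the time needed to cover 95% of the distance to the target, while the target value only scales the movement amplitude and peak velocity. Truncation through phasing (iii) is not modelled here.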
The complex interplay between prosodic structure, segmental context, and phonological syllable parse affects the timing patterns of consonant and vowel articulation (Mücke et al., 2017; Hermes et al., 2017). Prosodic structure involves inter alia prosodic head marking and therefore often leads to hyperarticulation of the accented syllable to enhance prominence (especially in contrastive focus condition), while other syllables might be reduced. Furthermore, the segmental context influences the overlap between neighbouring segments depending on the degree of coarticulatory resistance (Recasens, Pallarès, & Fontdevila, 1997; Farnetani & Recasens, 2010). The degree of variability of consonant and vowel timing is constrained by intrinsic syllable timing patterns of the phonological system. There is a difference between languages allowing for complex onset coordination patterns (e.g., branching onsets for CCV in English and Italian; Marin & Pouplier, 2010; Hermes, Mücke, & Grice, 2013) and those that restrict syllable coordination to simple onset coordination (e.g., C.CV for Tashlhiyt Berber, Goldstein, Chitoran, & Selkirk, 2007; Hermes, Ridouane, Mücke, & Grice, 2011). These differences are part of the speakers’ linguistic knowledge and can be modelled within the framework of nonlinear planning oscillators, described in the following section.
Previous research has shown that it is possible to diagnose distinct phonological syllable parses on the basis of timing patterns between consonants and vowels at the syllable level (Browman & Goldstein, 2000). While the cluster /fl/ in a word like <flat>, for example, is parsed as tautosyllabic in American English, the same cluster /fl/ in a word like <flan> (‘someone’) in Moroccan Arabic is parsed as heterosyllabic (Shaw & Gafos, 2015). Those syllable timing relations are captured by the degree of overlap between consonants and vowels in CV versus CCV sequences. Languages which are assumed to have complex syllable onset parses such as American English (Browman & Goldstein, 1988; Marin & Pouplier, 2010) and Italian (Hermes et al., 2013) show that the prevocalic consonant is shifted towards the following vowel to make room for the added consonant, leading to an increase of CV overlap in complex syllables. Such an increase in CV overlap is not observed in languages with simple onset coordination such as Moroccan Arabic (Shaw, Gafos, Hoole, & Zeroual, 2011) or Tashlhiyt Berber (Goldstein et al., 2007; Hermes et al., 2011). As pointed out above, these distinct syllable timing patterns are also affected by variation induced by factors such as language system, prosodic head marking, or segmental make-up (Shaw et al., 2011; Brunner, Geng, Sotiropoulou, & Gafos, 2014; Hermes et al., 2017).
Syllable structure can be modelled in terms of a self-organizing system, a model of nonlinear planning oscillators, capturing regularity and variability on different levels of linguistic description. In such a model, each gesture is associated with an oscillator (or clock or temporal trigger) and these oscillators are coupled to one another in a pairwise, potentially competing fashion (described in coupling graphs). Two intrinsic coupling modes are assumed: in-phase (0° phase transition; the associated movements start at the same time) and anti-phase (180° phase transition; the associated movements start sequentially). Figure 1 schematizes the coordination of the syllables /pi/, /li/, and /pli/ on three different levels of abstraction: (i) the articulatory trajectories of lip and tongue movements, (ii) the gestural score involving activation intervals for each movement (each box schematizes the interval from start to target of an activated gesture), and (iii) the coupling graphs encoding the coupling modes between oscillators associated with the gesture.
In a simple CV syllable, such as /pi/ in <Pina> (a girl’s name) or /li/ in <Lima> (capital of Peru), the C and V movements start at the same time (simple onset coordination; Figure 1, first two rows). The vocalic gesture is less stiff (relatively slower) and therefore longer in duration. This is also schematized in the composition of the gestural activation intervals (Figure 1, gestural scores). Even though both movements are activated at the same time, the vocalic gesture takes longer to reach its target. On the acoustic surface, we get the impression of a CV sequence. In a coupled oscillator model, the simultaneous initiation of the C and the V gesture is represented by an in-phase coupling mode of C and V (Figure 1, coupling graph on the right), a simple coupling structure.
In a complex CCV syllable, such as /pli/ in <Plina>, it is assumed that both consonants are adjusted in relation to the vowel. When adding a C to a CV syllable resulting in CCV, the prevocalic C shifts towards the following V to make room for the added C. In the example in Figure 1 (bottom row; trajectories and gestural scores) there is an increase of overlap between /l/ and /i/ in /plina/ compared to /lima/. In contrast, the initial C shifts away from the following vowel, leading to a decrease in overlap between /p/ and /i/. In terms of coupled oscillators (Figure 1, coupling graphs), we deal with a complex coupling structure: Both consonants are coupled in-phase with the V gesture, since both consonants are part of the syllable onset. At the same time, the two consonants are also coupled in anti-phase with each other for perceptual recoverability.
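The difference between the simple and the competitive coupling structure can be sketched with three phase oscillators. The following is a toy illustration only: it assumes sinusoidal coupling functions, equal coupling strengths, and an arbitrary common frequency, none of which are taken from the study. All names and values are hypothetical.

```python
import math

def wrap_deg(x):
    """Convert radians to degrees wrapped to the interval (-180, 180]."""
    d = math.degrees(x) % 360.0
    return d - 360.0 if d > 180.0 else d

def settle_cv(a=1.0, omega=2 * math.pi, dt=0.001, steps=20000):
    """CV: C and V coupled in-phase. Returns the settled C-V relative phase."""
    c, v = 0.5, 0.0
    for _ in range(steps):
        d_c = omega - a * math.sin(c - v)
        d_v = omega + a * math.sin(c - v)
        c, v = c + dt * d_c, v + dt * d_v
    return wrap_deg(c - v)

def settle_ccv(a=1.0, b=1.0, omega=2 * math.pi, dt=0.001, steps=20000):
    """CCV: C1-V and C2-V in-phase (strength a), C1-C2 anti-phase (strength b).

    Returns the settled relative phases (C1 - V, C2 - V) in degrees.
    """
    c1, c2, v = 0.2, -0.2, 0.0  # small offsets so the Cs can move apart
    for _ in range(steps):
        d_c1 = omega - a * math.sin(c1 - v) - b * math.sin(c2 - c1)
        d_c2 = omega - a * math.sin(c2 - v) + b * math.sin(c2 - c1)
        d_v = omega + a * math.sin(c1 - v) + a * math.sin(c2 - v)
        c1, c2, v = c1 + dt * d_c1, c2 + dt * d_c2, v + dt * d_v
    return wrap_deg(c1 - v), wrap_deg(c2 - v)

cv_phase = settle_cv()
c1_phase, c2_phase = settle_ccv()
print(cv_phase, c1_phase, c2_phase)
```

With equal coupling strengths, neither the in-phase nor the anti-phase target is fully met in the CCV case: the two consonant oscillators settle symmetrically on either side of the vowel oscillator, 120° apart from each other, so that their midpoint stays aligned with the vowel. This is the compromise that the competitive coupling structure is meant to capture.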
Even though two different phonological syllable parses (i.e., simple and complex onset parse) are assumed to be triggered by the speakers’ grammatical knowledge, we expect to find naturally-induced variation in the phonetic surface patterns. As mentioned before, this type of variation can be systematically triggered by factors such as prosodic structure, segmental context, and speaker-specificity (Mücke et al., 2017; Hermes et al., 2017; Gafos et al., 2014). The challenge of flexibility and stability is important for language production and it can be found within and across languages. For example, in German, it has been shown that there is complex syllable organization. However, the literature has reported timing differences in these syllable coordination patterns (cf. Pouplier, 2012; Brunner et al., 2014). When applying the established articulatory measures (C-center measure) to detect syllable structure, Brunner et al. (2014) found that a pattern such as /bl/ shows the expected rightward shift of the prevocalic C towards the V (increase of overlap between C and V), while this shift was not present for /pl/. However, this does not mean that the clusters /pl/ and /bl/ do not share the same phonological syllable parse. Rather, the rightward shift of the prevocalic C was somehow blocked in /pl/ on the phonetic surface. In a recent study, Gafos, Roeser, Sotiropoulou, Hoole, and Zeroual (2019) attribute differences in segmental make-up for the same underlying phonological organization to pleiotropic organization of prosody. This means that a certain prosodic structure can be expressed by more than one phonetic exponent allowing for lawful flexibility in a multi-dimensional way.
Essential Tremor (ET) is the most common movement disorder (Haubenberger & Hallett, 2018). Clinically, it surfaces with a bilateral upper limb action tremor. In some cases, tremor may occur in other locations of the body (such as head, voice, or lower limbs), and additional symptoms may occur such as impaired tandem gait (ataxia), dystonic posturing, memory impairment, or rest tremor. Recently, due to the heterogeneity of ET symptoms, a new classification of ET was introduced. Here, a differentiation between ET and ET plus was proposed (MDS consensus criteria, Bhatia et al., 2018). ET and ET plus are usually treated with beta-blockers such as propranolol or with antiepileptic drugs (e.g., primidone and topiramate). When medication fails or is not tolerated by the patient, chronic deep brain stimulation (DBS) of the nucleus ventralis intermedius (VIM) of the thalamus (Flora, Perera, Cameron, & Maddern, 2010) or the posterior subthalamic area (PSA) (Barbe et al., 2018) has been shown to be an effective treatment option (see Figure 2). While tremor suppression may reach values of over 90%, some patients report that DBS has a deleterious effect on their speech production, thus impacting their quality of life and social functioning (Flora et al., 2010; Barbe et al., 2014; Mücke et al., 2014; Mücke et al., 2018).
So far, only a few studies have investigated the stimulation-induced deterioration in speech. All of them focused on fast syllable repetition tasks (oral diadochokinesis, DDK) to detect signs of dysarthric speech, but not on natural sentence production. In an acoustic analysis of DDK tasks, Mücke et al. (2014) reported on coordination problems of glottal and oral control in German ET patients treated with DBS. Under stimulation, patients produced fewer voiceless intervals, indicating a weakening of the glottal abduction gesture during the entire syllable cycle. Furthermore, they found imprecise oral articulation under stimulation. Patients produced incomplete oral closures leading to spirantization of the stop consonants on the acoustic surface (see also Pützer, Barry, & Moringlane, 2007 for German Multiple Sclerosis patients treated with DBS).
In a follow-up study, Mücke et al. (2018) investigated the effects of DBS on ET patients in the articulatory dimension in fast syllable repetition tasks, using electromagnetic articulography to track the articulatory movements of the tongue and lips directly. They found that critical changes in speech dynamics occur on two levels: (i) With inactivated stimulation (DBS-OFF), the patients showed coordination problems in terms of imprecision and slowness. Compared to healthy controls, ET patients produced longer, slower, and more displaced consonantal movements. However, the consonantal movements in the production of the ET patients were imprecise, showing, e.g., spirantization on the acoustic surface due to incomplete closures during the intended stop consonants, which reveals problems in coordination. (ii) Under activated stimulation (DBS-ON), these problems became stronger and were accompanied by an additional overall slowing-down of the oral speech motor system. It was not clear from the neuroanatomical data whether this overall slowing-down is due to an involvement of the upper motor fibers of the internal capsule or whether it is a compensation strategy due to an aggravation of pre-existing cerebellar deficits.
The present study sheds light on syllable coordination patterns in ET patients with activated and inactivated stimulation compared to age-matched healthy control speakers. These patterns are interpreted within a dynamic approach in the framework of Articulatory Phonology, allowing us to capture variance in syllable production patterns in neurotypical and pathological speech.
On the basis of previous research on fast syllable repetition tasks in ET patients treated with DBS (Mücke et al., 2014, 2018), the present study investigated the effects of DBS on the speech system of ET patients in natural sentence production. In doing so, we focused on syllable coordination patterns in a natural prosodic context (Staiger, Schölderle, Brendel, Bötzel, & Ziegler, 2016). We hypothesize the following syllable coordination patterns comparing healthy control speakers to ET patients without stimulation and with stimulation:
For this study we recorded natural sentence production of nine ET patients who underwent DBS surgery. A subset of the same recording session for fast syllable repetition tasks is published in Mücke et al. (2018). For our cohort, a tremor reduction of 64.13% was observed after the surgery (detailed information provided in Table 1). This is in line with other reports of DBS in ET patients (Flora et al., 2010; Benabid et al., 1996). In general, patients report being satisfied with the success of the DBS surgery. The criterion for inclusion of the ET patients in this study was the subjective report of dysarthria. Tremor severity was assessed by the Fahn-Tolosa-Marin tremor rating scale (TRS; Fahn, Tolosa, & Marín, 1993). The differentiation between ET and ET plus (Bhatia et al., 2018) was not used as a criterion for inclusion in the present study.
| Sex | Age | Disease duration (years) | Months of DBS | Alcohol response | Family hist. | Cerebellar symptoms¹ | Head tremor² |
In all nine patients, DBS reduced the tremor (see Figure 3a). Furthermore, we assessed a VAS score (Visual Analogue Scale) for the subjects’ ‘ability to speak.’ The patients rated their speech as being affected when the stimulation was turned on (see Figure 3b).
The patients were recorded with stimulation ON and OFF under the articulograph. Further, we recorded nine age- and gender-matched healthy control speakers. The recordings were part of a bigger recording session (results on fast syllable repetition tasks have been published in Mücke et al., 2018).
The articulatory data were recorded with a 3-dimensional Electromagnetic Articulograph (Carstens Medizinelektronik; AG501) at the IfL – Phonetics laboratory at the University of Cologne. We placed sensors on the upper and lower lip, tongue tip, tongue blade, and tongue dorsum. For labial consonantal movements, we analyzed the lower lip sensor, for alveolar consonantal movements, the tongue tip sensor, and for dorsal consonantal and vocalic movements, the tongue dorsum sensor. For head corrections we used three additional sensors on the nose ridge and behind the left and right ear. Further, a bite plate measure was applied for rotation in the occlusal plane. The sensors remained on the articulators for both measurements (DBS-ON and DBS-OFF) to ensure comparability of the data. After the stimulation was changed (randomized order: either from OFF to ON or from ON to OFF), the waiting time between the recording sessions was at least 20 minutes. The articulatory data were recorded at 1250 Hz, downsampled to 250 Hz and smoothed with a 40 Hz low-pass filter and a 3-step floating mean. The acoustic data (time-synchronized) were recorded using a condenser microphone (AKG C420 headset) sampled at 48 kHz, 16 bit. All data were converted to SSFF format using custom software (EMA2SSFF) for annotation within the EMU Speech Database System (Cassidy & Harrington, 2001).
The speech material for the analysis of syllable coordination patterns consisted of target words with word-initial CV and CCV that were embedded in the carrier sentence “Er hat wieder _____ gesagt” (‘He said _____ again’). The target words for CV structure were /lima, pina, kina/ and for CCV structure /plina, klima/ (patients: (3 CV + 2 CCV) * 5 repetitions * 2 stimulation conditions (OFF, ON) * 9 ET patients = 450 tokens; controls: (3 CV + 2 CCV) * 5 repetitions * 9 controls = 225 tokens). We used both words and non-words as target items. All target items have the word accent on the first syllable.
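As a quick check of the token counts, the design can be written out as follows. This is a sketch that reads the design as five target words (three CV plus two CCV) per repetition, which is the reading that reproduces the reported totals; the variable and function names are hypothetical.

```python
cv_words = ["lima", "pina", "kina"]
ccv_words = ["plina", "klima"]
repetitions = 5

def n_tokens(n_speakers, n_conditions):
    """Expected tokens: target words x repetitions x conditions x speakers."""
    n_words = len(cv_words) + len(ccv_words)
    return n_words * repetitions * n_conditions * n_speakers

patients_total = n_tokens(n_speakers=9, n_conditions=2)  # DBS-OFF and DBS-ON
controls_total = n_tokens(n_speakers=9, n_conditions=1)
print(patients_total, controls_total)
```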
The articulatory annotation of the data was done with the EMU Speech Database System (Cassidy & Harrington, 2001). We annotated the following landmarks for the consonantal movements and the vocalic movements in the vertical plane (movement on the y-axis): onset, peak velocity, maximum target (see Figure 4), using zero-crossings in the respective velocity and acceleration traces.
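The landmark definitions can be illustrated with a minimal sketch on a synthetic vertical trajectory. It is not the EMU labelling procedure itself: the velocity is approximated by finite differences, the onset and the maximum target are taken at the velocity zero-crossings surrounding the velocity peak, and the peak itself is located as the velocity maximum (which corresponds to a zero-crossing in acceleration). The signal, sampling rate, and function names are assumptions chosen for illustration.

```python
import math

def velocity(y, dt):
    """Central-difference velocity, with one-sided differences at the edges."""
    n = len(y)
    v = [0.0] * n
    for i in range(1, n - 1):
        v[i] = (y[i + 1] - y[i - 1]) / (2 * dt)
    v[0] = (y[1] - y[0]) / dt
    v[-1] = (y[-1] - y[-2]) / dt
    return v

def landmarks(y, dt):
    """Return (onset, peak velocity, maximum target) times of an upward movement."""
    v = velocity(y, dt)
    pv = max(range(len(v)), key=lambda i: v[i])
    onset = pv
    while onset > 0 and v[onset - 1] > 0:  # walk back to the zero-crossing
        onset -= 1
    target = pv
    while target < len(v) - 1 and v[target + 1] > 0:  # walk forward to the next one
        target += 1
    return onset * dt, pv * dt, target * dt

# Synthetic raising movement sampled at 250 Hz: true onset at 0.101 s,
# true peak velocity at 0.201 s, true maximum target at 0.301 s.
dt, period, t0 = 0.004, 0.4, 0.101
y = [-math.cos(2 * math.pi * (i * dt - t0) / period) for i in range(126)]
onset_t, pv_t, target_t = landmarks(y, dt)
print(onset_t, pv_t, target_t)
```

The recovered landmark times fall within one sample (4 ms) of the true values of the synthetic movement. Real trajectories are noisier, which is one reason why the data are smoothed before annotation.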
This labelling procedure allowed us to compute the following articulatory variables. We first analyzed the CV coordination in simple onsets. Thus, we aimed to shed light on the assumed in-phase coordination in CV syllables. We computed the (1) CV lag, relating the start of the consonantal movement to the start of the vocalic movement (see Figure 5 top row). The in-phase coupling of the C and the V gesture is expected to show a simultaneous activation of both movements. Further, we computed the shifts of the initial consonants relative to the following vocalic anchor for /i/, separated for the labial /plina/ and lingual /klima/: (2) leftward shift (comparing /p/ in /pina/ with /p/ in /plina/; /k/ in /kina/ with /k/ in /klima/), and (3) rightward shift (comparing /l/ in /lima/ with /l/ in /klima/ and in /plina/).
These shifts are calculated to analyze coordination patterns that are assumed for complex onsets. Figure 5 displays how the shifts of the consonants are calculated. For the leftward shift, we compare the latencies of the consonantal target, e.g., /p/ in /pina/, to the anchor, (i.e., the target of the vocalic gesture) with the latencies for the gestural target of /p/ in /plina/. By comparing these latencies in CV and CCV, we computed the variable of the leftward shift (see Figure 5, left, grey box). For the rightward shift, we compare the latencies of the consonantal target, e.g., /l/ in /lima/ to the following vocalic anchor, with the latencies of /l/ in /plina/. Comparing these latencies of the rightmost consonant to the vocalic anchor in CV and CCV allows us to compute the variable of the rightward shift (see Figure 5, right, grey box).
Using R (R Core Team, 2015) and the lme4 package (Bates, Maechler, Bolker, & Walker, 2014), we computed linear mixed-effects regression models fitted to scaled log-transformed dependent outcome variables (1) CV lag, (2) leftward shift, and (3) rightward shift and compared the following DBS conditions: Control versus DBS-OFF and DBS-OFF versus DBS-ON. For (1), we included random intercepts and slopes for speakers by place of articulation (POA) as well as fixed factors for DBS condition and POA. We validated the models by comparing (i) the test model (with interaction DBS*POA) to a reduced model (without interaction) and (ii) the test model (with the critical predictor DBS) to a reduced model (without the critical predictor DBS) via likelihood-ratio tests (p-values are based on these comparisons). For (2) and (3), we included random intercepts and slopes for speakers by syllable complexity (CV versus CCV) with the critical predictors DBS and syllable complexity. Here, we fitted separate models for the labial and the lingual data sets. We validated the models by comparing the test model (with critical predictor) to a reduced model (without critical predictor) via likelihood-ratio tests. Because we tested several measurements against the null hypothesis, we corrected for multiple testing using the Dunn–Šidák correction, lowering the analysis-wide alpha level for these measurements to 0.0167.
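The decision rule behind these model comparisons can be sketched as follows. The sketch does not refit any models; it only assumes that each likelihood-ratio comparison yields a χ² statistic with one degree of freedom, for which the p-value can be computed with the complementary error function. The function names are hypothetical, and the statistics plugged in are two of the values reported in the results.

```python
import math

def lrt_p_value_df1(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))

def sidak_alpha(alpha, n_tests):
    """Per-test alpha under the Dunn-Sidak correction."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)

def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha under the Bonferroni correction, for comparison."""
    return alpha / n_tests

alpha_sidak = sidak_alpha(0.05, 3)       # about 0.01695
alpha_bonf = bonferroni_alpha(0.05, 3)   # about 0.01667

p_interaction = lrt_p_value_df1(6.6649)  # reported as p = 0.009833
p_borderline = lrt_p_value_df1(5.6918)   # reported as p = 0.01704
print(alpha_sidak, alpha_bonf, p_interaction, p_borderline)
```

For three tests, the two corrections give thresholds that differ only in the fourth decimal place, and both example p-values fall on the same side of either threshold: the first is below it, the second narrowly above it.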
The following section reports the articulatory results for the coordination patterns in CV compared to CCV.
We were interested in the coordination of the C and the V gestures. First, we report on the coordination in simple CV syllables. Within the coupling hypothesis of syllable structure, it is assumed that in a CV syllable the C and the V gesture are coupled in-phase with each other and thus are initiated at the same time (cf. Figure 1). The analysis of the initiation of the consonantal and the vocalic movement (CV lag) in the labial and alveolar conditions (/pi/ and /li/) showed that control speakers as well as patients in the DBS-OFF and DBS-ON conditions produced the expected coordination pattern, reflecting an in-phase coupling (labial: Control: μ = 56 ms, σ = 60; DBS-OFF: μ = 32 ms, σ = 31; DBS-ON: μ = 43 ms, σ = 42; alveolar: Control: μ = 32 ms, σ = 66; DBS-OFF: μ = 25 ms, σ = 92; DBS-ON: μ = 11 ms, σ = 17). Applying a mixed model comparing controls to DBS-OFF revealed neither an interaction of POA and DBS (χ²(1) = 0.3918; p = 0.5314), nor an effect of DBS (χ²(1) = 2.7887; p = 0.09493). Comparing patients in DBS-OFF condition with DBS-ON, there was an interaction of POA and DBS (χ²(1) = 6.6649; p = 0.009833), but no effect of DBS (χ²(1) = 0.1692; p = 0.69808). This interaction, showing that in labial context the consonantal movements are activated earlier than the vocalic gesture, is in line with what has been found by inter alia Löfqvist and Gracco (1999), who reported larger CV lags when labial consonants were involved.
The analysis of simple CV syllables (measured in terms of CV lags) revealed that for all groups the consonantal and the vocalic gesture are simultaneously activated. This was the case for the control speakers as well as for the patients in DBS-OFF and DBS-ON. In the framework of the coupled oscillator model this is the expected in-phase coupling between the syllable onset and the nucleus in a CV syllable.
In order to shed light on the coordination in complex onsets, we calculated the latencies of the C movements to the vocalic anchor in CV and CCV. These latencies are calculated for the labial (i.e., /l/-/p/-/plina/) and lingual (i.e., /l/-/k/-/klima/) systems separately, comparing control speakers with patients in DBS-OFF and DBS-ON (see Table 2).
| Leftmost C to V | Syllable | Control | DBS-OFF | DBS-ON |
|---|---|---|---|---|
| /kina/ | CV | 121 (36) | 161 (74) | 145 (60) |
| /klima/ | CCV | 171 (49) | 212 (68) | 220 (58) |
| /pina/ | CV | 138 (34) | 174 (43) | 172 (42) |
| /plina/ | CCV | 203 (42) | 258 (65) | 272 (79) |

| Rightmost C to V | Syllable | Control | DBS-OFF | DBS-ON |
|---|---|---|---|---|
| /lima/ | CV | 105 (34) | 98 (48) | 112 (52) |
| /klima/ | CCV | 104 (41) | 107 (56) | 135 (60) |
| /plina/ | CCV | 111 (39) | 125 (62) | 161 (73) |
Figure 6 displays the corresponding shifts of the consonants from CV to CCV: the shift of the leftmost C (Figure 6, dark grey bars), e.g., /p/ in /pina/ vs. /p/ in /plina/ and the shift of the rightmost C (Figure 6, light grey bars), e.g., /l/ in /lima/ vs. /l/ in /plina/. Thus, values below zero (bars to the left) reflect that the consonants are shifted away from the vowel, whereas bars to the right reflect a shift towards the vowel. It is assumed that a complex onset coordination involves a global coordination pattern (competitive coupling structure), entailing a leftward and a rightward shift.
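The shift computation can be illustrated with the group means from Table 2. This is only a sketch: it takes the shift to be the CV latency minus the CCV latency, so that negative values indicate a shift away from the vowel as described above, and it works on group means rather than on the token-level data underlying Figure 6. The dictionary layout and names are hypothetical.

```python
# Mean latencies (ms) of the consonant target to the vocalic anchor, from Table 2.
latency = {
    "kina":  {"Control": 121, "DBS-OFF": 161, "DBS-ON": 145},
    "pina":  {"Control": 138, "DBS-OFF": 174, "DBS-ON": 172},
    "lima":  {"Control": 105, "DBS-OFF": 98,  "DBS-ON": 112},
    "klima_leftmost":  {"Control": 171, "DBS-OFF": 212, "DBS-ON": 220},
    "plina_leftmost":  {"Control": 203, "DBS-OFF": 258, "DBS-ON": 272},
    "klima_rightmost": {"Control": 104, "DBS-OFF": 107, "DBS-ON": 135},
    "plina_rightmost": {"Control": 111, "DBS-OFF": 125, "DBS-ON": 161},
}

def shift(cv_item, ccv_item):
    """CV latency minus CCV latency per group; negative = away from the vowel."""
    return {g: latency[cv_item][g] - latency[ccv_item][g] for g in latency[cv_item]}

leftmost_p = shift("pina", "plina_leftmost")
leftmost_k = shift("kina", "klima_leftmost")
rightmost_l_plina = shift("lima", "plina_rightmost")
rightmost_l_klima = shift("lima", "klima_rightmost")
print(leftmost_p, leftmost_k, rightmost_l_plina, rightmost_l_klima)
```

On these means, the leftmost consonant moves away from the vowel in every group, whereas the rightmost consonant shows almost no shift in the controls and an increasingly negative value from DBS-OFF to DBS-ON.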
For the control speakers, there was the expected shift of the leftmost consonant to the left, i.e., /p/ in /plina/ was shifted further away from the vowel (compared to /p/ in /pina/). For patients in DBS-OFF condition this shift was even bigger. The mixed model revealed an effect of syllable structure, i.e., from CV to CCV (χ²(1) = 33.304; p = 7.883e-09) and an effect of DBS (χ²(1) = 7.8086; p = 0.0052). Comparing patients in DBS-OFF with DBS-ON, the shift of the leftmost consonant numerically increased. The model revealed an effect of syllable structure (χ²(1) = 17.555; p = 2.791e-05), but no effect of DBS (χ²(1) = 0.185; p = 0.6665). Analyzing the shift of the rightmost consonant, i.e., /l/ in /lima/ versus /l/ in /plina/, showed that there is no shift towards the vowel, neither in the control speakers nor in the patients. The canonical pattern assumed for complex onsets would predict a shift of the rightmost consonant to the right. Although for the control speakers there was no shift visible for the rightmost consonant at all (Control: CV μ = 105 ms, σ = 34; CCV μ = 111 ms, σ = 39), we see a tendency towards a bigger shift to the left for the patients in DBS-OFF, i.e., further away from the vowel in the wrong direction (DBS-OFF: CV μ = 98 ms, σ = 48; CCV μ = 125 ms, σ = 62). Comparing controls to patients in DBS-OFF, the mixed model revealed no effect of syllable structure (χ²(1) = 2.7278; p = 0.09861) and no effect of DBS (χ²(1) = 0.408; p = 0.523). Looking at the patients in DBS-OFF and DBS-ON condition, an additional increase in this shift (in the wrong direction) can be observed. The rightmost consonant was shifted even further to the left. The model revealed no effect of syllable structure (χ²(1) = 5.6918; p = 0.01704), but an effect of DBS (χ²(1) = 24.377; p = 7.922e-07).
The analysis of the shift of the leftmost C, comparing /k/ in /kina/ versus /k/ in /klima/, indicates a similar pattern to the one shown for /plina/. All groups showed the expected leftward shift which is assumed for complex onset coordination. Comparing the control speakers with the patients in DBS-OFF revealed an effect of syllable structure (χ²(1) = 14.115; p = 0.0001719) and an effect of DBS (χ²(1) = 7.8764; p = 0.005008). Comparing the patients in DBS-OFF with DBS-ON condition, there was also an effect of syllable structure, i.e., a significantly larger shift to the left (χ²(1) = 11.792; p = 0.0005948). However, the model did not reveal an effect of DBS (χ²(1) = 0.921; p = 0.3372).
Similar to the results for /plina/, the expected shift of the rightmost consonant in /klima/ towards the vowel was absent in all conditions (control, DBS-OFF, DBS-ON). When comparing the control speakers with the patients in DBS-OFF condition, the model revealed neither an effect of syllable structure (χ²(1) = 0.2276; p = 0.6333) nor an effect of DBS (χ²(1) = 0.4744; p = 0.4909). When comparing the patients in DBS-OFF to DBS-ON condition, there was no effect of syllable structure (χ²(1) = 0.0626; p = 0.8024), but an effect of DBS (χ²(1) = 8.8369; p = 0.002952). Patients with DBS-ON showed a larger shift. However, it is important to mention that this assumed ‘rightward’ shift was in the wrong direction, i.e., to the left.
In contrast to the analysis of simple CV syllables, the analysis of complex CCV syllables (measured by the shift of initial consonantal movements relative to the vowel) revealed differences between control speakers and patients. For the shift of the leftmost consonant in a cluster (e.g., /p/ in /pina/ compared to /p/ in /plina/), all cohorts showed the leftward shift assumed for complex onsets. However, patients in DBS-OFF and DBS-ON differed from control speakers by producing an even larger leftward shift. Interestingly, the results for the shift of the rightmost consonant uncovered differences only within the patients (DBS-OFF versus DBS-ON). When the stimulation is turned on, the rightmost consonant is shifted in the wrong direction (away from the vowel).
This study on the effect of DBS on syllable coordination patterns in ET patients revealed pattern similarities in syllables with low complexity (CV) and pattern differences in syllables with high complexity (CCV) when comparing control speakers to patients with inactivated stimulation (DBS-OFF) and to patients with activated stimulation (DBS-ON). The analysis of simple CV syllables revealed for all groups the expected synchronous activation of the C and the V gesture (in-phase coordination; Löfqvist & Gracco, 1999, p. 1871).
Figure 7 displays the averaged trajectories for the tongue tip and the tongue dorsum in the target word /lima/, separately for one ET patient with DBS-OFF condition (on the right) and one age-matched control speaker (on the left). The trajectories are temporally aligned with the acoustic onset of the syllable /li/ (Figure 7, vertical line). The figure shows that both movements start at the same time, even though movements are longer and variability is higher in the patient’s production. This in-phase coupling is assumed to be the most stable mode in movement coordination (innate), and the respective coordination patterns were not affected when comparing controls with ET patients in DBS-OFF or DBS-ON.
The picture is different when looking at the production of syllables with higher complexity. In complex syllable onsets, a competitive coupling structure is required. Both C gestures are expected to be coupled in-phase with the V gesture and at the same time anti-phase with each other. This pattern is non-innate and has to be learnt. This competitive coupling mode is supposed to lead to a rightward shift of the prevocalic C towards the following V to make room for the added C. This means that the prototypical competitive coupling structure entails that the overlap between the prevocalic C and V should increase. However, this was not the case in our age-matched healthy control speakers. To some extent, it appears that the rightward shift was blocked, meaning that there is no change in overlap between C and V. This type of variation can be attributed to prosodic and segmental factors, as reported in Shaw et al. (2011), Pastätter and Pouplier (2015), Hermes et al. (2017), and Gafos et al. (2019) for various languages such as German, Tashlhiyt Berber, Polish, and Moroccan Arabic. In our case, the missing rightward shift on the phonetic surface representation is likely due to effects of segmental context. This is in line with Pouplier (2012) and Brunner et al. (2014), who also found no rightward shift for /pl/ clusters in German.
However, when comparing patients in DBS-OFF to DBS-ON, this expected rightward shift was not simply blocked, as was the case for the control speakers. Instead, there was an unexpected shift of the rightmost C to the left (in the opposite direction), i.e., away from the vowel, when the stimulation was turned on (DBS-OFF = 18 ms versus DBS-ON = 51 ms). This means that the overlap between the prevocalic C and the following V decreases even though a C is added to the syllable. Figure 8 exemplifies the respective coordination patterns in complex syllables for the target word /plina/ for the same patient and the same age-matched control speaker as presented in Figure 7. The figure displays the averaged trajectories for the lower lip, tongue tip, and tongue dorsum movements. We hypothesize that both initial consonantal gestures are initiated sequentially in the production of the control speaker (example displayed in Figure 8; left). This sequential activation is important for perceptual recoverability of both consonants. In contrast, we assume that in the patient’s production the consonantal and the vocalic movements are initiated at the same time (example displayed in Figure 8; right). To compensate for this simultaneous activation, /l/ is lengthened. This leads to a higher amount of variation during the production of /l/, which was also reflected in the velocity profiles of the patient’s tongue tip movement, including changes in velocity over the acceleration and deceleration phases while targeting /l/.
Figure 9 provides a scheme of the coordination patterns in CV and CCV syllables for control speakers compared to ET patients. In simple syllables, control speakers and ET patients did not differ, and both groups reflected the prototypical coordination pattern in CV syllables, where the C and V gestures are simultaneously activated, with V being longer and less stiff than C (see Figure 9, top left and top right). However, differences arose in complex syllables. For the leftward shift, the pattern for control speakers differed from that of patients in DBS-OFF and DBS-ON, who showed a larger shift to the left. For the rightward shift, patients with activated stimulation (DBS-ON) showed a different pattern from patients with inactivated stimulation (DBS-OFF). While a prototypical competitive timing pattern should result in a sequential activation of the CCV sequence on the surface representation (schematized in Figure 9, bottom left), in the patients’ production both consonants were activated at the same time and thus, for perceptual recoverability, the prevocalic consonant /l/ was considerably lengthened as a compensatory strategy (schematized in Figure 9, bottom right). This pattern could reflect the patients’ difficulties in adapting to the conflicting demands of the underlying CCV coupling structure. The patients, already with inactivated stimulation, initiated the C gestures and the V gesture all at the same time. This coordination pattern got even worse under activated stimulation (shift of the prevocalic C in the wrong direction).
These deviant timing patterns for ET patients can be interpreted as inefficient coordination patterns in the phonetic realization of the competing coupling relations for complex onsets. The phonetic outcome of the phonological syllable parse deviates from what the phonology is supposed to trigger. More specifically, the data revealed a shift of the prevocalic C in the wrong direction (away from the vowel), combined with its compensatory lengthening to ensure perceptual recoverability of both consonants and to keep the syllable as a unit. This inefficiency could also lead to a higher amount of variation. Within a dynamical system, these critical changes of speech motor functions could also be interpreted as what Mackey and Milton (1987) as well as Glass (2015) referred to as dynamical disease, implying a “change in the qualitative dynamics of some observable nature as one or more parameters are changed” (Mackey & Milton, 1987, p. 16).
A possible explanation of these deviant patterns in the patients could be that the competitive coupling structure, a pattern that has to be learnt, breaks down on the phonetic surface. However, this does not imply that this structure is lost as a phonological pattern in terms of a categorical change. Rather, we are dealing with a gradient change, which could also be accounted for in a dynamical system with differences in coupling strengths of in-phase and anti-phase coupling, as presented in Tilsen (2016). We would assume that the coupling strength for anti-phase is weaker than for in-phase, which could lead to an imbalanced, asymmetric pattern.
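The idea that relative coupling strength shapes the surface timing gradually can be illustrated with a toy model. The sketch below is not the model of Tilsen (2016) and was not fitted to our data. It assumes three phase variables (C1, C2, V) that relax on a simple potential in which each C is coupled in-phase to V with strength `in_phase` and the two Cs are coupled anti-phase to each other with strength `anti_phase`. All names, parameter values and starting phases are hypothetical.

```python
import math

def settle_onset_phases(in_phase=1.0, anti_phase=1.0, dt=0.01, steps=20000):
    """Relax three coupled phases and return (C1 - V, C2 - V) in degrees.

    Potential (toy assumption):
        E = -in_phase * [cos(c1 - v) + cos(c2 - v)] + anti_phase * cos(c1 - c2)
    The phases follow the negative gradient of E (forward Euler integration).
    """
    c1, c2, v = -0.3, 0.2, 0.0  # arbitrary, slightly asymmetric start (radians)
    for _ in range(steps):
        d_c1 = -in_phase * math.sin(c1 - v) + anti_phase * math.sin(c1 - c2)
        d_c2 = -in_phase * math.sin(c2 - v) + anti_phase * math.sin(c2 - c1)
        d_v = in_phase * (math.sin(c1 - v) + math.sin(c2 - v))
        c1, c2, v = c1 + dt * d_c1, c2 + dt * d_c2, v + dt * d_v
    return math.degrees(c1 - v), math.degrees(c2 - v)

# Balanced competition: the two consonants settle on either side of the vowel.
print(settle_onset_phases(in_phase=1.0, anti_phase=1.0))

# Weaker anti-phase coupling: the separation between the consonants shrinks.
print(settle_onset_phases(in_phase=1.0, anti_phase=0.7))

# Anti-phase coupling below half the in-phase strength: all three phases coincide.
print(settle_onset_phases(in_phase=1.0, anti_phase=0.4))
```

In this toy system the stable separation of each consonant from the vowel, θ, satisfies cos θ = in_phase / (2 · anti_phase), so it is 60° for equal strengths and shrinks continuously as the anti-phase coupling weakens. Once the anti-phase strength falls to half the in-phase strength or below, the consonants and the vowel are activated synchronously. This is a gradient rather than categorical change, in the spirit of the account suggested above; which consonant ends up on which side of the vowel depends only on the arbitrary starting phases.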
This is, to our knowledge, the first study measuring articulatory movements in natural sentence production with electromagnetic articulography in ET patients with activated and inactivated stimulation. This allowed us to compare the production of a patient cohort in two different conditions (DBS-OFF and DBS-ON) and showed that there is a serious change in the behaviour of the speech motor system (with and without voltage in the target area). However, from a neurological perspective, we cannot clarify whether the timing problems in complex syllable structures are due to disease-related cerebellar dysfunction (atactic dysarthria), an affection of the motor neuron (spastic dysarthria), or both. To clarify this, we would need additional brain data quantifying the current spread of the activated electrodes in the respective target area. This is an important issue for future studies. Another problem is the heterogeneity of our patient group, since the very recent differentiation of ET and ET plus was not used as an inclusion criterion for the study (Bhatia et al., 2018).
The analysis of syllable coordination patterns in ET patients with DBS revealed coordination problems (compared to healthy control speakers). The analysis of these coordination patterns uncovered inefficient timing patterns for ET patients in the realization of complex syllables. We assume that patients have difficulties with competitive coupling structures of complex phonological syllable parses, leading to deviant, inefficient patterns on the phonetic exponents. However, these inefficient timing differences of the patients’ articulation are not categorical but gradient in nature, pointing to the fact that there are dynamic mechanisms of regulation behind quantitative consequences of qualitative syllable parses (Gafos et al., 2014, 2019; Hermes et al., 2017; Mücke et al., 2017). When changing the state of the system, e.g., when activating the stimulation, patients have to re-target the speech motor system to a certain extent to adapt to the side-effects of DBS on speech motor control. This behavior increases the costs of the physiological control system in the patients’ speech.
From a clinical point of view, we conclude that ET patients indeed have problems adapting to the conflicting demands of complex coordination patterns and therefore produce inefficient timing patterns. The activation of the stimulation affects the dynamics of the speech motor system. We assume that these timing problems in prosodic constituents with high complexity are likely due to cerebellar deficits and thus reflect atactic dysarthria. These coordination problems get worse under stimulation, and we therefore suppose that this could be an aggravation of the pre-existing cerebellar deficits (cf. Mücke et al., 2018 for fast syllable repetition tasks). However, it could also be the case that the stimulation induces, due to an overall slowing down of the system, additional spastic signs of dysarthria that deteriorate the temporal properties of syllables with high complexity, which would mean that there is an additional affection of upper motor fibers of the internal capsule.
From a linguistic perspective, we can conclude that there is a certain amount of variability allowed on the phonetic exponents of phonological syllable parses, but beyond that amount the execution of the syllable as a prosodic constituent is affected.
This work was supported by the German Research Foundation (DFG) as part of the SFB 1252 “Prominence in Language” in the project A04 “Dynamic Modelling of Prosodic Prominence” at the University of Cologne. We want to thank Till A. Dembek (Neurology Department, University Hospital Cologne) for his graphical support.
The authors have no competing interests to declare.
Baese-Berk, M., & Goldrick, M. 2009. Mechanisms of interaction in speech production. Language and cognitive processes, 24(4), 527–554. DOI: https://doi.org/10.1080/01690960802299378
Barbe, M. T., Dembek, T. A., Becker, J., Raethjen, J., Hartinger, M., Meister, I. G., Runge, M., Maarouf, M., Fink, G. R., & Timmermann, L. 2014. Individualized current-shaping reduces DBS-induced dysarthria in patients with essential tremor. Neurology, 82(7), 614–619. DOI: https://doi.org/10.1212/WNL.0000000000000127
Barbe, M. T., Reker, P., Hamacher, S., Franklin, J., Kraus, D., Dembek, T. A., Becker, J., Steffen, J. K., Allert, N., Wirths, J., & Dafsari, H. S. 2018. DBS of the PSA and the VIM in essential tremor: A randomized, double-blind, crossover trial. Neurology, 10–1212. DOI: https://doi.org/10.1212/WNL.0000000000005956
Benabid, A. L., Pollak, P., Gao, D., Hoffmann, D., Limousin, P., Gay, E., Payen, I., & Benazzouz, A. 1996. Chronic electrical stimulation of the ventralis intermedius nucleus of the thalamus as a treatment of movement disorders. Journal of Neurosurgery, 84(2), 203–214. DOI: https://doi.org/10.3171/jns.1996.84.2.0203
Bhatia, K. P., Bain, P., Bajaj, N., Elble, R. J., Hallett, M., Louis, E. D., Raethjen, J., Stamelou, M., Testa, C. M., & Deuschl, G. 2018. Consensus Statement on the classification of tremors, from the task force on tremor of the International Parkinson and Movement Disorder Society. Movement Disorders, 33(1), 75–87. DOI: https://doi.org/10.1002/mds.27121
Browman, C. P., & Goldstein, L. 1988. Some notes on syllable structure in articulatory phonology. Phonetica, 45(2–4), 140–155. DOI: https://doi.org/10.1159/000261823
Browman, C. P., & Goldstein, L. 1989. Articulatory gestures as phonological units. Phonology, 6(2), 201–251. DOI: https://doi.org/10.1017/S0952675700001019
Browman, C. P., & Goldstein, L. 1992. Articulatory phonology: An overview. Phonetica, 49(3–4), 155–180. DOI: https://doi.org/10.1159/000261913
Brunner, J., Geng, C., Sotiropoulou, S., & Gafos, A. 2014. Timing of German onset and word boundary clusters. Journal of Laboratory Phonology, 5(4). DOI: https://doi.org/10.1515/lp-2014-0014
Cassidy, S., & Harrington, J. 2001. Multi-level annotation in the Emu speech database management system. Speech Communication, 33(1–2), 61–77. DOI: https://doi.org/10.1016/S0167-6393(00)00069-8
Cho, T. 2005. Prosodic strengthening and featural enhancement: Evidence from acoustic and articulatory realizations of /ɑ, i/ in English. The Journal of the Acoustical Society of America, 117(6), 3867–3878. DOI: https://doi.org/10.1121/1.1861893
De Jong, K. J. 1995. The supraglottal articulation of prominence in English: Linguistic stress as localized hyperarticulation. The Journal of the Acoustical Society of America, 97(1), 491–504. DOI: https://doi.org/10.1121/1.412275
Deuschl, G., Wenzelburger, R., Löffler, K., Raethjen, J., & Stolze, H. 2000. Essential tremor and cerebellar dysfunction: Clinical and kinematic analysis of intention tremor. Brain, 123(8), 1568–1580. DOI: https://doi.org/10.1093/brain/123.8.1568
Edwards, J., Beckman, M. E., & Fletcher, J. 1991. The articulatory kinematics of final lengthening. The Journal of the Acoustical Society of America, 89(1), 369–382. DOI: https://doi.org/10.1121/1.400674
Farnetani, E., & Recasens, D. 2010. Coarticulation and connected speech processes. The Handbook of Phonetic Sciences, Second Edition, 316–352. DOI: https://doi.org/10.1002/9781444317251.ch9
Flora, E. D., Perera, C. L., Cameron, A. L., & Maddern, G. J. 2010. Deep brain stimulation for essential tremor: A systematic review. Movement Disorders, 25(11), 1550–1559. DOI: https://doi.org/10.1002/mds.23195
Gafos, A. I., & Benus, S. 2006. Dynamics of phonological cognition. Cognitive Science, 30(5), 905–943. DOI: https://doi.org/10.1207/s15516709cog0000_80
Gafos, A. I., Charlow, S., Shaw, J. A., & Hoole, P. 2014. Stochastic time analysis of syllable-referential intervals and simplex onsets. Journal of Phonetics, 44, 152–166. DOI: https://doi.org/10.1016/j.wocn.2013.11.007
Gafos, A. I., Roeser, J., Sotiropoulou, S., Hoole, P., & Zeroual, C. 2019. Structure in mind, structure in vocal tract. Natural Language and Linguistic Theory. DOI: https://doi.org/10.1007/s11049-019-09445-y
Glass, L. 2015. Dynamical disease: Challenges for nonlinear dynamics and medicine. Chaos: An Interdisciplinary Journal of Nonlinear Science, 25(9), 097603. DOI: https://doi.org/10.1063/1.4915529
Goldstein, L., Chitoran, I., & Selkirk, E. 2007. Syllable structure as coupled oscillator modes: Evidence from Georgian vs. Tashlhiyt Berber. Proceedings of the 16th International Congress of Phonetic Sciences, 241–244.
Harrington, J., Fletcher, J., & Beckman, M. 2000. Manner and place conflicts in the articulation of accent in Australian English. In: Papers in Laboratory Phonology V: Acquisition and the Lexicon, 40–51. London: Cambridge University Press.
Harrington, J., Fletcher, J., & Roberts, C. 1995. Coarticulation and the accented/unaccented distinction: Evidence from jaw movement data. Journal of Phonetics, 23(3), 305–322. DOI: https://doi.org/10.1016/S0095-4470(95)80163-4
Haubenberger, D., & Hallett, M. 2018. Essential Tremor. New England Journal of Medicine, 378(19), 1802–1810. DOI: https://doi.org/10.1056/NEJMcp1707928
Hawkins, S. 1992. An introduction to task dynamics. Papers in laboratory phonology II: Gesture, segment, prosody, 9–25. DOI: https://doi.org/10.1017/CBO9780511519918.002
Hermes, A., Mücke, D., & Auris, B. 2017. The variability of syllable patterns in Tashlhiyt Berber and Polish. Journal of Phonetics, 64, 127–144. DOI: https://doi.org/10.1016/j.wocn.2017.05.004
Hermes, A., Mücke, D., & Grice, M. 2013. Gestural coordination of Italian word-initial clusters: The case of “impure s.” Phonology, 30(1), 1–25. DOI: https://doi.org/10.1017/S095267571300002X
Iskarous, K., & Kavitskaya, D. 2010. The interaction between contrast, prosody, and coarticulation in structuring phonetic variability. Journal of Phonetics, 38(4), 625–639. DOI: https://doi.org/10.1016/j.wocn.2010.09.004
Liberman, A. M., & Mattingly, I. G. 1985. The motor theory of speech perception revised. Cognition, 21(1), 1–36. DOI: https://doi.org/10.1016/0010-0277(85)90021-6
Lindblom, B. 1990. Explaining Phonetic Variation: A Sketch of the H&H Theory. In: Speech Production and Speech Modelling, 403–439. Netherlands: Springer. DOI: https://doi.org/10.1007/978-94-009-2037-8_16
Löfqvist, A., & Gracco, V. L. 1999. Interarticulator programming in VCV sequences: Lip and tongue movements. The Journal of the Acoustical Society of America, 105(3), 1864–1876. DOI: https://doi.org/10.1121/1.426723
Mackey, M. C., & Milton, J. G. 1987. Dynamical diseases. Annals of the New York Academy of Sciences, 504(1), 16–32. DOI: https://doi.org/10.1111/j.1749-6632.1987.tb48723.x
Marin, S., & Pouplier, M. 2010. Temporal organization of complex onsets and codas in American English: Testing the predictions of a gestural coupling model. Motor Control, 14(3), 380–407. DOI: https://doi.org/10.1123/mcj.14.3.380
Mücke, D., Becker, J., Barbe, M. T., Meister, I., Liebhart, L., Roettger, T. B., Dembek, T., Timmermann, L., & Grice, M. 2014. The Effect of Deep Brain Stimulation on the Speech Motor System. Journal of Speech, Language, and Hearing Research, 57(4), 1206–1218. DOI: https://doi.org/10.1044/2014_JSLHR-S-13-0155
Mücke, D., Hermes, A., & Cho, T. 2017. Mechanisms of regulation in speech: Linguistic structure and physical control system. Journal of Phonetics, 64, 1–7. DOI: https://doi.org/10.1016/j.wocn.2017.05.005
Mücke, D., Hermes, A., Roettger, T. B., Becker, J., Niemann, H., Dembek, T. A., Timmermann, L., Visser-Vandewalle, V., Fink, G. R., Grice, M., & Barbe, M. T. 2018. The effects of Thalamic Deep Brain Stimulation on speech dynamics in patients with Essential Tremor: An articulographic study. PLoS ONE, 13(1), e0191359. DOI: https://doi.org/10.1371/journal.pone.0191359
Nelson, N. R., & Wedel, A. 2017. The phonetic specificity of competition: Contrastive hyperarticulation of voice onset time in conversational English. Journal of Phonetics, 64, 51–70. DOI: https://doi.org/10.1016/j.wocn.2017.01.008
Pastätter, M., & Pouplier, M. 2015. Onset-vowel timing as a function of coarticulation resistance: Evidence from articulatory data. Proceedings of the 18th International Congress of Phonetic Sciences. Glasgow, UK.
Patri, J. F., Diard, J., & Perrier, P. 2015. Optimal speech motor control and token-to-token variability: A Bayesian modeling approach. Biological Cybernetics, 109(6), 611–626. DOI: https://doi.org/10.1007/s00422-015-0664-4
Pützer, M., Barry, W. J., & Moringlane, J. R. 2007. Effect of Deep Brain Stimulation on Different Speech Subsystems in Patients with Multiple Sclerosis. Journal of Voice, 21(6), 741–753. DOI: https://doi.org/10.1016/j.jvoice.2006.05.007
Recasens, D., Pallarès, M. D., & Fontdevila, J. 1997. A model of lingual coarticulation based on articulatory constraints. The Journal of the Acoustical Society of America, 102(1), 544–561. DOI: https://doi.org/10.1121/1.419727
Saltzman, E. 1986. Task dynamic coordination of the speech articulators: A preliminary model. US Department of Commerce, National Technical Information Service. DOI: https://doi.org/10.1007/978-3-642-71476-4_10
Saltzman, E., & Kelso, J. A. 1987. Skilled actions: A task-dynamic approach. Psychological Review, 94(1), 84. DOI: https://doi.org/10.1037/0033-295X.94.1.84
Saltzman, E. L., & Munhall, K. G. 1989. A dynamical approach to gestural patterning in speech production. Ecological Psychology, 1(4), 333–382. DOI: https://doi.org/10.1207/s15326969eco0104_2
Scarborough, R. 2013. Neighborhood-conditioned patterns in phonetic detail: Relating coarticulation and hyperarticulation. Journal of Phonetics, 41(6), 491–508. DOI: https://doi.org/10.1016/j.wocn.2013.09.004
Shaw, J. A., & Gafos, A. I. 2015. Stochastic time models of syllable structure. PloS one, 10(5), e0124714. DOI: https://doi.org/10.1371/journal.pone.0124714
Shaw, J. A., Gafos, A. I., Hoole, P., & Zeroual, C. 2011. Dynamic invariance in the phonetic expression of syllable structure: A case study of Moroccan Arabic consonant clusters. Phonology, 28(3), 455–490. DOI: https://doi.org/10.1017/S0952675711000224
Staiger, A., Schölderle, T., Brendel, B., Bötzel, K., & Ziegler, W. 2016. Oral Motor Abilities Are Task Dependent: A Factor Analytic Approach to Performance Rate. Journal of Motor Behavior, 49(5), 482–493. DOI: https://doi.org/10.1080/00222895.2016.1241747
Tilsen, S. 2016. Selection and coordination: The articulatory basis for the emergence of phonological structure. Journal of Phonetics, 55, 53–77. DOI: https://doi.org/10.1016/j.wocn.2015.11.005
Ziegler, W., & Vogel, M. 2010. Dysarthrie: Verstehen, untersuchen, behandeln. Georg Thieme Verlag. DOI: https://doi.org/10.1055/b-002-25584