<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<!--<?xml-stylesheet type="text/xsl" href="article.xsl"?>-->
<article article-type="research-article" dtd-version="1.2" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id journal-id-type="issn">1868-6354</journal-id>
<journal-title-group>
<journal-title>Laboratory Phonology: Journal of the Association for Laboratory Phonology</journal-title>
</journal-title-group>
<issn pub-type="epub">1868-6354</issn>
<publisher>
<publisher-name>Open Library of Humanities</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.16995/labphon.24395</article-id>
<article-categories>
<subj-group>
<subject>Journal article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Timing lag matters in native speakers&#8217; perception of Georgian stop sequences</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-0773-2680</contrib-id>
<name>
<surname>Chitoran</surname>
<given-names>Ioana</given-names>
</name>
<email>ioana.chitoran@u-paris.fr</email>
<xref ref-type="aff" rid="aff-1">1</xref>
<xref ref-type="aff" rid="aff-2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kwon</surname>
<given-names>Harim</given-names>
</name>
<xref ref-type="aff" rid="aff-3">3</xref>
</contrib>
</contrib-group>
<aff id="aff-1"><label>1</label>Universit&#233; Paris Cit&#233;, CNRS, Laboratoire de linguistique formelle, France</aff>
<aff id="aff-2"><label>2</label>Institut Universitaire de France, France</aff>
<aff id="aff-3"><label>3</label>Seoul National University, South Korea</aff>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2026-03-27">
<day>27</day>
<month>03</month>
<year>2026</year>
</pub-date>
<pub-date pub-type="collection">
<year>2026</year>
</pub-date>
<volume>17</volume>
<issue>1</issue>
<fpage>1</fpage>
<lpage>36</lpage>
<permissions>
<copyright-statement>Copyright: &#x00A9; 2026 The Author(s)</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.journal-labphon.org/articles/10.16995/labphon.24395/"/>
<abstract>
<p>This study tests the hypothesis that timing lag is part of phonological knowledge, investigating Georgian speakers&#8217; perceptual recoverability of C1 consonantal gestures in stop-stop sequences. We hypothesize that the recovery of C1 is facilitated by (i) longer timing lag (i.e., reduced gestural overlap) between C1 and C2, and (ii) the presence of a C1 vocalic release. Two perception experiments (a forced choice identification and a follow-up transcription task) were conducted using stimuli that have been acoustically and articulatorily analyzed. We consider three acoustic parameters (acoustic lag, presence of C1 vocalic releases, vocalic release duration) and three articulatory parameters derived from EMA data (release lag, onset lag, relative overlap). The results show that longer timing lag does facilitate the accurate recovery of C1, but the presence or longer duration of a C1 vocalic release does not. The results support the inclusion of timing lag in the phonological grammar. We discuss why vocalic releases provide less reliable perceptual information that is potentially ambiguous between segmental and prosodic (syllabic) levels of structure.</p>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>1. Introduction</title>
<p>The accurate perception of phonetic information that encodes phonological contrasts can be inhibited or enhanced in certain contexts. Misparsings of the speech signal can result from such misperceptions and may ultimately lead to sound change (<xref ref-type="bibr" rid="B48">Ohala, 1981</xref>; <xref ref-type="bibr" rid="B29">Harrington et al., 2008</xref>; <xref ref-type="bibr" rid="B30">Harrington et al., 2019</xref>; <xref ref-type="bibr" rid="B2">Beddor, 2009</xref>; <xref ref-type="bibr" rid="B35">Iskarous &amp; Kavitskaya, 2018</xref>, among others). It has also been proposed that knowledge of such perceptual patterns is directly encoded in phonological representations (see <xref ref-type="bibr" rid="B32">Hayes et al., 2004</xref>; <xref ref-type="bibr" rid="B66">Steriade, 2008</xref>). Testing the status of perceptual knowledge in synchronic phonology requires evidence that particular structures are either preferred or avoided, depending on the recoverability of essential parts of the signal. While several previous studies have examined such preferences in production, comparatively fewer have addressed this question directly from perception: Do the preferred structures truly present an advantage for the recoverability of phonological information? The current study contributes to further understanding of this issue empirically and considers its implications for a model of phonological representations.</p>
<p>One path toward testing the role of perceptual information in phonology is shaped by the concept of perceptual recoverability, as developed in articulatory phonology (<xref ref-type="bibr" rid="B6">Browman &amp; Goldstein, 1992</xref>; <xref ref-type="bibr" rid="B25">Goldstein &amp; Fowler, 2003</xref>), and interpreted with respect to the relative timing of gestures. Perceptual recoverability is understood to be one of the factors that characterizes intergestural coordination (cf. <xref ref-type="bibr" rid="B47">Mattingly, 1981</xref>), in the sense that speakers acquire knowledge of the temporal patterns and their acoustic consequences needed to correctly recover a gesture. Mattingly (<xref ref-type="bibr" rid="B47">1981</xref>) has proposed, for example, that syllabic organization may reflect such knowledge, because it allows considerable overlap of adjacent gestures, thus maximizing parallel transmission of information, while still allowing the individual gestures to be correctly recovered. When the degree of overlap exceeds the recoverability limits, a speech gesture may be hidden, as shown in the classic example of the phrase <italic>perfect memory</italic> (<xref ref-type="bibr" rid="B5">Browman &amp; Goldstein, 1990</xref>). In a fluent production of this phrase, the final /t/ in <italic>perfect</italic> is articulatorily present, but can be acoustically masked by the lip closure of /m/ in <italic>memory</italic> due to the maximum overlap of the consonantal gestures. This shows that even fully articulated gestures may become perceptually inaccessible, highlighting the importance of timing for perceptual recoverability. In articulatory phonology (henceforth AP), variability in inter-gestural timing follows from the dynamic nature of a gesture. A gesture is a dynamical system, and its dynamic regime is affected by a number of factors (see, for example, <xref ref-type="bibr" rid="B64">Sorensen &amp; Gafos, 2016</xref>; <xref ref-type="bibr" rid="B34">Iskarous, 2017</xref>; <xref ref-type="bibr" rid="B18">Du &amp; Gafos, 2022</xref>). In the specific case of the perceptibility of stop-stop sequences, relevant factors that have been examined include position in the word or utterance, and the order of the gestures&#8217; respective constriction locations (see references reviewed in Section 2). Both of these factors can be responsible for cases of maximum overlap where one of the gestures can be obscured, and both have been tested in experimental studies that adopt an AP position.</p>
<p>In our study, we further probe to what extent, and under which conditions, timing affects perceptual recoverability of gestures in consonantal sequences. Specifically, we test a hypothesis on the recoverability of stop-stop sequences: Gestures in a C1C2 stop sequence will be harder to recover if the sequence is produced with increased temporal overlap. C1, the first stop in the sequence, is expected to be the most vulnerable perceptually if its release is overlapped with a complete oral closure. Thus, when a stop-stop sequence is produced with short lag (i.e., increased temporal overlap), the C2 closure can mask the acoustic release of C1. Cross-linguistic evidence in production (reviewed in Section 2) has shown that speakers reduce temporal overlap in precisely those contexts <italic>where C1 is most at risk of being masked</italic>. These findings have led, logically, to the hypothesis that speakers may control the degree of overlap between consonantal gestures on the basis of knowledge about the perceptibility of the individual gestures. We also consider the acoustic consequences of overlap patterns, in light of the discussions on cue robustness and cue precision in Henke et al. (<xref ref-type="bibr" rid="B33">2012</xref>), Wright (<xref ref-type="bibr" rid="B72">1996</xref>, <xref ref-type="bibr" rid="B73">2001</xref>), and Benki (<xref ref-type="bibr" rid="B3">2003</xref>). Temporal overlap can affect cue precision in particular, defined by Henke et al. as the degree to which information in the acoustic signal narrows down, perceptually, the number of segmental choices available to the listener.</p>
<p>In light of the evidence from production that speakers adjust timing in ways that may enhance recoverability, we aim to verify the basic premise of the C1C2 recoverability hypothesis: Does reduced overlap (longer lag) between consonant gestures help listeners to accurately recover C1 in a C1C2 stop sequence? To investigate this, we focus on the stop-stop sequences in Georgian, a language which has received a great deal of attention for its rich phonotactic combinations that present almost no gaps. The production patterns of Georgian stop-stop sequences have been well studied (<xref ref-type="bibr" rid="B77">Zhgenti, 1956</xref>; <xref ref-type="bibr" rid="B12">Chitoran et al., 2002</xref>; <xref ref-type="bibr" rid="B14">Crouch, 2022</xref>; <xref ref-type="bibr" rid="B16">Crouch et al., 2023a</xref>; <xref ref-type="bibr" rid="B52">Pouplier et al., 2022</xref>). In Section 2, we review production patterns established for Georgian and for other languages that have served as arguments for or against a perceptual motivation. In Section 3, we present an overview of the relevant perception studies that can inform our question. Section 4 outlines our perception experiments, followed by details of the materials and methods in Section 5 and the presentation of the results in Section 6. We end with a discussion of the results and conclusions in Section 7.</p>
</sec>
<sec>
<title>2. Perceptual motivations for observed production patterns</title>
<p>In many languages, stop-stop clusters have been found to exhibit variable overlap depending on two factors: (a) position in the word or utterance: initial C1C2 is generally less overlapped than medial; (b) order of place of articulation in the stop sequence: C1C2 with a back-to-front place order is generally less overlapped than front-to-back.</p>
<p>Many studies, reviewed below, have interpreted these overlap patterns in terms of perceptual recoverability, not by testing perception directly but on the basis of logical reasoning. In utterance-initial position, for example, the only acoustic information available for C1 is its release burst. In the case of increased temporal overlap, the absence of both the C1 acoustic release and of V-C1 or C1-V transitions makes it difficult to recover C1. Moreover, the recovery of C1 may be more critical word-initially, because word-initial onsets are more important in word recognition than word-medial ones (<xref ref-type="bibr" rid="B46">Marslen-Wilson &amp; Zwitserlood, 1989</xref>). Word-medially, by contrast, C1 may still be recoverable from transitions out of a preceding vowel, even at a greater degree of overlap.</p>
<p>A similar argument holds for a back-to-front order of C1C2 stop sequences (e.g., <italic>gd, tp</italic>), where an anterior C2 closure will mask a posterior C1 release unless there is an ample temporal distance between the two. If the C2 constriction is anterior in the vocal tract relative to C1, the C1 release may be completely hidden acoustically, unless the C2 constriction occurs after C1 has already been released. But if the C2 constriction is posterior to C1 (e.g., front-to-back sequences such as <italic>dg, pt</italic>), some acoustic information will still be present at C1 release, even with substantial overlap.</p>
<p>Evidence for longer lag (reduced overlap) produced word-initially and in back-to-front stop sequences is quite consistent across languages. In Georgian stop-stop sequences, Chitoran et al. (<xref ref-type="bibr" rid="B12">2002</xref>) and Crouch (<xref ref-type="bibr" rid="B14">2022</xref>) found that consonant timing varies systematically with position in the word and with the place order of the stops in native speakers&#8217; productions. Word-initial stop-stop sequences have significantly longer lag than word-internal ones, and sequences with a back-to-front (B-F) order of constriction location (e.g., <italic>gd, tp</italic>) have longer lag than sequences with a front-to-back order (F-B) (e.g., <italic>dg, pt</italic>). Chitoran et al. (<xref ref-type="bibr" rid="B12">2002</xref>) attributed these patterns to considerations of perceptual recoverability, that is, when a stop-stop sequence (C1C2) is produced with short lag, the acoustic release of C1 can be masked by C2 closure.</p>
<p>Beyond Georgian, a longer lag pattern word-initially has been reported for several other languages: English (<xref ref-type="bibr" rid="B27">Hardcastle, 1985</xref>, for stop-liquid sequences; <xref ref-type="bibr" rid="B8">Byrd, 1996</xref>, for /s/-stop), Tsou (<xref ref-type="bibr" rid="B72">Wright, 1996</xref>), Russian (<xref ref-type="bibr" rid="B39">Kochetov &amp; Goldstein, 2005</xref>), Moroccan Arabic (<xref ref-type="bibr" rid="B22">Gafos et al., 2010</xref>), Hebrew (<xref ref-type="bibr" rid="B74">Yanagawa, 2006</xref>). Evidence for a place order effect has been found in English (<xref ref-type="bibr" rid="B28">Hardcastle &amp; Roach, 1979</xref>; <xref ref-type="bibr" rid="B7">Byrd, 1992</xref>, <xref ref-type="bibr" rid="B8">1996</xref>; <xref ref-type="bibr" rid="B78">Zsiga, 1994</xref>; <xref ref-type="bibr" rid="B67">Surprenant &amp; Goldstein, 1998</xref>), Tsou (<xref ref-type="bibr" rid="B72">Wright, 1996</xref>), Taiwanese (<xref ref-type="bibr" rid="B49">Peng, 1996</xref>), Russian (<xref ref-type="bibr" rid="B79">Zsiga, 2000</xref>), Korean (<xref ref-type="bibr" rid="B40">Kochetov et al., 2007</xref>), French (<xref ref-type="bibr" rid="B38">K&#252;hnert et al., 2006</xref>).</p>
<p>As promising as these cross-linguistic production patterns may be for a recoverability hypothesis, it is important to consider several caveats. First, across many of these studies, lag or overlap is quantified by a variety of measures, acoustic and/or articulatory. It is also not entirely clear how well the acoustic measures employed (e.g., the acoustic duration of the inter-burst interval) correspond to articulatory measures. Pouplier et al. (<xref ref-type="bibr" rid="B51">2017</xref>) further warn that hypotheses about the perceptibility of consonant sequences cannot be tested by measuring articulatory lag alone, since doing so overlooks important differences between acoustic and articulatory coarticulation effects. For stops in particular, the relation between the articulatory and the acoustic release depends on the constriction location in the vocal tract. Articulatory releases, as measured from kinematic data, may precede acoustic releases to varying degrees. Further, differences in intraoral pressure dynamics affect the resulting acoustics depending on the constriction locations and voicing of the stops involved, and can thus affect perception independently of lag. Pouplier et al. tested the cue robustness hypothesis of Henke et al. (<xref ref-type="bibr" rid="B33">2012</xref>) and Wright (<xref ref-type="bibr" rid="B72">1996</xref>), which holds that typologically rare clusters are perceptually suboptimal. Such rare clusters are predicted to have a limited amount of overlap, and this limited overlap is, moreover, predicted to be resistant to the coarticulatory changes normally triggered under rate pressure. The assumption is that this coarticulatory stability is under speaker control, reflecting speakers&#8217; knowledge of the perceptual cues to the identity of the segments involved. Nevertheless, Pouplier and colleagues did not find support for the covariation between perceptual recoverability and degree of overlap. Contrary to the predictions, speech rate manipulation affected degree of overlap regardless of the clusters&#8217; optimal or suboptimal phonotactic status: At fast speech rates, overlap increased across the board. They found, instead, a lexical frequency effect, whereby degree of overlap increased significantly more at fast rates for higher-frequency clusters, but no support for the auditory cue robustness hypothesis.</p>
<p>Differences between initial and medial lags, for example, have also been explained by prosodic factors, as temporal patterns are known to be influenced by word or prosodic boundaries (<xref ref-type="bibr" rid="B20">Edwards et al., 1991</xref>; <xref ref-type="bibr" rid="B65">Sotiropoulou &amp; Gafos, 2022</xref>). The boundary as a prosodic event affects the constriction formation and release of consonantal gestures, as well as their acoustic duration, and distance from the boundary is known to modulate the strength of such effects. In AP, these durational patterns are conceptualized as effects of a prosodic time variable, which slows down articulatory events close to prosodic boundaries (<xref ref-type="bibr" rid="B9">Byrd &amp; Saltzman, 2003</xref>; <xref ref-type="bibr" rid="B42">Krivokapi&#263;, 2007</xref>; <xref ref-type="bibr" rid="B13">Cho et al., 2014</xref>; <xref ref-type="bibr" rid="B37">Katsika, 2016</xref>).</p>
<p>Evidence against the recoverability hypothesis comes from Korean and Japanese, where stop-stop sequences show reversed place order effects. In Korean (<xref ref-type="bibr" rid="B63">Son, 2008</xref>) and Japanese (Yanagawa, 2003), front-to-back /pt/ and /pk/ are found to be less overlapped than /tp/ and /kp/, contrary to expectations based on recoverability. Other studies have revealed place order asymmetries in contexts where recoverability does not predict them, further weakening the perceptual recoverability hypothesis. For example, Chitoran and Goldstein (<xref ref-type="bibr" rid="B11">2006</xref>) found a place order effect in Georgian not only in stop-stop but also in stop-liquid sequences, where it would not necessarily be expected, since in this context C1 is not perceptually vulnerable. Longer lag was found in the back-to-front /kl/ and /kr/ sequences than in /pl/ and /pr/, even though, in both cases, the identity of C1 is not as easily masked at shorter lags as it is in stop-stop sequences. When followed by a liquid, a C1 stop preserves some acoustic information in the transition into C2. A similar, perceptually unmotivated place order effect was found in French stop-/l/ sequences by K&#252;hnert et al. (<xref ref-type="bibr" rid="B38">2006</xref>). This longer lag in stop-liquid sequences has instead been attributed to articulatory characteristics of the liquid, such as stiffness (Du &amp; Gafos, 2023). Assuming that speakers control the stiffness parameter, as proposed in AP, the findings on stop-liquid sequences point to articulatory constraints, independent of perceptual considerations, that can shape the timing patterns of adjacent gestures.</p>
<p>Similar place order effects unmotivated by perceptual recoverability have also been observed by Yip (<xref ref-type="bibr" rid="B75">2013</xref>) in Greek, a language whose CC inventory approaches that of Georgian. Greek C1C2 sequences show a place order effect at least for some speakers, but in addition to stop-stop sequences, the effect is observed even when one of the consonants is a fricative or when C2 is a liquid. Under recoverability predictions, this result is unexpected, since a fricative in C1 position is not as vulnerable at high overlap as a stop: Its frication noise carries sufficient acoustic information and is not easily masked by increased overlap with C2. The Greek and Georgian production patterns may thus have alternative motivations, and they fail to provide conclusive evidence for perceptual recoverability, making it essential to verify the hypothesis directly from the perception side.</p>
</sec>
<sec>
<title>3. Evidence from perception</title>
<p>Relatively few studies so far have tested the perceptual effect of lag duration and/or of its direct acoustic consequences (i.e., the presence vs. absence of an acoustic release burst or vocalic transition) on the perceptual recovery of C1. We review these perceptual studies in this section.</p>
<p>Surprenant and Goldstein (<xref ref-type="bibr" rid="B67">1998</xref>) studied the effect of gestural overlap on the perception of the American-English stop sequences /t#p/ and /p#t/ across a word boundary, using stimuli extracted from x-ray microbeam data. The study addresses the place order effect. In one experiment, listeners were asked to perform a consonant monitoring task for stimuli containing stop-stop sequences (<italic>tot#puddles, top#tuddles</italic>) and controls containing C1 single stops (<italic>tot#huddles, top#huddles</italic>). Results showed that: (i) C1 in the stop-stop sequences was detected significantly less often than the C1 single stops, (ii) a C2 bilabial gesture more often obscured a preceding C1 alveolar gesture than the reverse order, and (iii) the detection rate correlated with the degree of overlap between lip and tongue tip gestures. In a second experiment, the acoustic burst&#8212;acoustic information crucial to retrieve lip and tongue tip gestures at the articulatory release&#8212;was removed from the stimuli. The absence of the acoustic burst decreased the detectability of the stops generally, but the asymmetry between the detectability of C1 in /t#p/ vs. /p#t/ did not entirely disappear. This specific result is significant, because it suggests that the timing of the two gestures alone may have perceptual consequences independently of the acoustic consequences of that timing.</p>
<p>Their study, using natural speech data, confirmed the results of an earlier perception study by Byrd (<xref ref-type="bibr" rid="B7">1992</xref>), which used synthetic stimuli obtained with the Haskins articulatory synthesizer (<xref ref-type="bibr" rid="B60">Rubin et al., 1981</xref>). Byrd varied the amount of overlap in <italic>bab#dan</italic> and <italic>bad#ban</italic>. Results of a forced choice identification test showed that as overlap increased, C1 identification was significantly reduced, and C1 was perceived as assimilated to C2. Also, consistent with the place order effect, the alveolar C1 in back-to-front /d#b/ was affected at a much smaller degree of overlap than the labial C1 in front-to-back /b#d/.</p>
<p>Coronal-labial and labial-coronal sequences were also studied by Chen (<xref ref-type="bibr" rid="B10">2003</xref>) using computational modelling to test the effects of gestural overlap on the recoverability of C1. The stimuli were generated with GEST (<xref ref-type="bibr" rid="B5">Browman &amp; Goldstein, 1990</xref>; <xref ref-type="bibr" rid="B21">Gafos, 2002</xref>), with acoustics derived with the Haskins articulatory synthesizer. Results based on the listener model found that increasing overlap in labial-coronal sequences had little effect on C1 recovery, but in the opposite coronal-labial order, recoverability rates for C1 decreased.</p>
<p>These studies are limited to combinations of labial and coronal stops across a word boundary, a context where intergestural timing is known to be highly variable. Directly relevant to the current study are studies on segmental recoverability as a function of lag duration within a syllable, and of the presence vs. absence of a stop release burst. Both were investigated by Wright (<xref ref-type="bibr" rid="B72">1996</xref>) in Tsou word-initial clusters. One experiment manipulated the acoustic release burst of C1: The burst was removed from a word-initial cluster (/tmihi/ &#8216;to hang&#8217;) and inserted before a singleton C (/mihi/ &#8216;to desire a denied thing&#8217;). When the release burst was added to a singleton, Tsou native listeners reported hearing clusters. When the release burst was removed from a cluster, they reported hearing the singleton C. A second experiment varied the acoustic inter-burst interval (IBI), representing the acoustic lag between C1 and C2 in the initial stop-stop cluster /pt/. Listeners&#8217; perception varied with the acoustic lag duration. When the C1 and C2 bursts were separated by 0&#8211;25 ms of silence, listeners reported hearing a single onset /t/, while between 50 and 150 ms, they reported hearing the full cluster. Beyond 150 ms, cluster responses declined again, and single /t/ responses rose instead.</p>
<p>The perceptual role of acoustic details in stop-stop sequences was later reliably established by Wilson et al. (<xref ref-type="bibr" rid="B71">2014</xref>) in a cross-language perception study. While the study did not consider lag/overlap directly, it showed that masking acoustic details of C1 in a stop-stop sequence can affect its recoverability, at least for non-native listeners. The authors manipulated the amplitude and duration of C1 release bursts in initial stop-stop sequences. The stimuli were non-words containing stop-stop sequences licit in Russian and illicit in English, produced by a Russian-English bilingual. Native English speakers were then asked to listen to and reproduce the sequences in a shadowing task. The results showed that greater burst amplitude made C1 more likely to be correctly produced, protecting it from deletion or other modifications. When C1 burst amplitude decreased, C1 underwent significantly more deletion and change.</p>
<p>The presence vs. absence of stop releases was also tested in a cross-linguistic perceptual study by Kochetov and So (<xref ref-type="bibr" rid="B41">2007</xref>), who conducted two experiments using Russian voiceless clusters with released and unreleased stops. Results from native listeners of Russian, Canadian English, Korean, and Taiwanese Mandarin showed that the presence or absence of stop releases strongly affected the perceptual accuracy of place of articulation.</p>
<p>Collectively, these previous findings suggest that a stop release is crucial for perceiving the stop. When the release is masked or removed, the consonant is less likely to be accurately perceived. The temporal pattern likely to mask the release of C1 is therefore predicted to be avoided in production (as supported by many production studies), and listeners are expected to be less likely to perceive the intended consonant when the lag is short (or overlap is high) and the release is consequently less audible or inaudible.</p>
<p>In addition to differences in lag and release bursts, previous studies have also reported the variable production of vocalic releases of C1 in a stop-stop sequence in several languages (for a recent review, see <xref ref-type="bibr" rid="B26">Hall, 2024</xref>). These vocalic releases, or vocoids, are typically seen as the acoustic consequence of C-C coordination with longer lag (reduced overlap), and have been reported for Moroccan Arabic (<xref ref-type="bibr" rid="B21">Gafos, 2002</xref>), Tashlhiyt (<xref ref-type="bibr" rid="B57">Ridouane &amp; Fougeron, 2011</xref>; <xref ref-type="bibr" rid="B56">Ridouane &amp; Cooper-Leavitt, 2019</xref>), and Georgian (<xref ref-type="bibr" rid="B12">Chitoran et al., 2002</xref>; Goldstein et al., 2007; <xref ref-type="bibr" rid="B16">Crouch et al., 2023a</xref>). By providing formant transitions, these transitional vocoids may help convey information about the identity of C1.</p>
<p>Evidence from a perception study on Tashlhiyt word-initial consonant sequences confirms this prediction. Zellou et al. (<xref ref-type="bibr" rid="B76">2024</xref>) tested the perception of vowelless Tashlhiyt words across clear and casual speaking styles by native and non-native na&#239;ve listeners. A part of this study examined the role of the transitional vocoids, establishing that the longer duration of these vocoids in consonant sequences improves native speakers&#8217; discrimination of CCC words. When discrimination was more challenging, such as between pairs with sequences of matching sonority (falling vs. falling, or plateau vs. plateau), the presence of a transitional vocoid improved discrimination for both native and non-native listeners. A longer duration of the vocoid was also beneficial. CCC words were easier to discriminate when they contained longer transitional vocoids, and this effect was particularly strong for word pairs with sonority plateaus.</p>
<p>The Tashlhiyt results may inform predictions about the perception of Georgian clusters, although the two languages are not directly comparable. While in Georgian all complex word onsets constitute one syllable onset, in Tashlhiyt they can be heterosyllabic, since any consonant can be syllabic (<xref ref-type="bibr" rid="B17">Dell &amp; Elmedlaoui, 2002</xref>; <xref ref-type="bibr" rid="B55">Ridouane, 2016</xref>; <xref ref-type="bibr" rid="B58">Ridouane et al., 2014</xref>). While a comparison of Georgian and Tashlhiyt lies beyond the scope of this study, it is important to note that the two languages differ in the distribution of the vocoids. In both languages, the distribution of vocoids is sensitive to the sonority sequencing, but they show different distributional patterns. The Georgian speakers&#8217; production data (<xref ref-type="bibr" rid="B16">Crouch et al., 2023a</xref>; <xref ref-type="bibr" rid="B15">Crouch et al., 2023b</xref>) revealed that vocoids appear predominantly in sonority rises (56% of the data), less often in sonority plateaus (25% of the data), and only rarely in sonority reversals (9% of the data). The situation seems to be the opposite in Tashlhiyt (<xref ref-type="bibr" rid="B76">Zellou et al., 2024</xref>). These differences suggest that the transitional vocoids may not necessarily result from similar gestural coordination patterns in the two languages. In Georgian, vocoids occur between the two consonants, and they can be attributed to an overall reduced C-C overlap pattern within a complex onset (supported by results of a seven-language comparison in <xref ref-type="bibr" rid="B52">Pouplier et al., 2022</xref>). In Tashlhiyt, however, the vocoids may appear in various positions, either before the consonant sequence, or breaking it. Since Tashlhiyt consonant sequences, unlike those of Georgian, are heterosyllabic, they may involve variable modes of coordination. This difference in coordination may in turn influence how consonant sequences are perceived by listeners.</p>
<p>A previous perception study conducted with Georgian listeners suggests that the listeners&#8217; perception is indeed influenced by temporal organization. Kwon and Chitoran (<xref ref-type="bibr" rid="B44">2024</xref>) tested Georgian listeners on CCa-CVCa discrimination in French stimuli. The results revealed perceptual confusion in particular for the CCa-C&#248;C&#225; contrast, where the French vowel /&#248;/ is phonetically similar to the C-C transition in Georgian, both in terms of temporal organization and tongue shape (formant structure). This suggests that Georgian listeners use their native knowledge of the temporal implementation of word-onset CC sequences in responding to the task. The present study will provide new information that can be connected to these earlier results, aiming to understand the relationship between the vocalic transitions and timing lag. Ultimately, it will help verify what listeners use&#8212;vocalic transitions, or lag alone&#8212;and will further help us understand what speakers plan.</p>
<p>With these considerations in mind, we conducted a perception study using Georgian as a test language. The relatively unconstrained phonotactics of the language allow us to test perception of naturally produced C1C2 stop combinations of varying degrees of overlap, utilizing the same sequences produced in word onset and word medial positions. We first establish the perceptibility of C1 in such sequences. Then, following Pouplier et al.&#8217;s (<xref ref-type="bibr" rid="B51">2017</xref>) cautionary remark against relying exclusively on articulatory lag for perceptibility hypotheses, we test whether, and to what extent, perception accuracy can be related to measures of temporal overlap (lag) between the two consonantal gestures, as measured from an articulatory signal, and to its acoustic consequences. To the best of our knowledge, our study is the first to test the relevance of both articulatory and acoustic patterns for recovering phonological information about C1 in stop-stop sequences.</p>
</sec>
<sec>
<title>4. The current perception experiment</title>
<p>Chitoran et al. (<xref ref-type="bibr" rid="B12">2002</xref>) proposed that the variation present in Georgian production patterns can be interpreted as speaker-controlled strategies for increasing C1 perceptibility in a stop-stop context, where C1 gestures are harder to recover. The reasoning behind the interpretation is based on the following key observations:</p>
<list list-type="bullet">
<list-item><p>The two Georgian speakers whose kinematic data were analyzed in their study consistently produced a longer lag in contexts where C1 could be easily masked. It was proposed that the longer lag would prevent C1 release from being masked by C2 closure and would allow for a clearer, audible C1 release in a stop sequence;</p></list-item>
<list-item><p>The speakers occasionally produced a C1 vocalic release, resulting in a transitional vocoid, as illustrated in <xref ref-type="fig" rid="F1">Figure 1</xref>. Vocoids were observed more often in voiced stop-stop sequences and in sequences with a back-to-front place order, where C1 release is less likely to be audible because of the following anterior constriction. It was thus proposed that the vocoid would provide clearer C1 formant transitions, which could contribute to the accurate recovery of C1.</p></list-item>
</list>
<fig id="F1">
<caption>
<p><bold>Figure 1:</bold> Example of a vocalic release (transitional vocoid) occurring between C1 and C2, in the word /dgeba/ &#8216;s/he stands up&#8217;.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24395-g1.png"/>
</fig>
<sec>
<title>4.1 Hypotheses and predictions</title>
<p>Based on the interpretation of the observed Georgian production patterns, we test the following hypotheses:</p>
<disp-quote>
<p><bold>H1:</bold> Longer timing lag (or reduced overlap) between C1 and C2 facilitates recovery of C1 gestures</p>
<p><bold>H2:</bold> A C1 vocalic release facilitates recovery of C1 gestures</p>
</disp-quote>
<p>In the present study, we test the hypotheses in two perception experiments with native listeners of Georgian. The perceptual responses are then evaluated against detailed acoustic and articulatory (EMA) analyses of the Georgian stimuli.</p>
<p>Experiment 1 is a forced choice identification task, in which listeners heard CCV portions excised from words containing stop-stop sequences. They were asked to decide whether the short sound they heard began with a CV or a CC sequence. Listeners are predicted to identify stimuli as beginning with CC more accurately when the timing lag is longer, since perceptual recoverability is assumed to be enhanced. The best CC identification rates are therefore expected for target stimuli based on initial back-to-front sequences (e.g., <italic>#gdV</italic>), which are the least overlapped (have the longest lag), and the poorest rates are expected for stimuli containing word-medial front-to-back sequences (e.g., <italic>dgV</italic>), which are the most overlapped (have the shortest lag).</p>
<p>Experiment 2 was conducted a week later, with the same participants. It consisted of a transcription task. Each listener was exposed only to those stimuli for which they had previously given a CV answer, and was asked to transcribe the short sequence by hand. This transcription task determined whether listeners responded CV in Experiment 1 because they failed to detect C1 in the stop-stop sequence, or because they detected a vocalic portion between C1 and C2 and treated it as a vowel. Experiment 2 thus complements Experiment 1 by allowing for reliable answers to each of the two questions of the study:</p>
<list list-type="simple">
<list-item><p>(a) Do listeners ever miss C1?</p></list-item>
<list-item><p>(b) Do listeners detect a C1 vocalic release when present, and does it help them to accurately identify C1?</p></list-item>
</list>
</sec>
</sec>
<sec>
<title>5. Material and methods</title>
<sec>
<title>5.1 Acoustic and articulatory data collection</title>
<p>The stimuli were prepared based on the acoustic and articulatory Georgian data previously collected at Haskins Laboratories, in New Haven, CT, and analyzed in Chitoran et al. (<xref ref-type="bibr" rid="B12">2002</xref>). One speaker was selected to provide the stimuli, because he produced vocalic transitions between C1 and C2 more frequently in his speech. The speaker, a male in his mid-20s, had been living in the United States for approximately two years at the time of the recording and reported using Georgian on a regular daily basis. He reported no speech or hearing impairment. Prior to the start of the experiment, the speaker was informed of the purpose of the study; he read and signed consent forms (all of this was done in English).</p>
<p>Simultaneous acoustic and kinematic data were collected for stimuli including stop-stop sequences and filler items (<xref ref-type="table" rid="T1">Table 1</xref>). Target stimuli were recorded along with distractors, which included CC combinations other than stop-stop. The stop-stop sequences in this study are the same tokens as those whose production was analyzed in Chitoran et al. (<xref ref-type="bibr" rid="B12">2002</xref>). Each target word was produced in the carrier phrase [sit&#8217;q&#8217;wa ____ gamoit&#688;k&#688;mis or&#676;er] (&#8216;The word ____ is pronounced twice&#8217;). A computer screen presented the sentences one at a time, in Georgian script. The speaker was invited to read each sentence aloud at a normal pace. If he paused or had a false start, he was asked to re-read the entire sentence. Fourteen repetitions of each sentence containing a stop-stop target word were presented in randomized order and recorded.</p>
<table-wrap id="T1">
<caption>
<p><bold>Table 1:</bold> Words used to create perception stimuli (&#8216;-&#8217; indicates a morpheme boundary).</p>
</caption>
<table>
<tbody>
<tr>
<td align="left" valign="top"></td>
<td align="left" valign="top" colspan="3"><bold>Front-to-back</bold></td>
<td align="left" valign="top" colspan="3"><bold>Back-to-front</bold></td>
</tr>
<tr>
<td align="left" valign="top"></td>
<td align="left" valign="top"><bold>stimuli</bold></td>
<td align="left" valign="top" colspan="2"><bold>words used</bold></td>
<td align="left" valign="top"><bold>stimuli</bold></td>
<td align="left" valign="top" colspan="2"><bold>words used</bold></td>
</tr>
<tr>
<td align="left" valign="top" rowspan="3"><bold>Word-initial</bold></td>
<td align="left" valign="top">bge</td>
<td align="left" valign="top">bgera</td>
<td align="left" valign="top">&#8216;sound&#8217;</td>
<td align="left" valign="top">gbe</td>
<td align="left" valign="top">g-ber-av-s</td>
<td align="left" valign="top">&#8216;is inflating you&#8217;</td>
</tr>
<tr>
<td align="left" valign="top">p&#688;t&#688;i</td>
<td align="left" valign="top">p&#688;t&#688;ila</td>
<td align="left" valign="top">&#8216;hair lock&#8217;</td>
<td align="left" valign="top">t&#688;be</td>
<td align="left" valign="top">t&#688;b-eb-a</td>
<td align="left" valign="top">&#8216;it is warming up&#8217;</td>
</tr>
<tr>
<td align="left" valign="top">dge</td>
<td align="left" valign="top">dg-eb-a</td>
<td align="left" valign="top">&#8216;s/he stands up&#8217;</td>
<td align="left" valign="top">gde</td>
<td align="left" valign="top">gd-eb-a</td>
<td align="left" valign="top">&#8216;to be thrown&#8217;</td>
</tr>
<tr>
<td align="left" valign="top" rowspan="3"><bold>Word-medial</bold></td>
<td align="left" valign="top">bga</td>
<td align="left" valign="top">abga</td>
<td align="left" valign="top">&#8216;saddle&#8217;</td>
<td align="left" valign="top">gbe</td>
<td align="left" valign="top">da-gbera</td>
<td align="left" valign="top">&#8216;to say the sounds&#8217;</td>
</tr>
<tr>
<td align="left" valign="top">p&#688;t&#688;a</td>
<td align="left" valign="top">ap&#688;t&#688;ar-i</td>
<td align="left" valign="top">&#8216;hyena&#8217;</td>
<td align="left" valign="top">t&#688;ba</td>
<td align="left" valign="top">ga-t&#688;b-a</td>
<td align="left" valign="top">&#8216;it has become warm&#8217;</td>
</tr>
<tr>
<td align="left" valign="top">dge</td>
<td align="left" valign="top">a-dg-eb-a</td>
<td align="left" valign="top">&#8216;s/he will stand up&#8217;</td>
<td align="left" valign="top">gde</td>
<td align="left" valign="top">a-gd-eb-a</td>
<td align="left" valign="top">&#8216;throw in the air&#8217;</td>
</tr>
<tr>
<td align="left" valign="top" rowspan="5"><bold>Fillers</bold></td>
<td align="left" valign="top">t&#8217;k&#8217;e</td>
<td align="left" valign="top">t&#8217;k&#8217;ena</td>
<td align="left" valign="top">&#8216;to hurt&#8217;</td>
<td align="left" valign="top">k&#8217;bi</td>
<td align="left" valign="top">k&#8217;bili</td>
<td align="left" valign="top">&#8216;tooth&#8217;</td>
</tr>
<tr>
<td align="left" valign="top">t&#8217;k&#8217;a</td>
<td align="left" valign="top">bat&#8217;k&#8217;an-i</td>
<td align="left" valign="top">&#8216;lamb&#8217;</td>
<td align="left" valign="top">k&#688;t&#8217;i</td>
<td align="left" valign="top">k&#688;t&#8217;it&#688;or-i</td>
<td align="left" valign="top">&#8216;founder&#8217;</td>
</tr>
<tr>
<td align="left" valign="top">bra</td>
<td align="left" valign="top">braz-i</td>
<td align="left" valign="top">&#8216;anger&#8217;</td>
<td align="left" valign="top">k&#8217;re</td>
<td align="left" valign="top">k&#8217;reba</td>
<td align="left" valign="top">&#8216;meeting&#8217;</td>
</tr>
<tr>
<td align="left" valign="top">t&#688;k&#688;e</td>
<td align="left" valign="top">albat&#688;#k&#688;er-i</td>
<td align="left" valign="top">&#8216;probably barley&#8217;</td>
<td align="left" valign="top">k&#688;t&#8217;e</td>
<td align="left" valign="top">p&#688;ak&#688;t&#8217;-eb-i</td>
<td align="left" valign="top">&#8216;facts&#8217;</td>
</tr>
<tr>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
<td align="left" valign="top">t&#688;ba</td>
<td align="left" valign="top">albat&#688;#ba&#611;-&#643;i</td>
<td align="left" valign="top">&#8216;probably in the garden&#8217;</td>
</tr>
<tr>
<td align="left" valign="top"><bold>CVCV controls</bold></td>
<td align="left" valign="top" colspan="3">bile, t&#8217;ebi, deba, geba</td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Intervocalic multiple consonant sequences in Georgian are reported to be syllabified as a simplex coda followed by a complex onset (<xref ref-type="bibr" rid="B31">Harris, 2002</xref>). However, when the sequence consists of only two consonants, the syllabification intuitions of native speakers vary, oscillating between a complex onset (V.CCV) and a coda+onset (VC.CV).</p>
<p>The EMA magnetometer system (<xref ref-type="bibr" rid="B50">Perkell et al., 1992</xref>), in use at Haskins at the time, was used for data collection. Two receiver coils were attached to midsagittal points on the tongue: one approximately 1 cm from the tongue tip (TT), and one on the tongue dorsum (TD), as far back as possible, to capture velar constrictions. One coil each was placed on the upper and lower lip on the midsagittal plane. Reference coils were placed on the upper and lower teeth and on the nose bridge, and were used for head movement correction.</p>
</sec>
<sec>
<title>5.2 Stimuli preparation</title>
<p>From the recorded C1C2 sequences, six different stop-stop sequences &#8211; /bg, dg, pt, gb, gd, tb/ &#8211; varying in their place order (three front-to-back, F-B; three back-to-front, B-F) were selected as the target stimuli. C1C2 sequences other than these six were used as fillers, in order to vary the types of stimuli to which the listeners were exposed. Multiple productions (three to five) were included for each sequence in order to examine the effect of timing variation naturally present in Georgian CC production. For this speaker, one /gat&#688;ba/ and two /t&#688;beba/ tokens were lost due to mispronunciation.</p>
<p>Some of the selected productions included C1 vocalic releases, which we indicate henceforth with a superscript <sup>V</sup>: C1<sup>V</sup>C2. The selected stimuli were excised C1C2V portions (<xref ref-type="fig" rid="F2">Figure 2</xref>), segmented on the acoustic signal from the midpoint of C1 closure to the midpoint of V, with boundaries adjusted to the nearest zero-crossing; cutting at the vowel midpoint avoided coarticulation with the following consonant.</p>
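<p>As an illustration, the zero-crossing adjustment can be sketched as follows (a minimal, hypothetical reconstruction in Python assuming a NumPy waveform array; this is not the segmentation script actually used for the stimuli):</p>
<preformat>import numpy as np

def snap_to_zero_crossing(signal, boundary):
    """Move a sample index to the nearest zero-crossing of a mono waveform.

    Illustrative sketch only: a zero-crossing is taken to be a sample at
    which the waveform changes sign.
    """
    sign = np.signbit(signal).astype(np.int8)
    # indices i where the waveform changes sign between i and i + 1
    crossings = np.nonzero(np.diff(sign))[0]
    if crossings.size == 0:
        return boundary  # no zero-crossing found; keep the boundary as is
    return int(crossings[np.argmin(np.abs(crossings - boundary))])</preformat>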
<fig id="F2">
<caption>
<p><bold>Figure 2:</bold> Example of segmentation for the stimulus /g<sup>V</sup>de/ extracted from the word /gdeba/.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24395-g2.png"/>
</fig>
<p>Four C1VC2V sequences were included as controls: [deba], [geba], [bile], [t&#8217;ebi]. These were the last two syllables excised from the trisyllabic words [ag<bold>deba</bold>], [ad<bold>geba</bold>], [sat&#8217;k&#8217;<bold>bile</bold>], [p&#688;ak&#688;<bold>t&#8217;ebi</bold>], from the midpoint of the C1 closure to the midpoint of the final vowel. In these four Georgian words, the initial syllable is stressed. By removing it, we made sure there was no prominence on the first vowel in the remaining CVCV sequences. All target stimuli and controls were attested word onsets in Georgian. They were segmented and normalized for intensity using Praat (<xref ref-type="bibr" rid="B4">Boersma &amp; Weenink, 2021</xref>).</p>
</sec>
<sec>
<title>5.3 Participants and procedure</title>
<p>The perception experiment was conducted in Tbilisi, Georgia, in a quiet office in the Linguistics Department at Tbilisi State University. Twenty-nine Georgian native listeners (17 female) participated in two experiments. A Georgian student research assistant was hired to recruit participants; these were students at Tbilisi State University, recruited via fliers and the assistant&#8217;s personal contacts. The assistant explained to the participants, in Georgian, the information letter, the consent forms, and the experiment instructions.</p>
<p>In Experiment 1, the forced choice identification task, a total of 222 stimuli were presented to each participant, in an order randomized per participant. These included 114 target stimuli, 70 fillers, and 38 CVCV controls. After hearing each stimulus, the listeners were asked to identify whether it began with a sequence of two consonants, &#8216;cc&#8217;, or with a consonant-vowel sequence, &#8216;cv&#8217;. Before the experiment, each participant confirmed that they were familiar with the terms &#8216;consonant&#8217; and &#8216;vowel&#8217;, and were able to give accurate examples of each.</p>
<p>Experiment 2 was conducted a week later with the same participants, to disambiguate the &#8216;cv&#8217; responses in Experiment 1. In Experiment 1, listeners could have responded &#8216;cv&#8217; for a stimulus such as /bge/, for example, if they heard a vowel between the stops [b<sup>v</sup>ge], but also if they heard only one of the stops [_ge] or [b_e]. Experiment 2 thus consisted of a transcription test, in which listeners heard subsets of the stimuli from Experiment 1. Each participant heard all tokens (three to five) of any stimulus item for which they had given at least one &#8216;cv&#8217; response in Experiment 1, and was asked to transcribe what they heard by hand in Georgian orthography. This ensured that multiple productions of the same word were included when applicable. Since Georgian orthography is phonemic, transcribing the responses into IPA is relatively straightforward. It was done by the first author, who is familiar with the Georgian alphabet. The native Georgian research assistant also verified the participants&#8217; transcriptions.</p>
<p>Both experiments were conducted on a Windows laptop computer, using a program written and kindly provided by Ren&#233; Carr&#233; and Emmanuel Ferragne. Both experiments were self-paced, and each one lasted between 20 and 25 minutes. For Experiment 1, the forced choice identification task, participants were seated in front of the computer, and the stimuli were played back via headphones, one at a time. They were asked to listen to each sound and respond by pressing either the F or the J key on the keyboard. The F key was labelled as &#8216;cv&#8217;, and the J key as &#8216;cc&#8217;, using Georgian letters: <bold>&#4311;&#4334;</bold> [t&#688;x] for &#8216;cv&#8217;, <bold>&#4311;&#4304;&#4316;&#4334;&#4315;&#4317;&#4309;&#4304;&#4316;&#4312; &#8211; &#4334;&#4315;&#4317;&#4309;&#4304;&#4316;&#4312;</bold> [t&#688;anxmovani&#8211;xmovani] &#8216;consonant&#8211;vowel&#8217;, and <bold>&#4311;&#4311;</bold> [t&#688;t&#688;] for &#8216;cc&#8217;, <bold>&#4311;&#4304;&#4316;&#4334;&#4315;&#4317;&#4309;&#4304;&#4316;&#4312; &#8211; &#4311;&#4304;&#4316;&#4334;&#4315;&#4317;&#4309;&#4304;&#4316;&#4312;</bold> [t&#688;anxmovani&#8211;t&#688;anxmovani] &#8216;consonant&#8211;consonant&#8217;. After each response, the participants clicked on the screen to move on to the next stimulus.</p>
<p>Experiment 1 was preceded by two practice blocks. In the first block, participants were asked simply to listen to 10 stimuli, to familiarize themselves with the short sounds they would hear. In the second practice block, they listened and responded to 10 different stimuli. This block served only to familiarize participants with the task, so no feedback was provided for the practice trials. The 20 practice stimuli were not included in the actual experiments.</p>
<p>We would like to acknowledge that using perception stimuli recorded with EMA might raise concerns about possible speech distortions and reduced intelligibility due to the EMA sensors. To the best of our knowledge, the effects of EMA sensors on the perception of speech are largely unexplored. Several studies have tested the effect of sensors on production, comparing typical and disordered speech such as aphasia, apraxia, and Parkinson&#8217;s disease (e.g., <xref ref-type="bibr" rid="B38">Katz et al., 2006</xref>; <xref ref-type="bibr" rid="B69">Tienkamp et al., 2024</xref>). These studies reported some interference in the production of sibilant fricatives and in the acoustic-articulatory vowel space. While we cannot rule out the possibility that the presence of EMA sensors may have interfered to some extent with the Georgian speaker&#8217;s production, our stimuli come from a single speaker and were carefully selected according to the articulatory measure of overlap, on which our hypotheses crucially depend. Since testing an articulatory hypothesis is the main goal of our study, we prioritized this latter point. We were ultimately reassured by the relatively high accuracy scores of the transcriptions obtained in Experiment 2: C1 was correctly transcribed 66% of the time and C2, 75% of the time. These scores do not reflect serious intelligibility issues. Moreover, the higher accuracy of C2 transcriptions is what we expected to find, given that a vowel always followed C2. This suggests that sensor interference, if any, was limited.</p>
</sec>
<sec>
<title>5.4 Acoustic and articulatory analysis of the stimuli</title>
<p>To understand the relation between the participants&#8217; responses and the phonetic properties of the stimuli, we analyzed the acoustic and articulatory properties related to the timing lag (or overlap) between C1 and C2, as well as those related to the vocalic release of C1. The following three acoustic parameters were measured:</p>
<list list-type="order">
<list-item><p><italic>Acoustic Lag</italic>: the duration of the inter-burst interval, measured from C1 release burst (<xref ref-type="fig" rid="F3">Figure 3a</xref>) to C2 release burst (<xref ref-type="fig" rid="F3">Figure 3d</xref>);</p></list-item>
<list-item><p><italic>Presence of Vocalic Release</italic>: the occurrence of vocalic releases (present vs. absent);</p></list-item>
<list-item><p><italic>Vocalic Release Duration</italic>: the duration of vocalic releases, when present, measured from the vocalic release onset (<xref ref-type="fig" rid="F3">Figure 3b</xref>) to its offset (<xref ref-type="fig" rid="F3">Figure 3c</xref>); see the computational sketch following <xref ref-type="fig" rid="F3">Figure 3</xref>.</p></list-item>
</list>
<fig id="F3">
<caption>
<p><bold>Figure 3:</bold> Acoustic landmarks used in measurements: (a) C1 release burst; (b) vocalic release onset (if present); (c) vocalic release offset (if present); (d) C2 release burst.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24395-g3.png"/>
</fig>
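<p>For concreteness, the sketch below restates the three acoustic parameters as computations over the landmark times in <xref ref-type="fig" rid="F3">Figure 3</xref> (a minimal Python illustration; the data structure and names are ours, not those of the original analysis scripts):</p>
<preformat>from dataclasses import dataclass
from typing import Optional

@dataclass
class AcousticLandmarks:
    """Landmark times in seconds, following Figure 3 (field names are ours)."""
    c1_burst: float              # (a) C1 release burst
    voc_onset: Optional[float]   # (b) vocalic release onset, None if absent
    voc_offset: Optional[float]  # (c) vocalic release offset, None if absent
    c2_burst: float              # (d) C2 release burst

def acoustic_lag(lm):
    # inter-burst interval, from (a) to (d)
    return lm.c2_burst - lm.c1_burst

def vocalic_release_present(lm):
    return lm.voc_onset is not None and lm.voc_offset is not None

def vocalic_release_duration(lm):
    # duration from (b) to (c); undefined when no vocalic release occurred
    if not vocalic_release_present(lm):
        return None
    return lm.voc_offset - lm.voc_onset</preformat>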
<p>Three articulatory measures were examined, based on the articulatory landmarks measured on the EMA signal:</p>
<list list-type="order">
<list-item><p><italic>Release Lag</italic>: Temporal distance from C1 release (4c) to C2 release (4f);</p></list-item>
<list-item><p><italic>Onset Lag</italic>: Temporal distance from C1 gesture onset (4a) to C2 gesture onset (4d);</p></list-item>
<list-item><p><italic>Relative Overlap</italic>: 1 &#8211; [C2 gesture onset (4d) &#8211; C1 achievement (4b)] / [C1 release (4c) &#8211; C1 achievement (4b)] (<xref ref-type="bibr" rid="B22">Gafos et al., 2010</xref>; <xref ref-type="bibr" rid="B59">Roon et al., 2021</xref>); see the computational sketch following <xref ref-type="fig" rid="F4">Figure 4</xref></p></list-item>
</list>
<p>In EMA signals, movement trajectories of the receiver coils attached to the tongue tip, tongue dorsum, upper lip, and lower lip were evaluated. The articulatory constriction formation and release were identified using the Matlab analysis program, <italic>Mview</italic> (provided by Mark Tiede). The <italic>Mview</italic> algorithms allow the computation of articulatory landmarks based on the velocity profiles of the relevant sensors. The peak velocities of the constriction formation and release movements were calculated algorithmically. For each gesture, the following three points were identified and labelled, using a 20% threshold of the velocity peaks: the gesture onset, constriction (target) achievement, and constriction release. For labials, the Euclidean distance between upper and lower lip receiver coils was used to compute lip aperture, and thus measure labial constrictions. For coronals, the gestural landmarks were determined by evaluating the distance of the tongue tip receiver coil to the closest point on the palate. For dorsals, the vertical position of the tongue dorsum coil was used.</p>
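<p>As an illustration of this thresholding procedure, the following sketch locates the three landmarks on a single constriction-degree trace. It is a simplified Python stand-in for the <italic>Mview</italic> algorithms, assuming one closing and one opening movement per gesture; it is not the <italic>Mview</italic> code itself:</p>
<preformat>import numpy as np

def gesture_landmarks(aperture, fs, thresh=0.2):
    """Locate gesture onset, target achievement, and release on one
    constriction-degree trace (smaller values = tighter constriction).

    Simplified stand-in for the Mview procedure: each landmark is placed
    where absolute velocity crosses 20% of the peak velocity of the
    closing or opening movement. Assumes the trace starts and ends in an
    open posture, with a single constriction in between.
    """
    vel = np.gradient(aperture) * fs                  # velocity, units/s
    speed = np.abs(vel)
    t_min = int(np.argmin(aperture))                  # maximum constriction
    pk_close = int(np.argmax(speed[:t_min]))          # peak closing velocity
    pk_open = t_min + int(np.argmax(speed[t_min:]))   # peak opening velocity

    # onset: last sub-threshold sample before the closing velocity peak
    below = np.nonzero(speed[:pk_close] &lt; thresh * speed[pk_close])[0]
    onset = int(below[-1]) if below.size else 0

    # achievement: first sub-threshold sample after the closing peak
    after = np.nonzero(speed[pk_close:t_min] &lt; thresh * speed[pk_close])[0]
    achievement = pk_close + int(after[0]) if after.size else t_min

    # release: last sub-threshold sample before the opening velocity peak
    pre = np.nonzero(speed[t_min:pk_open] &lt; thresh * speed[pk_open])[0]
    release = t_min + int(pre[-1]) if pre.size else t_min

    return onset, achievement, release</preformat>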
<p>The relative overlap measure defined above, like the overlap measure in Chitoran et al. (<xref ref-type="bibr" rid="B12">2002</xref>), is based on the proportion of the C1 plateau (from achievement to release) that is free from the influence of the C2 gesture (see <xref ref-type="fig" rid="F4">Figure 4</xref>). However, because relative overlap is calculated by subtracting the Chitoran et al. measure from one, it corresponds straightforwardly to the degree of overlap: Greater values indicate greater overlap, that is, greater influence of C2 movement on C1.</p>
<fig id="F4">
<caption>
<p><bold>Figure 4:</bold> Schematic representations of the articulatory landmarks used in measurements.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24395-g4.png"/>
</fig>
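<p>Given the landmark times in <xref ref-type="fig" rid="F4">Figure 4</xref>, the three articulatory measures again reduce to simple arithmetic, as in the sketch below (the per-token column names are hypothetical).</p>
<preformat>
# Hypothetical landmark times: c1_onset, c1_achieve, c1_release (Figure 4a-c)
# and c2_onset, c2_release (Figure 4d, 4f)
stim$release_lag &lt;- stim$c2_release - stim$c1_release
stim$onset_lag   &lt;- stim$c2_onset - stim$c1_onset
# Relative overlap: 1 minus the proportion of the C1 plateau
# (achievement to release) that precedes the C2 gesture onset
stim$rel_overlap &lt;- 1 - (stim$c2_onset - stim$c1_achieve) /
                        (stim$c1_release - stim$c1_achieve)
</preformat>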
<p>As expected, some measures are correlated. When all stimuli were considered (<xref ref-type="table" rid="T2">Table 2</xref>), all pairs of measures except acoustic lag and release lag showed significant correlations in the expected direction (i.e., positive correlations among the lag measures and negative correlations between the lag measures and the overlap measure). When only the stimuli with vocalic releases were considered (<xref ref-type="table" rid="T3">Table 3</xref>), the duration of the vocalic release did not correlate significantly with any of the lag or overlap measures.</p>
<table-wrap id="T2">
<caption>
<p><bold>Table 2:</bold> Correlation matrix for target stimuli measurements.</p>
</caption>
<table>
<tbody>
<tr>
<td align="left" valign="top"></td>
<td align="left" valign="top"><bold>Acoustic Lag</bold></td>
<td align="left" valign="top"><bold>Onset Lag</bold></td>
<td align="left" valign="top"><bold>Release Lag</bold></td>
<td align="left" valign="top"><bold>Relative Overlap</bold></td>
</tr>
<tr>
<td align="left" valign="top">Acoustic Lag</td>
<td align="left" valign="top">1.000</td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
</tr>
<tr>
<td align="left" valign="top">Onset Lag</td>
<td align="left" valign="top" style="background-color:#c7c8ca;">0.417**</td>
<td align="left" valign="top">1.000</td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
</tr>
<tr>
<td align="left" valign="top">Release Lag</td>
<td align="left" valign="top">0.075</td>
<td align="left" valign="top" style="background-color:#dcddde;">0.271*</td>
<td align="left" valign="top">1.000</td>
<td align="left" valign="top"></td>
</tr>
<tr>
<td align="left" valign="top">Relative Overlap</td>
<td align="left" valign="top" style="background-color:#dcddde;">&#8211;0.297*</td>
<td align="left" valign="top" style="background-color:#a7a9ac;">&#8211;0.781***</td>
<td align="left" valign="top" style="background-color:#a7a9ac;">&#8211;0.450***</td>
<td align="left" valign="top">1.000</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap id="T3">
<caption>
<p><bold>Table 3:</bold> Correlation matrix for the subset that included vocalic releases.</p>
</caption>
<table>
<tbody>
<tr>
<td align="left" valign="top"></td>
<td align="left" valign="top"><bold>Vocalic release</bold></td>
<td align="left" valign="top"><bold>Acoustic Lag</bold></td>
<td align="left" valign="top"><bold>Onset Lag</bold></td>
<td align="left" valign="top"><bold>Release Lag</bold></td>
<td align="left" valign="top"><bold>Relative Overlap</bold></td>
</tr>
<tr>
<td align="left" valign="top">Vocalic release</td>
<td align="left" valign="top">1.000</td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
</tr>
<tr>
<td align="left" valign="top">Acoustic Lag</td>
<td align="left" valign="top">0.189</td>
<td align="left" valign="top">1.000</td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
</tr>
<tr>
<td align="left" valign="top">Onset Lag</td>
<td align="left" valign="top">0.300(*)</td>
<td align="left" valign="top" style="background-color:#a7a9ac;">0.572***</td>
<td align="left" valign="top">1.000</td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
</tr>
<tr>
<td align="left" valign="top">Release Lag</td>
<td align="left" valign="top">&#8211;0.269</td>
<td align="left" valign="top">0.168</td>
<td align="left" valign="top">0.150</td>
<td align="left" valign="top">1.000</td>
<td align="left" valign="top"></td>
</tr>
<tr>
<td align="left" valign="top">Relative Overlap</td>
<td align="left" valign="top">0.007</td>
<td align="left" valign="top" style="background-color:#c7c8ca;">&#8211;0.482**</td>
<td align="left" valign="top" style="background-color:#a7a9ac;">&#8211;0.813***</td>
<td align="left" valign="top">&#8211;0.314(*)</td>
<td align="left" valign="top">1.000</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>We also tested whether the presence of a vocalic release was associated with gestural timing and overlap measures using point-biserial correlations. None of the correlations reached statistical significance: acoustic lag [r = .20, p = .13], onset lag [r = .17, p = .19], release lag [r = &#8211;.11, p = .41], and relative overlap [r = .18, p = .18]. These results indicate that the presence or absence of a vocalic release does not reliably co-vary with timing lag or gestural overlap.</p>
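<p>A point-biserial correlation is simply a Pearson correlation in which one variable is binary, so these tests can be reproduced with the standard cor.test() function in R, as in the sketch below (variable names are hypothetical; the commented values are those reported above).</p>
<preformat>
# vocrel_present coded as 0/1; a point-biserial r is a Pearson r
# computed against a binary variable
cor.test(as.numeric(stim$vocrel_present), stim$acoustic_lag)  # r = .20, p = .13
cor.test(as.numeric(stim$vocrel_present), stim$onset_lag)     # r = .17, p = .19
cor.test(as.numeric(stim$vocrel_present), stim$release_lag)   # r = -.11, p = .41
cor.test(as.numeric(stim$vocrel_present), stim$rel_overlap)   # r = .18, p = .18
</preformat>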
</sec>
</sec>
<sec>
<title>6. Results</title>
<p>Results of Experiment 1 showed that Georgian listeners successfully identified the presence of a two-consonant sequence in 71% of the C1C2V stimuli. This average is based on 28 of the 29 participants; one participant was excluded from the analysis because their performance (40% correct) was more than two standard deviations below the group mean. In the remaining 29% of trials, in which participants responded &#8216;cv&#8217;, they may have either heard an epenthetic vowel between C1 and C2 or failed to hear C1 or C2.</p>
<p>Experiment 2 aimed to test whether &#8216;cv&#8217; responses reflected misperception of C1 or C2, or the perception of an inserted vowel. C1 was correctly transcribed 66% of the time, including fully correct transcriptions (e.g., /p&#688;t&#688;a/ &#8594; &lt;p&#688;t&#688;a&gt;), vowel-insertion cases (e.g., /gde/ &#8594; &lt;gade&gt;), and cases where C1 was identified correctly but C2 was not (e.g., /t&#688;ba/ &#8594; &lt;t&#688;va&gt;). The remaining 34% included errors such as C1 deletion (e.g., /bga/ &#8594; &lt;ga&gt;), C1 laryngeal-category change (e.g., /gbe/ &#8594; &lt;k&#8217;be&gt; or &lt;k&#688;obe&gt;), C1 place change (e.g., /p&#688;t&#688;a/ &#8594; &lt;k&#688;t&#688;a&gt;), C1 manner change (e.g., /bga/ &#8594; &lt;vga&gt; or &lt;mga&gt;), metathesis (e.g., /t&#688;ba/ &#8594; &lt;bt&#688;a&gt;), or unrelated responses (e.g., /dge/ &#8594; &lt;rio&gt;). For comparison, C2 was correctly transcribed in 75% of trials, consistent with the expectation that C2 benefits from additional cues in the transition to the following vowel. Some C2 errors, such as /b/ transcribed as &lt;v&gt;, may reflect slight articulatory interference associated with the EMA sensors. However, such effects appear limited, as overall C2 accuracy remained relatively high. As the focus of this study is on C1, we do not further analyze C2 here.</p>
<p>We focus on the correct identification of C1, examining how each of the six measures contributes to identification accuracy and whether each effect interacts with Place Order (B-F vs. F-B) and C1 Voicing (voiceless vs. voiced). As listeners heard all stimuli beginning with the C1(V)C2 sequence, we did not consider initial vs. medial word position in these analyses.</p>
<sec>
<title>6.1 H1: Longer timing lag (or decreased overlap) facilitates the recovery of C1</title>
<p>To evaluate H1, we considered four measures: acoustic lag, onset lag, release lag, and relative overlap. Because they are not independent of one another (see <xref ref-type="table" rid="T2">Tables 2</xref> and <xref ref-type="table" rid="T3">3</xref>), we built a separate series of mixed-effects logistic regression models for each measure, using the <italic>lme4</italic> package (<xref ref-type="bibr" rid="B1">Bates et al., 2015</xref>) in R (<xref ref-type="bibr" rid="B53">R Core Team, 2022</xref>). The models predicted the likelihood of correct C1 identification (binary outcome, correct vs. incorrect) from the fixed effects of Place Order (B-F vs. F-B) and C1 Voicing (voiceless vs. voiced), along with one of the four measures under consideration. For random effects, by-subject and by-item intercepts, as well as all possible random slopes, were considered. We began with the full model containing all possible interaction terms and the fullest random effects structure, and then arrived at the optimal model by removing the interaction terms and random slopes that did not contribute to the model fit. Nested models were compared using likelihood ratio tests, and AIC/BIC values were examined to confirm consistency across model comparisons. In all comparisons, AIC, BIC, and likelihood ratio results were consistent.</p>
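<p>For concreteness, the fitting and comparison procedure can be sketched in R for one measure (Acoustic Lag); the data frame and variable names are hypothetical, and the random effects structure shown corresponds to the best model for this measure in <xref ref-type="table" rid="T4">Table 4</xref> below.</p>
<preformat>
library(lme4)

# Model with the measure of interest plus Place Order x C1 Voicing
m_full &lt;- glmer(c1_correct ~ acoustic_lag + place_order * c1_voicing +
                  (1 + place_order + c1_voicing | subject) + (1 | item),
                data = dat, family = binomial)

# Reduced model without the measure of interest
m_red &lt;- update(m_full, . ~ . - acoustic_lag)

# Likelihood ratio test; the same output also reports AIC and BIC
anova(m_red, m_full)
</preformat>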
<p>As the timing measures are expected to co-vary with place order, at least to some extent, we evaluated multicollinearity with VIF (variance inflation factors) and removed the predictors that were highly correlated with the measure of interest (indicated by VIF values greater than 5, e.g., <xref ref-type="bibr" rid="B62">Shrestha, 2020</xref>). This ensured that each model could assess the contribution of the measure under consideration as reliably as possible (i.e., the other predictors were secondary to the measure of interest). <xref ref-type="table" rid="T4">Table 4</xref> summarizes the best models for each of the four measures examined.</p>
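<p>The multicollinearity check can be sketched with vif() from the <italic>car</italic> package; for simplicity, the sketch below computes VIFs on a fixed-effects-only version of the model, and all names are hypothetical.</p>
<preformat>
library(car)

# Variance inflation factors for the fixed effects; predictors with
# VIF &gt; 5 were removed before refitting the mixed model
vif(glm(c1_correct ~ acoustic_lag + place_order + c1_voicing,
        data = dat, family = binomial))
</preformat>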
<table-wrap id="T4">
<caption>
<p><bold>Table 4:</bold> Best models for each measure.</p>
</caption>
<table>
<tbody>
<tr>
<td align="left" valign="top"><bold>Measure</bold></td>
<td align="left" valign="top"><bold>Best mixed-effect logistic regression model</bold></td>
</tr>
<tr>
<td align="left" valign="top">Acoustic Lag</td>
<td align="left" valign="top"><bold>Acoustic lag</bold> + Place Order * C1 Voicing + (1 + Place Order + C1 Voicing &#124;Subject) + (1 &#124;Item)</td>
</tr>
<tr>
<td align="left" valign="top">Onset Lag</td>
<td align="left" valign="top"><bold>Onset lag</bold> + Place Order * C1 Voicing + (1 + Place Order + Onset Lag + C1 Voicing &#124;Subject) + (1 &#124;Item)</td>
</tr>
<tr>
<td align="left" valign="top">Release Lag</td>
<td align="left" valign="top"><bold>Release lag</bold> * Place Order + Place Order * C1 Voicing + (1 + Place Order + Release Lag + C1 Voicing &#124;Subject)+(1 &#124;Item)</td>
</tr>
<tr>
<td align="left" valign="top">Relative Overlap</td>
<td align="left" valign="top"><bold>Relative Overlap</bold> + Place Order * C1 Voicing +(1 + Place Order + Relative Overlap + C1 Voicing &#124;Subject)+(1 &#124;Item)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>To determine whether each measure contributed significantly to C1 identification, the best models in <xref ref-type="table" rid="T4">Table 4</xref> were compared with the corresponding models without the measure in question, using likelihood ratio tests. If inclusion of the measure improved the model fit, we concluded that the measure facilitates C1 identification.</p>
<p>Each of the four measures significantly contributed to the model fit, supporting H1. First, C1 identification was influenced by <italic>Acoustic Lag</italic> [&#967;<sup>2</sup>(1) = 15.02, <italic>p</italic> = 0.0001***], <italic>Onset Lag</italic> [&#967;<sup>2</sup>(1) = 15.30, <italic>p</italic> &lt; 0.0001***], and <italic>Relative Overlap</italic> [&#967;<sup>2</sup>(1) = 20.04, <italic>p</italic> &lt; 0.0001***]. The likelihood of correct C1 identification increased as <italic>Acoustic Lag</italic> [&#946; = 0.025, z = 3.91, p &lt; 0.0001***] and <italic>Onset Lag</italic> [&#946; = 0.027, z = 3.97, p &lt; 0.0001***] increased and as <italic>Relative Overlap</italic> [&#946; = &#8211;0.393, z = 4.70, p &lt; 0.0001***] decreased, as expected. As shown in <xref ref-type="fig" rid="F5">Figures 5</xref>, <xref ref-type="fig" rid="F6">6</xref>, and <xref ref-type="fig" rid="F7">7</xref>, these effects were consistent across Place Order and C1 Voicing.</p>
<fig id="F5">
<caption>
<p><bold>Figure 5:</bold> Predicted C1 ID accuracy by acoustic lag (ms).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24395-g5.png"/>
</fig>
<fig id="F6">
<caption>
<p><bold>Figure 6:</bold> Predicted C1 ID accuracy by onset lag (ms).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24395-g6.png"/>
</fig>
<fig id="F7">
<caption>
<p><bold>Figure 7:</bold> Predicted C1 ID accuracy by relative overlap.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24395-g7.png"/>
</fig>
<p>On the other hand, the best model for <italic>Release Lag</italic> included a significant interaction between the measure of interest (release lag) and place order [&#967;<sup>2</sup>(1) = 14.19, p = 0.0002***]. To better understand this interaction, post-hoc analyses were conducted using emtrends() in the <italic>emmeans</italic> package (<xref ref-type="bibr" rid="B45">Lenth, 2022</xref>). The post-hoc analyses revealed that the influence of <italic>Release Lag</italic> differed significantly between back-to-front and front-to-back sequences [z = 3.90, p = 0.0001***]: Longer release lags led to a greater chance of correct C1 identification in back-to-front sequences (slope = 0.014, [CI: 0.007, 0.021]), but not in front-to-back sequences (slope = &#8211;0.004, [CI: &#8211;0.011, 0.003]), as shown in <xref ref-type="fig" rid="F8">Figure 8</xref>.</p>
<fig id="F8">
<caption>
<p><bold>Figure 8:</bold> Predicted C1 ID accuracy by release lag (ms).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24395-g8.png"/>
</fig>
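<p>The post-hoc slope analysis can be sketched as follows, where m_release is a hypothetical name for the fitted Release Lag model from <xref ref-type="table" rid="T4">Table 4</xref>.</p>
<preformat>
library(emmeans)

# Slope of release lag within each place order, then a pairwise
# comparison of the two slopes
slopes &lt;- emtrends(m_release, ~ place_order, var = "release_lag")
summary(slopes, infer = TRUE)  # per-order slopes with confidence intervals
pairs(slopes)                  # B-F vs. F-B slope difference (z test)
</preformat>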
</sec>
<sec>
<title>6.2 H2: A C1 vocalic release does not facilitate the recovery of C1</title>
<p>To test whether the vocalic release (transitional vocoid) helps the recovery of the C1 gesture, we tested (1) whether the tokens with vocalic releases (<italic>Presence of Vocalic Release</italic>) were better identified, and (2) for the tokens with vocalic releases, whether releases of longer duration (<italic>Duration of Vocalic Release</italic>) led to better identification. The statistical models followed the same structure as in Section 6.1, except that C1 Voicing was not considered, as none of the tokens with a voiceless C1 included vocalic releases. Also, the vocalic release duration model was fitted to the subset of the data containing only the stimuli produced with vocalic releases (n = 2,101, of which 1,148 were back-to-front; the entire dataset included 3,107 observations). <xref ref-type="table" rid="T5">Table 5</xref> shows the best models for the two measures.</p>
<table-wrap id="T5">
<caption>
<p><bold>Table 5:</bold> Best models for the two vocalic release measures.</p>
</caption>
<table>
<tbody>
<tr>
<td align="left" valign="top"><bold>Measure</bold></td>
<td align="left" valign="top"><bold>Best mixed-effect logistic regression model</bold></td>
</tr>
<tr>
<td align="left" valign="top">Presence of vocalic release</td>
<td align="left" valign="top"><bold>Presence of Vocalic Release</bold> * Place Order + (1 + Place Order + Presence of Vocalic Release &#124;Subject) + (1 &#124;Item)</td>
</tr>
<tr>
<td align="left" valign="top">Vocalic release duration</td>
<td align="left" valign="top"><bold>Vocalic Release Duration</bold> + Place Order + (1+ Vocalic Release Duration + Place Order &#124;Subject) + (1&#124;Item)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>First, the presence of a vocalic release influenced C1 identification, but the effect interacted with Place Order [&#967;<sup>2</sup>(1) = 18.01, p &lt; 0.0001***], as shown in <xref ref-type="fig" rid="F9">Figure 9</xref>. A post-hoc pairwise comparison on the significant interaction revealed that, in back-to-front sequences only, C1 in the tokens without vocalic releases was identified better than C1 in the tokens with vocalic releases [&#946; = 2.089, z = 3.97, p = 0.0001***]. For front-to-back sequences, the presence versus absence of a vocalic release did not influence C1 identification [p = 0.55]. Overall, the presence of a vocalic release did not improve C1 identification.</p>
<fig id="F9">
<caption>
<p><bold>Figure 9:</bold> Predicted C1 ID accuracy by presence/absence of vocalic release (transitional vocoid).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24395-g9.png"/>
</fig>
<p>For the tokens produced with vocalic releases, the duration of these releases was a significant predictor of C1 identification regardless of Place Order [&#967;<sup>2</sup>(1) = 10.73, p = 0.0011**]. However, the duration of vocalic releases and C1 identification were inversely related, as shown in <xref ref-type="fig" rid="F10">Figure 10</xref>: As the duration of the releases increased, the identification of C1 worsened [&#946; = &#8211;0.056, z = &#8211;3.41, p = 0.0007***]. Participants transcribed C1 less accurately when the releases were longer.</p>
<fig id="F10">
<caption>
<p><bold>Figure 10:</bold> Predicted C1 ID accuracy by vocalic release duration (ms).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24395-g10.png"/>
</fig>
<p>In sum, the outcomes of the two models related to the vocalic release clearly show that vocalic releases do not facilitate the correct identification of C1. Rather, such transitional vocoids may interfere with the identification of the intended C1, especially when they are longer.</p>
</sec>
</sec>
<sec>
<title>7. Discussion</title>
<p>Taken together, the results support the view that temporal organization, rather than vocalic releases, facilitates the perceptual recovery of C1. In the following sections, we discuss the implications of these findings, focusing first on the effects of temporal lag and overlap (Section 7.1), and then on the role of vocalic releases (Section 7.2).</p>
<sec>
<title>7.1 Timing lag</title>
<p>The current findings support the first hypothesis that longer timing lag and reduced overlap between C1 and C2 facilitate the recovery of C1 gestures: Georgian native listeners do benefit from a longer lag between C1 and C2 in recovering C1 in stop-stop sequences. Among the four measures tested, <italic>Acoustic Lag</italic>, <italic>Onset Lag</italic>, and <italic>Relative Overlap</italic> each contributed to more accurate C1 identification, as predicted. Among these, Acoustic Lag and Onset Lag are moderately correlated.</p>
<p>The fourth lag measure we considered&#8212;Release Lag&#8212;does not correlate with Acoustic Lag and shows only a weak correlation with Onset Lag. Our results showed an interaction between Release Lag and place order, indicating that the influence of release lag differs significantly between B-F and F-B sequences. In B-F sequences, C1 accuracy did improve with longer release lags, but this was not the case for F-B sequences.</p>
<p>Overall, the results concerning timing lag are consistent with H1. Georgian native listeners identified C1 more accurately in sequences with reduced overlap, when the gesture for C2 is timed later relative to C1. This effect was evident in the timing of the C2 gesture onset for all the stop-stop sequences considered. The later timing of the C2 release proved beneficial for B-F sequences (e.g., /gb/, /t&#688;b/), presumably because the posterior-to-anterior transition increases the chance that the C1 release is masked. Release Lag had no significant impact on F-B sequences, suggesting that the perceptual vulnerability of C1 may vary by place order. These results thus align with the production patterns observed in previous studies (e.g., <xref ref-type="bibr" rid="B12">Chitoran et al., 2002</xref>), suggesting that native speakers may draw on perceptual knowledge when timing consecutive constrictions in onset clusters. The current findings, taken together with previous findings on production patterns, provide converging evidence that timing lags (especially those based on gestural onsets) play a considerable role in recovering consonantal gestures. This suggests that the inter-consonantal timing lags within onset clusters are not merely a phonetic artefact but part of the grammar, functionally encoded by Georgian speakers.</p>
<p>It is worth mentioning at this point that the overlap measures we considered crucially rely on gestural releases. This includes the acoustic measure we used&#8212;the inter-burst interval. Is it possible, though, that listeners use other types of acoustic information corresponding to consonant overlap? One possibility we have not considered, but which has been proposed by Zsiga (<xref ref-type="bibr" rid="B80">2003</xref>), is duration ratio, a relative measure defined as the mean closure duration of the C1C2 cluster (with or without an intervening release) divided by the sum of the mean closure durations of single, intervocalic C1 and C2. This has been a useful measure of overlap across word boundaries in comparing non-native productions (English speakers producing L2 Russian utterances vs. Russian speakers producing L2 English utterances). It is worth verifying in future work whether this relative acoustic measure correlates better with the articulatory overlap measures.</p>
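<p>For reference, the duration ratio amounts to a one-line computation; the function below merely restates Zsiga&#8217;s definition, with hypothetical argument names.</p>
<preformat>
# Mean closure duration of the C1C2 cluster divided by the sum of the mean
# closure durations of singleton, intervocalic C1 and C2 (Zsiga, 2003)
duration_ratio &lt;- function(cc_closure, c1_single, c2_single) {
  mean(cc_closure) / (mean(c1_single) + mean(c2_single))
}
</preformat>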
<p>For now, however, since C-C coordination with reduced overlap is known to often result in a vocalic release (transitional vocoid), especially in voiced sequences, we turn to the question of whether the presence of such vocalic releases benefits the recoverability of C1.</p>
</sec>
<sec>
<title>7.2 Vocalic releases</title>
<p>The second hypothesis, that the presence of C1 vocalic releases with richer C1 formant transition information would help listeners recover C1, is not supported by our results. Vocalic releases instead seem to be detrimental. This suggests that, unlike timing lag or degree of overlap, the presence of vocalic releases between two stops in C1C2 sequences may not be related to perceptual considerations. Or, if native speakers of Georgian indeed produce them for the listeners&#8217; sake, this result suggests that their efforts are ineffective&#8212;at least in the stop-stop sequences examined in this study.</p>
<p>An interaction with Place Order was present in the results related to vocalic releases as well. In F-B stop-stop sequences, the presence of a vocalic release had no effect on C1 identification. In B-F sequences, contrary to expectations, tokens without a vocalic release were identified more accurately than those with a vocalic release, a result that is not consistent with H2.</p>
<p>The duration of the vocalic release, when present, emerged as a significant predictor of accuracy, but again, contrary to H2, as the vocoid duration increases, C1 identification actually decreases. The inverse correlation between vocalic release duration and the perceptual accuracy of C1 is at odds with findings from previous cross-language studies. For example, Wilson et al. (<xref ref-type="bibr" rid="B71">2014</xref>) and Wilson and Davidson (<xref ref-type="bibr" rid="B70">2013</xref>) found that longer burst duration resulted in more epenthesis between C1 and C2 stops when shadowing foreign speech, and proposed that a longer burst has an acoustic profile similar to that of a vowel. In their study, greater burst amplitude similarly increased the rate of epenthesis. Both longer burst duration and greater burst amplitude are interpretable as the presence of a vocoid, which protects C1 from misperception. However, these were the results of non-native listeners (English listeners hearing Russian stimuli), whose native phonotactics led them to interpret the acoustic properties of a longer and louder burst as a vowel. Georgian listeners confronted with non-native French data (Kwon &amp; Chitoran, 2024) behaved differently because of their different native phonotactic expectations. Used to the presence of a vocalic release in a CC sequence, they did not reliably distinguish French CCV vs. CVCV stimuli in a non-native AX discrimination task. This result and the current findings together suggest that the quality and duration of V1 relative to V2 matter. When V1 is a vocalic release (transitional vocoid) and significantly shorter than V2, a lexical vowel, Georgian listeners tend to ignore it. However, a longer vocoid, one comparable in duration to the lexical vowel but not in its spectral properties, has a negative effect on C1 identification. Georgian listeners are thus sensitive to a longer vocoid, but it provides them with misleading information.</p>
<p>A possible reason for this particular result may be that the presence of a long vocalic release interferes with the perception and recoverability of information at multiple structural levels, not just the segmental level. It can simultaneously affect the perceptibility of C1 and the perceptibility of higher-level prosodic structures, such as syllables or words, thus impinging on overall intelligibility. A vocalic release may be perceived as a full vowel, and therefore as a syllable nucleus. In support of this interpretation, <xref ref-type="bibr" rid="B15">Crouch et al. (2023b)</xref> found that transitional vocoids in Georgian C1C2 sequences alter the amplitude envelope in ways that may lead to resyllabification. In sequences with sonority plateaus or falls, productions with a vocoid showed an additional peak in the amplitude envelope, compared to productions of the same sequence without a vocoid. This additional peak can be interpreted as a syllable nucleus and may lead listeners to mis-parse syllable boundaries, particularly when the vocoid is long enough to be perceptually salient but not clearly vowel-like. This suggests that speakers may avoid vocoids in certain cluster types not to preserve segmental clarity but to avoid prosodic ambiguity. If so, then the phonological grammar may encode not just segmental recoverability but also the recoverability of syllabic organization. These findings are relevant for the current study, in which longer vocalic releases led to more CV responses and proved ineffective for C1 identification.</p>
<p>So far, we (and other authors before us) have only considered segment-level perceptibility, with a focus on the recoverability of phonological contrasts. But we must consider the possibility that patterns predicted to facilitate the recovery of segmental contrasts may not be equally beneficial for, and may even hinder, the recovery of syllable- or higher-level information.</p>
</sec>
<sec>
<title>7.3 Conclusion</title>
<p>We now return to our two initial hypotheses to consider what we have learned. Do the observed patterns&#8212;timing lag and vocalic releases&#8212;truly present an advantage for the recoverability of phonological information? Based on our results, we can maintain that the observed lag and overlap patterns in stop-stop sequences indeed help recover the phonological identity of C1, supporting H1. Vocalic releases, however, do not, contra H2. This is explained by the finding that the presence of a vocalic release is not reliably correlated with lag and thus does not provide a dependable cue to temporal organization. It may be used, instead, to convey other types of information, such as voicing or cluster type.</p>
<p>Unlike prior studies that inferred timing-based perceptual effects from production data or non-native perception data, the current findings offer direct perceptual evidence from native listeners. We provide evidence that timing lag matters in speech perception, contributing to segmental recovery in consonant clusters. These findings suggest that native listeners rely on timing lag as phonetic detail relevant for recovering consonantal gestures. Yet the lack of perceptual benefit from vocalic releases shows that not all phonetic patterns assumed to enhance recoverability serve that function. This contrast underscores the importance of evaluating perceptual recoverability empirically, rather than assuming perceptual efficacy on the basis of production patterns alone.</p>
<p>Corroborating previous findings on Georgian productions (<xref ref-type="bibr" rid="B12">Chitoran et al., 2002</xref>; <xref ref-type="bibr" rid="B14">Crouch, 2022</xref>), the current results support the inclusion of timing lag in the grammar of Georgian. Listeners benefit from longer lag (i.e., reduced overlap) and speakers tend to produce longer lag when C1 is expected to be more perceptually vulnerable, though it remains unclear whether speakers are manipulating the lag to aid listeners.</p>
<p>Going back to Mattingly (<xref ref-type="bibr" rid="B47">1981</xref>), we may ask: Does the speakers&#8217; phonological grammar integrate knowledge about temporal patterns and their acoustic consequences? The different timing lags in back-to-front and front-to-back sequences observed in production may be perceptually motivated in stop-stop sequences and learned as grammatical generalizations. Additional factors, not related to perceptibility, may also contribute to timing patterns. Pouplier et al.&#8217;s (<xref ref-type="bibr" rid="B52">2022</xref>) comparison of lag patterns in the same CC clusters across seven languages highlighted wide language-specific diversity and, at the same time, consistent cross-linguistic lag patterns tied to the segmental composition of the clusters. Among the seven languages, Georgian stands out as having the largest variability in lag durations. It is the language that reaches the longest lag durations, but in other respects it conforms to cross-linguistic cluster-specific patterns. For example, the Georgian /sC/ clusters align with those of the six other languages in having the shortest lag. The large within-language variability of lag in Georgian is consistent with our conclusion that timing lag is part of speakers&#8217; knowledge. Controlling lag allows the recovery of many types of onset clusters, some of which, importantly, contain morphological information. Georgian is a language with rich prefixal morphology, where multiple consonantal prefixes may be added. This particular aspect of the language structure highlights the importance of both segmental and prosodic recoverability. A consonantal prefix must be accurately recovered and, at the same time, adding consonantal prefixes must not alter the syllable count. The precise control of inter-consonantal timing lag, avoiding long vocalic releases, is therefore important for these closely intertwined intelligibility considerations.</p>
<p>The current findings suggest that perceptual recoverability should not be defined strictly at a local, segmental level. A more fruitful approach considers multiple levels of structure simultaneously&#8212;segments, syllabic organization, morphological structure&#8212;in the context of the overall intelligibility of the message. Speakers likely aim to be understood and to accommodate the listener, but this intent may be reflected in the phonological grammar in a complex way.</p>
<p>A further important point that needs to be raised is whether diachronic information should also be considered when examining timing patterns. Easterday (<xref ref-type="bibr" rid="B19">2019</xref>) raised this issue specifically regarding obstruent sequences in languages classified as having a highly complex syllable structure. She notes that the most common typological source of clusters is vowel reduction, and some characteristics of the reduced vowel may be retained in C1 release bursts. From a diachronic perspective, then, the observed place effects may be seen either as motivated by perceptual recoverability or as a residue of the clusters&#8217; historical sources. The timing patterns and perceptual properties of such clusters may have the effect of preserving complex syllable structure by protecting it from complete overlap, which can lead to consonant loss. Georgian stop-stop sequences are known to have two historical sources (<xref ref-type="bibr" rid="B24">Gamkrelidze &amp; Ivanov, 1995</xref>). Back-to-front sequences developed through the deletion of an intervening vowel. In line with Easterday&#8217;s reasoning, it can be argued that B-F sequences may have preserved the timing of a lost vowel. Front-to-back sequences with a dorsal C2, known as &#8216;harmonic clusters&#8217;, developed from velarized stops, single segments that subsequently broke into sequences (e.g., [d&#736;] becoming [dg] or [d&#611;]). By the same reasoning, they may have preserved a timing closer to that of a single segment. Such stability of timing patterns across time can be taken as further evidence for the phonological status of timing information.</p>
<p>While the current findings, together with previous studies on Georgian clusters, provide strong evidence for the perceptual role of timing lag in stop-stop sequences, further work is needed to determine whether similar mechanisms underlie the timing of other cluster types or hold cross-linguistically. Regardless of their perceptual motivation, the accumulating evidence in the literature indicates that timing differences constitute part of language-specific phonological knowledge. The role of timing in phonological grammar has indeed become increasingly clear across languages and across different types of phonological contrasts. Gafos (<xref ref-type="bibr" rid="B21">2002</xref>) demonstrated the role of temporal coordination in Moroccan Arabic templatic word formation, in the first study to develop a full-fledged formal analysis incorporating gestural timing. Since then, the phonological role of <italic>timing</italic> (whether gestural or not) has found additional support, in particular in the instantiation of phonological contrasts. Tone contrasts in several languages, for example, have been shown to consist of tonal units that differ exclusively in their timing (e.g., <xref ref-type="bibr" rid="B54">Remijsen &amp; Ayoker, 2014</xref>, for Shilluk; <xref ref-type="bibr" rid="B68">Svensson Lundmark et al., 2021</xref>, for Swedish; <xref ref-type="bibr" rid="B36">Karlin, 2022</xref>, for Serbian). Segmental contrast between complex segments and sequences of segments has likewise been shown to rely on different timing of the same component articulatory gestures (<xref ref-type="bibr" rid="B61">Shaw et al., 2021</xref>).</p>
<p>Our view of the role of timing in phonology, as supported by the Georgian data examined here, most closely resembles the one presented by Gafos et al. (<xref ref-type="bibr" rid="B23">2020</xref>), in three aspects:</p>
<list list-type="order">
<list-item><p>We argue, on the basis of our results, that inter-gestural timing patterns and their perceptual relevance can show language-specificity. The Georgian timing patterns on which native listeners rely perceptually are specific to Georgian, in the same way that inter-segmental temporal coordination is shown to be language-specific in Gafos et al. (<xref ref-type="bibr" rid="B23">2020</xref>). In the latter case, differences in temporal coordination account for language-specific differences in syllable affiliation between Arabic and Spanish. In the case of Georgian clusters, Kwon and Chitoran (2024) show that French listeners&#8217; perception differs from that of Georgian listeners, providing further evidence for language-specificity in the domain of perception.</p></list-item>
<list-item><p>It is the timing pattern, and not the presence or absence of a vocalic release, that is part of native speakers&#8217; phonological knowledge.</p></list-item>
<list-item><p>How speakers organize their vocal tracts is not independent of how they organize their native linguistic system in their minds (<xref ref-type="bibr" rid="B23">Gafos et al., 2020</xref>). Speakers of Georgian and Arabic have to accommodate morphological structure that impinges on prosodic and segmental requirements. It is when these requirements conflict that we are likely to see exceptions to typological generalizations (e.g., the blatant disregard, in both Georgian and Arabic, for segmental combinations that follow the sonority sequencing principle).</p></list-item>
</list>
<p>The results of our own study provide perceptually motivated explanations for the wide variability of overlap patterns in Georgian reported in Pouplier et al. (<xref ref-type="bibr" rid="B52">2022</xref>). The consonant sequences compared in that study were exclusively those that occurred in all seven languages under investigation (obstruent-liquid, sibilant-obstruent, and /kn/, /gn/, for a subset of the languages). A plausible interpretation of the reduced overlap found in some of the Georgian data is that it is motivated by the presence of stop-stop sequences in the language, sequences that require close control of timing for recoverability reasons. The variable timing lag in Georgian, extending further into the reduced-overlap range than in the other languages, can be seen as an optimization solution for the temporal unfolding of the component gestures. Importantly, reduced overlap (long lag) is not simply generalized across all CC sequences in Georgian. If it were, then presumably a sequence like /sp/ would no longer be perceived as two adjacent consonants. Instead, a broader overlap range is the preferable compromise, such that, depending on their gestural composition, sequences spread out between the increased-overlap (shorter lag) and reduced-overlap (longer lag) ranges.</p>
<p>To sum up, the current study provides perceptual evidence for incorporating timing lag into phonological representations. The perceptual evidence has explanatory value, as it allows us to probe the relationship between vocal tract organization and linguistic knowledge. Taken together with the articulatory data (<xref ref-type="bibr" rid="B52">Pouplier et al., 2022</xref>), the current findings offer a glimpse of how speakers manipulate vocal tract organization to reflect linguistic structure, and how listeners, in turn, use temporal properties to recover hierarchical, as well as segmental, information.</p>
</sec>
</sec>
</body>
<back>
<sec>
<title>Additional file</title>
<p>The additional file for this article can be found as follows:</p>
<list list-type="bullet">
<list-item><p><bold>Supplementary Materials.</bold> A. Acoustic and Articulatory Measurements of Perception Stimuli and B. Statistical Model Output. <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://doi.org/10.16995/labphon.24395.s1">https://doi.org/10.16995/labphon.24395.s1</ext-link></p></list-item>
</list>
</sec>
<sec>
<title>Acknowledgements</title>
<p>We thank the Georgian speakers and experiment participants in Tbilisi, the anonymous reviewers for their valuable comments, and Louis Goldstein and Marianne Pouplier for helpful discussions. A previous version of this study was presented at LabPhon 16, and benefitted from insightful comments by Jennifer Hay, the discussant.</p>
</sec>
<sec>
<title>Funding information</title>
<p>This work was supported by ANR-DFG grant (ANR-14-FRAL-0004) for the project PATHS, and by the IdEx programme (ANR-18-IDEX-0001) to Universit&#233; Paris Cit&#233;. Research in Georgia was made possible by the Fulbright-Hays programme of the US Department of State.</p>
</sec>
<sec>
<title>Competing interests</title>
<p>The authors have no competing interests to declare.</p>
</sec>
<sec>
<title>Author contributions</title>
<p>IC: Conceptualization, study design, funding acquisition, data collection, interpretation, writing, editing.</p>
<p>HK: Data analysis, interpretation, writing, editing.</p>
</sec>
<ref-list>
<ref id="B1"><mixed-citation publication-type="journal"><string-name><surname>Bates</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Maechler</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Bolker</surname>, <given-names>B.</given-names></string-name>, &amp; <string-name><surname>Walker</surname>, <given-names>S.</given-names></string-name> (<year>2015</year>). <article-title>Fitting linear mixed-effects models using lme4</article-title>. <source>Journal of Statistical Software</source>, <volume>67</volume>(<issue>1</issue>), <fpage>1</fpage>&#8211;<lpage>48</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v067.i01</pub-id></mixed-citation></ref>
<ref id="B2"><mixed-citation publication-type="journal"><string-name><surname>Beddor</surname>, <given-names>P. S.</given-names></string-name> (<year>2009</year>). <article-title>A coarticulatory path to sound change</article-title>. <source>Language</source>, <volume>85</volume>(<issue>4</issue>), <fpage>785</fpage>&#8211;<lpage>821</lpage>. <pub-id pub-id-type="doi">10.1353/lan.0.0165</pub-id></mixed-citation></ref>
<ref id="B3"><mixed-citation publication-type="journal"><string-name><surname>Benki</surname>, <given-names>J. R.</given-names></string-name> (<year>2003</year>). <article-title>Analysis of English nonsense syllable recognition in noise</article-title>. <source>Phonetica</source>, <volume>60</volume>(<issue>2</issue>), <fpage>129</fpage>&#8211;<lpage>157</lpage>. <pub-id pub-id-type="doi">10.1159/000071450</pub-id></mixed-citation></ref>
<ref id="B4"><mixed-citation publication-type="webpage"><string-name><surname>Boersma</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Weenink</surname>, <given-names>D.</given-names></string-name> (<year>2021</year>). <article-title>Praat: Doing phonetics by computer [Computer program]. Version 6.1.50</article-title>. <uri>http://www.praat.org/</uri></mixed-citation></ref>
<ref id="B5"><mixed-citation publication-type="book"><string-name><surname>Browman</surname>, <given-names>C. P.</given-names></string-name>, &amp; <string-name><surname>Goldstein</surname>, <given-names>L.</given-names></string-name> (<year>1990</year>). <chapter-title>Tiers in articulatory phonology, with some implications for casual speech</chapter-title>. In <string-name><given-names>J.</given-names> <surname>Kingston</surname></string-name>, &amp; <string-name><given-names>M. E.</given-names> <surname>Beckman</surname></string-name> (Eds.), <source>Papers in Laboratory Phonology</source> (<edition>1st</edition> ed., pp. <fpage>341</fpage>&#8211;<lpage>376</lpage>). <publisher-name>Cambridge University Press</publisher-name>. <pub-id pub-id-type="doi">10.1017/CBO9780511627736.019</pub-id></mixed-citation></ref>
<ref id="B6"><mixed-citation publication-type="journal"><string-name><surname>Browman</surname>, <given-names>C. P.</given-names></string-name>, &amp; <string-name><surname>Goldstein</surname>, <given-names>L.</given-names></string-name> (<year>1992</year>). <article-title>Articulatory phonology: An overview</article-title>. <source>Phonetica</source>, <volume>49</volume>, <fpage>155</fpage>&#8211;<lpage>180</lpage>. <pub-id pub-id-type="doi">10.1159/000261913</pub-id></mixed-citation></ref>
<ref id="B7"><mixed-citation publication-type="journal"><string-name><surname>Byrd</surname>, <given-names>D.</given-names></string-name> (<year>1992</year>). <article-title>Perception of assimilation in consonant clusters: A gestural model</article-title>. <source>Phonetica</source>, <volume>49</volume>, <fpage>1</fpage>&#8211;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.1159/000261900</pub-id></mixed-citation></ref>
<ref id="B8"><mixed-citation publication-type="journal"><string-name><surname>Byrd</surname>, <given-names>D.</given-names></string-name> (<year>1996</year>). <article-title>Influences on articulatory timing in consonant sequences</article-title>. <source>Journal of Phonetics</source>, <volume>24</volume>, <fpage>209</fpage>&#8211;<lpage>244</lpage>. <pub-id pub-id-type="doi">10.1006/jpho.1996.0012</pub-id></mixed-citation></ref>
<ref id="B9"><mixed-citation publication-type="journal"><string-name><surname>Byrd</surname>, <given-names>D.</given-names></string-name>, &amp; <string-name><surname>Saltzman</surname>, <given-names>E.</given-names></string-name> (<year>2003</year>). <article-title>The elastic phrase: Modelling the dynamics of boundary-adjacent lengthening</article-title>. <source>Journal of Phonetics</source>, <volume>31</volume>, <fpage>149</fpage>&#8211;<lpage>180</lpage>. <pub-id pub-id-type="doi">10.1016/S0095-4470(02)00085-2</pub-id></mixed-citation></ref>
<ref id="B10"><mixed-citation publication-type="webpage"><string-name><surname>Chen</surname>, <given-names>L. H.</given-names></string-name> (<year>2003</year>). <chapter-title>Evidence for the role of gestural overlap in consonant place assimilation</chapter-title>. In <string-name><given-names>M. J.</given-names> <surname>Sol&#233;</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Recasens</surname></string-name>, &amp; <string-name><given-names>J.</given-names> <surname>Romero</surname></string-name> (Eds.), <source>Proceedings of the 15<sup>th</sup> International Congress of Phonetic Sciences</source>, <publisher-loc>Barcelona, Spain</publisher-loc>, <month>August</month> <day>3&#8211;9</day>, 2003. <uri>http://www.internationalphoneticassociation.org/icphs/icphs2003</uri></mixed-citation></ref>
<ref id="B11"><mixed-citation publication-type="book"><string-name><surname>Chitoran</surname>, <given-names>I.</given-names></string-name>, &amp; <string-name><surname>Goldstein</surname>, <given-names>L.</given-names></string-name> (<year>2006</year>). <source>Testing the phonological status of perceptual recoverability: Articulatory evidence from Georgian</source>. Abstract, LabPhon 10, June 29&#8211;July 1, <publisher-loc>Paris, France</publisher-loc>.</mixed-citation></ref>
<ref id="B12"><mixed-citation publication-type="book"><string-name><surname>Chitoran</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Goldstein</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Byrd</surname>, <given-names>D.</given-names></string-name> (<year>2002</year>). <chapter-title>Gestural overlap and recoverability: Articulatory evidence from Georgian</chapter-title>. In <string-name><given-names>C.</given-names> <surname>Gussenhoven</surname></string-name> &amp; <string-name><given-names>N.</given-names> <surname>Warner</surname></string-name> (Eds.), <source>Laboratory Phonology</source> <volume>7</volume>, pp. <fpage>419</fpage>&#8211;<lpage>447</lpage>. <pub-id pub-id-type="doi">10.1515/9783110197105.2.419</pub-id></mixed-citation></ref>
<ref id="B13"><mixed-citation publication-type="journal"><string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Yoon</surname>, <given-names>Y.</given-names></string-name>, &amp; <string-name><surname>Kim</surname>, <given-names>S.</given-names></string-name> (<year>2014</year>). <article-title>Effects of prosodic boundary and syllable structure on the temporal realization of CV gestures</article-title>. <source>Journal of Phonetics</source>, <volume>44</volume>, <fpage>96</fpage>&#8211;<lpage>100</lpage>. <pub-id pub-id-type="doi">10.1016/j.wocn.2014.02.007</pub-id></mixed-citation></ref>
<ref id="B14"><mixed-citation publication-type="thesis"><string-name><surname>Crouch</surname>, <given-names>C.</given-names></string-name> (<year>2022</year>). <source>Postcards from the syllable edge: Sonority and articulatory timing in complex onsets in Georgian</source>. [Doctoral dissertation, <publisher-name>University of California</publisher-name>, <publisher-loc>Santa Barbara</publisher-loc>]. <uri>https://escholarship.org/uc/item/5w18167d</uri></mixed-citation></ref>
<ref id="B15"><mixed-citation publication-type="webpage"><string-name><surname>Crouch</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Chitoran</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Goldstein</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Katsika</surname>, <given-names>A.</given-names></string-name> (<year>2023b</year>). <chapter-title>Intrusive vocoids and syllable structure in Georgian</chapter-title>. In <string-name><given-names>R.</given-names> <surname>Skarnitzl</surname></string-name>, &amp; <string-name><given-names>J.</given-names> <surname>Vol&#237;n</surname></string-name> (Eds.), <source>Proceedings of the 20<sup>th</sup> International Congress of Phonetic Sciences</source> &#8211; ICPhS (pp. <fpage>2000</fpage>&#8211;<lpage>2004</lpage>). <month>August</month> <day>7&#8211;11</day>, 2023. <publisher-loc>Prague, Czech Republic</publisher-loc>. <uri>https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2023/full_papers/613.pdf</uri></mixed-citation></ref>
<ref id="B16"><mixed-citation publication-type="journal"><string-name><surname>Crouch</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Katsika</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Chitoran</surname>, <given-names>I.</given-names></string-name> (<year>2023a</year>). <article-title>Sonority sequencing and its relationship to articulatory timing in Georgian</article-title>. <source>Journal of the International Phonetic Association</source>, <volume>53</volume>(<issue>3</issue>). <pub-id pub-id-type="doi">10.1017/S0025100323000026</pub-id></mixed-citation></ref>
<ref id="B17"><mixed-citation publication-type="book"><string-name><surname>Dell</surname>, <given-names>F.</given-names></string-name>, &amp; <string-name><surname>Elmedlaoui</surname>, <given-names>M.</given-names></string-name> (<year>2002</year>). <source>Syllables in Tashlhiyt Berber and in Moroccan Arabic</source>. <publisher-name>Kluwer Academic Publishers</publisher-name>. <pub-id pub-id-type="doi">10.1007/978-94-010-0279-0</pub-id></mixed-citation></ref>
<ref id="B18"><mixed-citation publication-type="journal"><string-name><surname>Du</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Gafos</surname>, <given-names>A.</given-names></string-name> (<year>2022</year>). <article-title>Articulatory overlap as a function of stiffness in German, English and Spanish word-initial stop-lateral clusters</article-title>. <source>Laboratory Phonology: Journal of the Association for Laboratory Phonology</source>, <volume>14</volume>(<issue>1</issue>). <pub-id pub-id-type="doi">10.16995/labphon.7965</pub-id></mixed-citation></ref>
<ref id="B19"><mixed-citation publication-type="book"><string-name><surname>Easterday</surname>, <given-names>S.</given-names></string-name> (<year>2019</year>). <chapter-title>Highly complex syllable structure. A typological and diachronic study</chapter-title>. <source>Studies in Laboratory Phonology</source>, <volume>9</volume>. <publisher-name>Language Science Press</publisher-name>.</mixed-citation></ref>
<ref id="B20"><mixed-citation publication-type="journal"><string-name><surname>Edwards</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Beckman</surname>, <given-names>M. E.</given-names></string-name>, &amp; <string-name><surname>Fletcher</surname>, <given-names>J.</given-names></string-name> (<year>1991</year>). <article-title>The articulatory kinematics of final lengthening</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>89</volume>(<issue>1</issue>). <pub-id pub-id-type="doi">10.1121/1.400674</pub-id></mixed-citation></ref>
<ref id="B21"><mixed-citation publication-type="journal"><string-name><surname>Gafos</surname>, <given-names>A. I.</given-names></string-name> (<year>2002</year>). <article-title>A grammar of gestural coordination</article-title>. <source>Natural Language &amp; Linguistic Theory</source>, <volume>20</volume>, <fpage>269</fpage>&#8211;<lpage>337</lpage>. <pub-id pub-id-type="doi">10.1023/A:1014942312445</pub-id></mixed-citation></ref>
<ref id="B22"><mixed-citation publication-type="book"><string-name><surname>Gafos</surname>, <given-names>A. I.</given-names></string-name>, <string-name><surname>Hoole</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Roon</surname>, <given-names>K. D.</given-names></string-name>, &amp; <string-name><surname>Zeroual</surname>, <given-names>C.</given-names></string-name> (<year>2010</year>). <chapter-title>Variation in timing and phonological grammar in Moroccan Arabic clusters</chapter-title>. In <string-name><given-names>C.</given-names> <surname>Fougeron</surname></string-name>, <string-name><given-names>B.</given-names> <surname>K&#252;hnert</surname></string-name>, <string-name><given-names>M.</given-names> <surname>D&#8217;Imperio</surname></string-name>, &amp; <string-name><given-names>N.</given-names> <surname>Vall&#233;e</surname></string-name> (Eds.), <source>Laboratory Phonology</source>, <volume>10</volume> (pp. <fpage>657</fpage>&#8211;<lpage>698</lpage>). <publisher-name>Mouton de Gruyter</publisher-name>. <pub-id pub-id-type="doi">10.1515/9783110224917.5.657</pub-id></mixed-citation></ref>
<ref id="B23"><mixed-citation publication-type="journal"><string-name><surname>Gafos</surname>, <given-names>A. I.</given-names></string-name>, <string-name><surname>Roeser</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Sotiropoulou</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Hoole</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Zeroual</surname>, <given-names>C.</given-names></string-name> (<year>2020</year>). <article-title>Structure in mind, structure in vocal tract</article-title>. <source>Natural Language &amp; Linguistic Theory</source>, <volume>38</volume>(<issue>1</issue>), <fpage>43</fpage>&#8211;<lpage>75</lpage>. <pub-id pub-id-type="doi">10.1007/s11049-019-09445-y</pub-id></mixed-citation></ref>
<ref id="B24"><mixed-citation publication-type="book"><string-name><surname>Gamkrelidze</surname>, <given-names>T. V.</given-names></string-name>, &amp; <string-name><surname>Ivanov</surname>, <given-names>V.</given-names></string-name> (<year>1995</year>). <source>Indo-European and the Indo-Europeans: A reconstruction and historical analysis of a Proto-Language and a Proto-Culture</source>. (English version by Johanna Nichols). <publisher-name>Mouton de Gruyter</publisher-name>. <pub-id pub-id-type="doi">10.1515/9783110815030</pub-id></mixed-citation></ref>
<ref id="B25"><mixed-citation publication-type="book"><string-name><surname>Goldstein</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Fowler</surname>, <given-names>C.</given-names></string-name> (<year>2003</year>). <chapter-title>Articulatory Phonology: A phonology for public language use</chapter-title>. In <string-name><given-names>N.</given-names> <surname>Schiller</surname></string-name>, &amp; <string-name><given-names>A.</given-names> <surname>Meyer</surname></string-name> (Eds.), <source>Phonetics and phonology in language comprehension and production</source> (pp. <fpage>159</fpage>&#8211;<lpage>207</lpage>). <publisher-name>Mouton de Gruyter</publisher-name>. <pub-id pub-id-type="doi">10.1515/9783110895094.159</pub-id></mixed-citation></ref>
<ref id="B26"><mixed-citation publication-type="book"><string-name><surname>Hall</surname>, <given-names>N.</given-names></string-name> (<year>2024</year>). <chapter-title>Intrusive and epenthetic vowels revisited</chapter-title>. In <string-name><given-names>J. Y.</given-names> <surname>Kim</surname></string-name>, <string-name><given-names>V.</given-names> <surname>Miatto</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Petrovi&#263;</surname></string-name>, &amp; <string-name><given-names>L.</given-names> <surname>Repetti</surname></string-name> (Eds.), <source>Epenthesis and beyond: Recent approaches to insertion in phonology and its interfaces</source> (pp. <fpage>167</fpage>&#8211;<lpage>197</lpage>). <publisher-name>Language Science Press</publisher-name>.</mixed-citation></ref>
<ref id="B27"><mixed-citation publication-type="journal"><string-name><surname>Hardcastle</surname>, <given-names>W. J.</given-names></string-name> (<year>1985</year>). <article-title>Some phonetic and syntactic constraints on lingual coarticulation during /kl/ sequences</article-title>. <source>Speech Communication</source>, <volume>4</volume>, <fpage>247</fpage>&#8211;<lpage>263</lpage>. <pub-id pub-id-type="doi">10.1016/0167-6393(85)90051-2</pub-id></mixed-citation></ref>
<ref id="B28"><mixed-citation publication-type="book"><string-name><surname>Hardcastle</surname>, <given-names>W. J.</given-names></string-name>, &amp; <string-name><surname>Roach</surname>, <given-names>P.</given-names></string-name> (<year>1979</year>). <chapter-title>An instrumental investigation of coarticulation in stop consonant sequences</chapter-title>. In <string-name><given-names>H.</given-names> <surname>Hollien</surname></string-name>, &amp; <string-name><given-names>P.</given-names> <surname>Hollien</surname></string-name> (Eds.), <source>Current issues in the phonetic sciences</source> (pp. <fpage>531</fpage>&#8211;<lpage>540</lpage>). <publisher-name>John Benjamins</publisher-name>. <pub-id pub-id-type="doi">10.1075/cilt.9.56har</pub-id></mixed-citation></ref>
<ref id="B29"><mixed-citation publication-type="journal"><string-name><surname>Harrington</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Kleber</surname>, <given-names>F.</given-names></string-name>, &amp; <string-name><surname>Reubold</surname>, <given-names>U.</given-names></string-name> (<year>2008</year>). <article-title>Compensation for coarticulation, /u/-fronting, and sound change in Standard Southern British: An acoustic and perceptual study</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>123</volume>, <fpage>2825</fpage>&#8211;<lpage>2835</lpage>. <pub-id pub-id-type="doi">10.1121/1.2897042</pub-id></mixed-citation></ref>
<ref id="B30"><mixed-citation publication-type="book"><string-name><surname>Harrington</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Kleber</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Reubold</surname>, <given-names>U.</given-names></string-name>, <string-name><surname>Schiel</surname>, <given-names>F.</given-names></string-name>, &amp; <string-name><surname>Stevens</surname>, <given-names>M.</given-names></string-name> (<year>2019</year>). <chapter-title>The phonetic basis of the origin and spread of sound change</chapter-title>. In <string-name><given-names>W. F.</given-names> <surname>Katz</surname></string-name>, &amp; <string-name><given-names>P. F.</given-names> <surname>Assmann</surname></string-name> (Eds.), <source>The Routledge handbook of phonetics</source> (pp. <fpage>401</fpage>&#8211;<lpage>426</lpage>). <publisher-name>Routledge</publisher-name>. <pub-id pub-id-type="doi">10.4324/9780429056253-15</pub-id></mixed-citation></ref>
<ref id="B31"><mixed-citation publication-type="book"><string-name><surname>Harris</surname>, <given-names>A.</given-names></string-name> (<year>2002</year>). <chapter-title>The word in Georgian</chapter-title>. In <string-name><surname>Dixon</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name><given-names>A.</given-names> <surname>Aikhenvald</surname></string-name> (Eds.), <source>Word: A cross-linguistic typology</source> (pp. <fpage>127</fpage>&#8211;<lpage>142</lpage>). <publisher-name>Cambridge University Press</publisher-name>.</mixed-citation></ref>
<ref id="B32"><mixed-citation publication-type="book"><string-name><surname>Hayes</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Kirchner</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name><surname>Steriade</surname>, <given-names>D.</given-names></string-name> (Eds.) (<year>2004</year>). <source>Phonetically based phonology</source>. <publisher-name>Cambridge University Press</publisher-name>. <pub-id pub-id-type="doi">10.1017/CBO9780511486401</pub-id></mixed-citation></ref>
<ref id="B33"><mixed-citation publication-type="book"><string-name><surname>Henke</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Kaisse</surname>, <given-names>E. M.</given-names></string-name>, &amp; <string-name><surname>Wright</surname>, <given-names>R.</given-names></string-name> (<year>2012</year>). <chapter-title>Is the Sonority Sequencing Principle an epiphenomenon?</chapter-title> In <string-name><surname>Parker</surname>, <given-names>S.</given-names></string-name> (Ed.), <source>The sonority controversy</source> (pp. <fpage>65</fpage>&#8211;<lpage>100</lpage>). <publisher-name>Mouton De Gruyter</publisher-name>. <pub-id pub-id-type="doi">10.1515/9783110261523.65</pub-id></mixed-citation></ref>
<ref id="B34"><mixed-citation publication-type="journal"><string-name><surname>Iskarous</surname>, <given-names>K.</given-names></string-name> (<year>2017</year>). <article-title>The relation between the continuous and the discrete: A note on the first principles of speech dynamics</article-title>. <source>Journal of Phonetics</source>, <volume>64</volume>, <fpage>8</fpage>&#8211;<lpage>20</lpage>. <pub-id pub-id-type="doi">10.1016/j.wocn.2017.05.003</pub-id></mixed-citation></ref>
<ref id="B35"><mixed-citation publication-type="journal"><string-name><surname>Iskarous</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name><surname>Kavitskaya</surname>, <given-names>D.</given-names></string-name> (<year>2018</year>). <article-title>Sound change and the structure of synchronic variability: Phonetic and phonological factors in Slavic palatalization</article-title>. <source>Language</source>, <volume>94</volume>(<issue>1</issue>), <fpage>43</fpage>&#8211;<lpage>83</lpage>. <pub-id pub-id-type="doi">10.1353/lan.2018.0001</pub-id></mixed-citation></ref>
<ref id="B36"><mixed-citation publication-type="journal"><string-name><surname>Karlin</surname>, <given-names>R.</given-names></string-name> (<year>2022</year>). <article-title>Expanding the gestural model of lexical tone: Evidence from two dialects of Serbian</article-title>. <source>Journal of Laboratory Phonology</source>, <volume>13</volume>(<issue>1</issue>). <pub-id pub-id-type="doi">10.16995/labphon.6443</pub-id></mixed-citation></ref>
<ref id="B37"><mixed-citation publication-type="journal"><string-name><surname>Katsika</surname>, <given-names>A.</given-names></string-name> (<year>2016</year>). <article-title>The role of prominence in determining the scope of boundary lengthening in Greek</article-title>. <source>Journal of Phonetics</source>, <volume>55</volume>, <fpage>149</fpage>&#8211;<lpage>181</lpage>. <pub-id pub-id-type="doi">10.1016/j.wocn.2015.12.003</pub-id></mixed-citation></ref>
<ref id="B38"><mixed-citation publication-type="journal"><string-name><surname>Katz</surname>, <given-names>W. F.</given-names></string-name>, <string-name><surname>Bharadwaj</surname>, <given-names>S. V.</given-names></string-name>, &amp; <string-name><surname>Stettler</surname>, <given-names>M. P.</given-names></string-name> (<year>2006</year>). <article-title>Influences of electromagnetic articulography sensors on speech produced by healthy adults and individuals with aphasia and apraxia</article-title>. <source>Journal of Speech, Language, and Hearing Research</source>, <volume>49</volume>(<issue>3</issue>), <fpage>645</fpage>&#8211;<lpage>659</lpage>. <pub-id pub-id-type="doi">10.1044/1092-4388(2006/047)</pub-id></mixed-citation></ref>
<ref id="B39"><mixed-citation publication-type="journal"><string-name><surname>Kochetov</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Goldstein</surname>, <given-names>L.</given-names></string-name> (<year>2005</year>). <article-title>Position and place effect in Russian word-initial and word-medial stop clusters</article-title>. <source>Journal of the Acoustic Society of America</source>, <volume>117</volume>(<issue>4</issue>), <elocation-id>2571</elocation-id>. <pub-id pub-id-type="doi">10.1121/1.4788568</pub-id></mixed-citation></ref>
<ref id="B40"><mixed-citation publication-type="book"><string-name><surname>Kochetov</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Pouplier</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Son</surname>, <given-names>M.</given-names></string-name> (<year>2007</year>). <chapter-title>Cross-language differences in overlap and assimilation patterns in Korean and Russian</chapter-title>. <source>Proceedings of the 16<sup>th</sup> International Congress of Phonetic Sciences</source> (pp. <fpage>1361</fpage>&#8211;<lpage>1364</lpage>), <publisher-name>Saarbr&#252;cken</publisher-name>.</mixed-citation></ref>
<ref id="B41"><mixed-citation publication-type="journal"><string-name><surname>Kochetov</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>So</surname>, <given-names>C. K.</given-names></string-name> (<year>2007</year>). <article-title>Place assimilation and phonetic grounding: A cross-linguistic perceptual study</article-title>. <source>Phonology</source>, <volume>24</volume>(<issue>3</issue>), <fpage>397</fpage>&#8211;<lpage>432</lpage>. <pub-id pub-id-type="doi">10.1017/S0952675707001273</pub-id></mixed-citation></ref>
<ref id="B42"><mixed-citation publication-type="journal"><string-name><surname>Krivokapi&#263;</surname>, <given-names>J.</given-names></string-name> (<year>2007</year>). <article-title>Prosodic planning: Effects of phrasal length and complexity on pause duration</article-title>. <source>Journal of Phonetics</source>, <volume>35</volume>, <fpage>162</fpage>&#8211;<lpage>179</lpage>. <pub-id pub-id-type="doi">10.1016/j.wocn.2006.04.001</pub-id></mixed-citation></ref>
<ref id="B43"><mixed-citation publication-type="journal"><string-name><surname>K&#252;hnert</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Hoole</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Mooshammer</surname>, <given-names>C.</given-names></string-name> (<year>2006</year>). <article-title>Gestural overlap and C-center in selected French consonant clusters</article-title>. <source>Proceedings of the 7th International Seminar on Speech Production</source>. (pp. <fpage>327</fpage>&#8211;<lpage>334</lpage>).</mixed-citation></ref>
<ref id="B44"><mixed-citation publication-type="journal"><string-name><surname>Kwon</surname>, <given-names>H.</given-names></string-name>, &amp; <string-name><surname>Chitoran</surname>, <given-names>I.</given-names></string-name> (<year>2024</year>). <article-title>Perception of illusory clusters: The role of native timing</article-title>. <source>Phonetica</source>. <pub-id pub-id-type="doi">10.1515/phon-2023-2005</pub-id></mixed-citation></ref>
<ref id="B45"><mixed-citation publication-type="webpage"><string-name><surname>Lenth</surname>, <given-names>R. V.</given-names></string-name> (<year>2022</year>). <article-title>emmeans: Estimated marginal means, aka least-squares means. R package version 1.7.3</article-title>, &lt;<uri>https://CRAN.R-project.org/package=emmeans</uri>&gt;</mixed-citation></ref>
<ref id="B46"><mixed-citation publication-type="journal"><string-name><surname>Marslen-Wilson</surname>, <given-names>W.</given-names></string-name>, &amp; <string-name><surname>Zwitserlood</surname>, <given-names>P.</given-names></string-name> (<year>1989</year>). <article-title>Accessing spoken words: The importance of word onsets</article-title>. <source>Journal of Experimental Psychology: Human Perception and Performance</source>, <volume>15</volume>(<issue>3</issue>), <fpage>576</fpage>&#8211;<lpage>585</lpage>. <pub-id pub-id-type="doi">10.1037/0096-1523.15.3.576</pub-id></mixed-citation></ref>
<ref id="B47"><mixed-citation publication-type="book"><string-name><surname>Mattingly</surname>, <given-names>I. G.</given-names></string-name> (<year>1981</year>). <chapter-title>Phonetic representation and speech synthesis by rule</chapter-title>. In <string-name><given-names>T.</given-names> <surname>Myers</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Laver</surname></string-name>, &amp; <string-name><given-names>J.</given-names> <surname>Anderson</surname></string-name> (Eds.), <source>The Cognitive Representation of Speech</source> (pp. <fpage>415</fpage>&#8211;<lpage>420</lpage>). <publisher-name>North-Holland</publisher-name>. <pub-id pub-id-type="doi">10.1016/S0166-4115(08)60217-4</pub-id></mixed-citation></ref>
<ref id="B48"><mixed-citation publication-type="book"><string-name><surname>Ohala</surname>, <given-names>J. J.</given-names></string-name> (<year>1981</year>). <chapter-title>The listener as a source of sound change</chapter-title>. In <string-name><given-names>C. S.</given-names> <surname>Masek</surname></string-name>, <string-name><given-names>R. A.</given-names> <surname>Hendrick</surname></string-name>, &amp; <string-name><given-names>M. F.</given-names> <surname>Miller</surname></string-name> (Eds.), <source>Papers from a Parasession on Language and Behavior</source> (pp. <fpage>178</fpage>&#8211;<lpage>203</lpage>). <publisher-name>Chicago Linguistics Society</publisher-name>.</mixed-citation></ref>
<ref id="B49"><mixed-citation publication-type="thesis"><string-name><surname>Peng</surname>, <given-names>S.-H.</given-names></string-name> (<year>1996</year>). <source>Phonetic implementation and perception of place coarticulation and tone sandhi</source>. [Doctoral dissertation, <publisher-name>Ohio State University</publisher-name>]. <uri>http://rave.ohiolink.edu/etdc/view?acc_num=osu1384525774</uri></mixed-citation></ref>
<ref id="B50"><mixed-citation publication-type="journal"><string-name><surname>Perkell</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Cohen</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Svirsky</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Matthies</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Garabieta</surname>, <given-names>I.</given-names></string-name>, &amp; <string-name><surname>Jackson</surname>, <given-names>M.</given-names></string-name> (<year>1992</year>). <article-title>Electromagnetic midsagittal articulometer (EMMA) systems for transducing speech articulatory movements</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>92</volume>(<issue>6</issue>), <fpage>3078</fpage>&#8211;<lpage>3096</lpage>. <pub-id pub-id-type="doi">10.1121/1.404204</pub-id></mixed-citation></ref>
<ref id="B51"><mixed-citation publication-type="journal"><string-name><surname>Pouplier</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Marin</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Hoole</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Kochetov</surname>, <given-names>A.</given-names></string-name> (<year>2017</year>). <article-title>Speech rate effects in Russian onset clusters are modulated by frequency, but not auditory cue robustness</article-title>. <source>Journal of Phonetics</source>, <volume>64</volume>, <fpage>108</fpage>&#8211;<lpage>126</lpage>. <pub-id pub-id-type="doi">10.1016/j.wocn.2017.01.006</pub-id></mixed-citation></ref>
<ref id="B52"><mixed-citation publication-type="journal"><string-name><surname>Pouplier</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Past&#228;tter</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Hoole</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Marin</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Chitoran</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Lentz</surname>, <given-names>T. O.</given-names></string-name>, &amp; <string-name><surname>Kochetov</surname>, <given-names>A.</given-names></string-name> (<year>2022</year>). <article-title>Language and cluster-specific effects in the timing of onset consonant sequences in seven languages</article-title>. <source>Journal of Phonetics</source>, <volume>93</volume>. <pub-id pub-id-type="doi">10.1016/j.wocn.2022.101153</pub-id></mixed-citation></ref>
<ref id="B53"><mixed-citation publication-type="webpage"><collab>R Core Team</collab>. (<year>2022</year>). <article-title>R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria</article-title>. <uri>https://www.R-project.org/</uri>.</mixed-citation></ref>
<ref id="B54"><mixed-citation publication-type="journal"><string-name><surname>Remijsen</surname>, <given-names>B.</given-names></string-name>, &amp; <string-name><surname>Ayoker</surname>, <given-names>O. G.</given-names></string-name> (<year>2014</year>). <article-title>Contrastive tonal alignment in falling contours in Shilluk</article-title>. <source>Phonology</source>, <volume>31</volume>(<issue>3</issue>), <fpage>435</fpage>&#8211;<lpage>462</lpage>. <pub-id pub-id-type="doi">10.1017/S0952675714000219</pub-id></mixed-citation></ref>
<ref id="B55"><mixed-citation publication-type="journal"><string-name><surname>Ridouane</surname>, <given-names>R.</given-names></string-name> (<year>2016</year>). <article-title>Leading issues in Tashlhiyt phonology</article-title>. <source>Language and Linguistics Compass</source>, <volume>10</volume>(<issue>11</issue>), <fpage>644</fpage>&#8211;<lpage>660</lpage>. <pub-id pub-id-type="doi">10.1111/lnc3.12211</pub-id></mixed-citation></ref>
<ref id="B56"><mixed-citation publication-type="journal"><string-name><surname>Ridouane</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name><surname>Cooper-Leavitt</surname>, <given-names>J.</given-names></string-name> (<year>2019</year>). <article-title>A story of two schwas: A production study from Tashlhiyt</article-title>. <source>Phonology</source>, <volume>36</volume>(<issue>3</issue>), <fpage>433</fpage>&#8211;<lpage>456</lpage>. <pub-id pub-id-type="doi">10.1017/S0952675719000216</pub-id></mixed-citation></ref>
<ref id="B57"><mixed-citation publication-type="journal"><string-name><surname>Ridouane</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name><surname>Fougeron</surname>, <given-names>C.</given-names></string-name> (<year>2011</year>). <article-title>Schwa elements in Tashlhiyt word-initial clusters</article-title>. <source>Journal of Laboratory Phonology</source>, <volume>2</volume>, <fpage>275</fpage>&#8211;<lpage>300</lpage>. <pub-id pub-id-type="doi">10.1515/labphon.2011.010</pub-id></mixed-citation></ref>
<ref id="B58"><mixed-citation publication-type="journal"><string-name><surname>Ridouane</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Hermes</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Hall&#233;</surname>, <given-names>P.</given-names></string-name> (<year>2014</year>). <article-title>Tashlhiyt&#8217;s ban of complex syllable onsets: Phonetic and perceptual evidence</article-title>. <source>STUF &#8211; Language Typology and Universals</source>, <volume>67</volume>(<issue>1</issue>), <fpage>7</fpage>&#8211;<lpage>20</lpage>. <pub-id pub-id-type="doi">10.1515/stuf-2014-0002</pub-id></mixed-citation></ref>
<ref id="B59"><mixed-citation publication-type="journal"><string-name><surname>Roon</surname>, <given-names>K. D.</given-names></string-name>, <string-name><surname>Hoole</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Zeroual</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Du</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Gafos</surname>, <given-names>A. I.</given-names></string-name> (<year>2021</year>). <article-title>Stiffness and articulatory overlap in Moroccan Arabic consonant clusters</article-title>. <source>Laboratory Phonology: Journal of the Association for Laboratory Phonology</source>, <volume>12</volume>(<issue>1</issue>), <elocation-id>8</elocation-id>. <pub-id pub-id-type="doi">10.5334/labphon.272</pub-id></mixed-citation></ref>
<ref id="B60"><mixed-citation publication-type="journal"><string-name><surname>Rubin</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Baer</surname>, <given-names>T.</given-names></string-name>, &amp; <string-name><surname>Mermelstein</surname>, <given-names>P.</given-names></string-name> (<year>1981</year>). <article-title>An articulatory synthesizer for perceptual research</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>70</volume>, <fpage>321</fpage>&#8211;<lpage>328</lpage>. <pub-id pub-id-type="doi">10.1121/1.386780</pub-id></mixed-citation></ref>
<ref id="B61"><mixed-citation publication-type="journal"><string-name><surname>Shaw</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Oh</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Durvasula</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name><surname>Kochetov</surname>, <given-names>A.</given-names></string-name> (<year>2021</year>). <article-title>Articulatory coordination distinguishes complex segments from segment sequences</article-title>. <source>Phonology</source>, <volume>38</volume>(<issue>3</issue>), <fpage>437</fpage>&#8211;<lpage>477</lpage>. <pub-id pub-id-type="doi">10.1017/S0952675721000269</pub-id></mixed-citation></ref>
<ref id="B62"><mixed-citation publication-type="journal"><string-name><surname>Shrestha</surname>, <given-names>N.</given-names></string-name> (<year>2020</year>). <article-title>Detecting multicollinearity in regression analysis</article-title>. <source>American Journal of Applied Mathematics and Statistics</source>, <volume>8</volume>(<issue>2</issue>), <fpage>39</fpage>&#8211;<lpage>42</lpage>. <pub-id pub-id-type="doi">10.12691/ajams-8-2-1</pub-id></mixed-citation></ref>
<ref id="B63"><mixed-citation publication-type="journal"><string-name><surname>Son</surname>, <given-names>M.</given-names></string-name> (<year>2008</year>). <article-title>Gradient reduction of C1 in /pk/ sequences</article-title>. <source>Phonetic Sciences</source>, <volume>15</volume>(<issue>4</issue>), <fpage>43</fpage>&#8211;<lpage>65</lpage>.</mixed-citation></ref>
<ref id="B64"><mixed-citation publication-type="journal"><string-name><surname>Sorensen</surname>, <given-names>T.</given-names></string-name>, &amp; <string-name><surname>Gafos</surname>, <given-names>A.</given-names></string-name> (<year>2016</year>). <article-title>The gesture as an autonomous nonlinear dynamical system</article-title>. <source>Ecological Psychology</source>, <volume>28</volume>(<issue>4</issue>), <fpage>188</fpage>&#8211;<lpage>215</lpage>. <pub-id pub-id-type="doi">10.1080/10407413.2016.1230368</pub-id></mixed-citation></ref>
<ref id="B65"><mixed-citation publication-type="journal"><string-name><surname>Sotiropoulou</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Gafos</surname>, <given-names>A.</given-names></string-name> (<year>2022</year>). <article-title>Phonetic indices of syllabic organization in German stop-lateral clusters</article-title>. <source>Journal of the Association for Laboratory Phonology</source>, <volume>13</volume>(<issue>1</issue>), <fpage>1</fpage>&#8211;<lpage>42</lpage>. <pub-id pub-id-type="doi">10.16995/labphon.6440</pub-id></mixed-citation></ref>
<ref id="B66"><mixed-citation publication-type="book"><string-name><surname>Steriade</surname>, <given-names>D.</given-names></string-name> (<year>2008</year>). <chapter-title>The phonology of perceptibility effects: The P-Map and its consequences for constraint organization</chapter-title>. In <string-name><given-names>K.</given-names> <surname>Hanson</surname></string-name>, &amp; <string-name><given-names>S.</given-names> <surname>Inkelas</surname></string-name> (Eds.), <source>The nature of the word: Studies in honor of Paul Kiparsky</source> (pp. <fpage>150</fpage>&#8211;<lpage>179</lpage>). <publisher-name>MIT Press</publisher-name>. <pub-id pub-id-type="doi">10.7551/mitpress/9780262083799.003.0007</pub-id></mixed-citation></ref>
<ref id="B67"><mixed-citation publication-type="journal"><string-name><surname>Surprenant</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Goldstein</surname>, <given-names>L.</given-names></string-name> (<year>1998</year>). <article-title>The perception of speech gestures</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>104</volume>(<issue>1</issue>), <fpage>518</fpage>&#8211;<lpage>529</lpage>. <pub-id pub-id-type="doi">10.1121/1.423253</pub-id></mixed-citation></ref>
<ref id="B68"><mixed-citation publication-type="journal"><string-name><surname>Svensson Lundmark</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Frid</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Ambrazaitis</surname>, <given-names>G.</given-names></string-name>, &amp; <string-name><surname>Sch&#246;tz</surname>, <given-names>S.</given-names></string-name> (<year>2021</year>). <article-title>Word-initial consonant-vowel coordination in a lexical pitch-accent language</article-title>. <source>Phonetica</source>, <volume>78</volume>(<issue>5&#8211;6</issue>), <fpage>515</fpage>&#8211;<lpage>569</lpage>. <pub-id pub-id-type="doi">10.1515/phon-2021-2014</pub-id></mixed-citation></ref>
<ref id="B69"><mixed-citation publication-type="book"><string-name><surname>Tienkamp</surname>, <given-names>T. B.</given-names></string-name>, <string-name><surname>Rebernik</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Jacobi</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Wieling</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Abur</surname>, <given-names>D.</given-names></string-name> (<year>2024</year>). <chapter-title>The impact of electromagnetic articulography sensors on the articulatory-acoustic vowel space in speakers with and without Parkinson&#8217;s disease</chapter-title>. <source>Proceedings of the 13</source>th <italic>International Seminar on Speech Production</italic>. <day>13&#8211;17</day> <month>May</month> 2024, <publisher-loc>Autrans, France</publisher-loc>. <pub-id pub-id-type="doi">10.21437/issp.2024-24</pub-id></mixed-citation></ref>
<ref id="B70"><mixed-citation publication-type="book"><string-name><surname>Wilson</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name><surname>Davidson</surname>, <given-names>L.</given-names></string-name> (<year>2013</year>). <chapter-title>Bayesian analysis of non-native cluster production</chapter-title>. In <string-name><given-names>S.</given-names> <surname>Kan</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Moore-Cantwell</surname></string-name>, &amp; <string-name><given-names>R.</given-names> <surname>Staubs</surname></string-name> (Eds.), <source>Proceedings of NELS</source> <volume>40</volume>. <publisher-loc>Amherst, MA</publisher-loc>: <publisher-name>Graduate Linguistics Student Association</publisher-name> (pp. <fpage>265</fpage>&#8211;<lpage>278</lpage>).</mixed-citation></ref>
<ref id="B71"><mixed-citation publication-type="journal"><string-name><surname>Wilson</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Davidson</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Martin</surname>, <given-names>S.</given-names></string-name> (<year>2014</year>). <article-title>Effects of acoustic-phonetic detail on cross-language speech production</article-title>. <source>Journal of Memory and Language</source>, <volume>77</volume>, <fpage>1</fpage>&#8211;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.1016/j.jml.2014.08.001</pub-id></mixed-citation></ref>
<ref id="B72"><mixed-citation publication-type="thesis"><string-name><surname>Wright</surname>, <given-names>R.</given-names></string-name> (<year>1996</year>). <source>Consonant clusters and cue preservation in Tsou</source>. [Doctoral dissertation, <publisher-name>University of California</publisher-name>, <publisher-loc>Los Angeles</publisher-loc>]. <uri>https://linguistics.ucla.edu/images/stories/wright.1996.pdf</uri></mixed-citation></ref>
<ref id="B73"><mixed-citation publication-type="book"><string-name><surname>Wright</surname>, <given-names>R.</given-names></string-name> (<year>2001</year>). <chapter-title>Perceptual cues in contrast maintenance</chapter-title>. In <string-name><given-names>K.</given-names> <surname>Johnson</surname></string-name>, &amp; <string-name><given-names>E.</given-names> <surname>Hume</surname></string-name> (Eds.), <source>The role of speech perception in phonology</source> (pp. <fpage>251</fpage>&#8211;<lpage>277</lpage>). <publisher-name>Brill</publisher-name>. <pub-id pub-id-type="doi">10.1163/9789004454095_014</pub-id></mixed-citation></ref>
<ref id="B74"><mixed-citation publication-type="thesis"><string-name><surname>Yanagawa</surname>, <given-names>M.</given-names></string-name> (<year>2006</year>). <source>Articulatory timing in first and second language: A cross-linguistic study</source>. [Doctoral dissertation, <publisher-name>Yale University</publisher-name>].</mixed-citation></ref>
<ref id="B75"><mixed-citation publication-type="thesis"><string-name><surname>Yip</surname>, <given-names>J. C. K.</given-names></string-name> (<year>2013</year>). <source>Phonetic effects on the timing of gestural coordination in Modern Greek consonant clusters</source>. [Doctoral dissertation, <publisher-name>University of Michigan</publisher-name>]. <uri>https://www.proquest.com/docview/1497967202</uri></mixed-citation></ref>
<ref id="B76"><mixed-citation publication-type="journal"><string-name><surname>Zellou</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Lahrouchi</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Bensoukas</surname>, <given-names>K.</given-names></string-name> (<year>2024</year>). <article-title>The perception of vowelless words in Tashlhiyt</article-title>. <source>Glossa: A journal of general linguistics</source>, <volume>8</volume>(<issue>1</issue>), <fpage>1</fpage>&#8211;<lpage>41</lpage>. <pub-id pub-id-type="doi">10.16995/glossa.10438</pub-id></mixed-citation></ref>
<ref id="B77"><mixed-citation publication-type="book"><string-name><surname>Zhgenti</surname>, <given-names>S.</given-names></string-name> (<year>1956</year>). <source>Kartuli enis ponetika</source> [Phonetics of the Georgian language]. <publisher-loc>Tbilisi</publisher-loc>.</mixed-citation></ref>
<ref id="B78"><mixed-citation publication-type="journal"><string-name><surname>Zsiga</surname>, <given-names>E. C.</given-names></string-name> (<year>1994</year>). <article-title>Acoustic evidence for gestural overlap in consonant sequences</article-title>. <source>Journal of Phonetics</source>, <volume>22</volume>, <fpage>121</fpage>&#8211;<lpage>140</lpage>. <pub-id pub-id-type="doi">10.1016/S0095-4470(19)30189-5</pub-id></mixed-citation></ref>
<ref id="B79"><mixed-citation publication-type="journal"><string-name><surname>Zsiga</surname>, <given-names>E. C.</given-names></string-name> (<year>2000</year>). <article-title>Phonetic alignment constraints: Consonant overlap and palatalization in English and Russian</article-title>. <source>Journal of Phonetics</source>, <volume>28</volume>, <fpage>69</fpage>&#8211;<lpage>102</lpage>. <pub-id pub-id-type="doi">10.1006/jpho.2000.0109</pub-id></mixed-citation></ref>
<ref id="B80"><mixed-citation publication-type="journal"><string-name><surname>Zsiga</surname>, <given-names>E. C.</given-names></string-name> (<year>2003</year>). <article-title>Articulatory timing in a second language: Evidence from Russian and English</article-title>. <source>Studies in Second Language Acquisition</source>, <volume>25</volume>, <fpage>399</fpage>&#8211;<lpage>432</lpage>. <pub-id pub-id-type="doi">10.1017/S0272263103000160</pub-id></mixed-citation></ref>
</ref-list>
</back>
</article>