<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd">
<!--<?xml-stylesheet type="text/xsl" href="article.xsl"?>-->
<article article-type="research-article" dtd-version="1.0" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id journal-id-type="issn">1868-6354</journal-id>
<journal-title-group>
<journal-title>Laboratory Phonology: Journal of the Association for Laboratory Phonology</journal-title>
</journal-title-group>
<issn pub-type="epub">1868-6354</issn>
<publisher>
<publisher-name>Ubiquity Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5334/labphon.36</article-id>
<article-categories>
<subj-group>
<subject>Journal article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Evidence and characterization of a glide-vowel distinction in American English</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Jaggers</surname>
<given-names>Zachary Scott</given-names>
</name>
<email>zackjaggers@nyu.edu</email>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
</contrib-group>
<aff id="aff-1"><label>1</label>Linguistics, New York University, NY, US</aff>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2018-02-07">
<day>07</day>
<month>02</month>
<year>2018</year>
</pub-date>
<pub-date pub-type="collection">
<year>2018</year>
</pub-date>
<volume>9</volume>
<issue>1</issue>
<elocation-id>3</elocation-id>
<history>
<date date-type="received" iso-8601-date="2016-07-19">
<day>19</day>
<month>07</month>
<year>2016</year>
</date>
<date date-type="accepted" iso-8601-date="2017-09-14">
<day>14</day>
<month>09</month>
<year>2017</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright: &#x00A9; 2018 The Author(s)</copyright-statement>
<copyright-year>2018</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.journal-labphon.org/articles/10.5334/labphon.36/"/>
<abstract>
<p>This study tests whether native speakers of American English exhibit a glide-vowel distinction ([j]-[i]) in a speech elicitation experiment. When reading sentences out loud, participants&#8217; pronunciations of 4 near-minimal pairs of pre-existing lexical items (e.g., <italic>Eston</italic>[i&#601;] vs. <italic>pneumon</italic>[j&#601;]) exhibit significant differences when acoustically measured, confirming the presence of a [j]-[i] distinction. This distinction is also found to be productively extended to the production of 20 near-minimal pairs of nonce words (e.g., <italic>S&#250;mia</italic> &#8594; [sumi&#601;] vs. <italic>F&#237;mya</italic> &#8594; [fimj&#601;]), diversified and balanced along different phonologically relevant factors of the surrounding environment. Multiple acoustic measurements are compared to test what aspects most consistently convey the distinction: F2 (frontness), F1 (height), intensity, vocalic sequence duration, transition earliness, and transition speed. This serves the purpose of documenting the distinction&#8217;s acoustic phonetic realization. It also serves in the comparison of phonological representations. Multiple types of previously proposed phonological representations are considered along with the competing predictions they generate regarding the acoustic measurements performed. Results suggest that the primary and most consistent characteristic of the distinction is earliness of transition into the following vowel, with results also suggesting that the [j] glide has a greater degree of constriction. The [j] glide is found to have a significantly <italic>less</italic> anterior articulation, challenging the application of a representation based on place or articulator differences that would predict [j] to be <italic>more</italic> anterior.</p>
</abstract>
<kwd-group>
<kwd>acoustic phonetics</kwd>
<kwd>glides</kwd>
<kwd>semivowels</kwd>
<kwd>semiconsonants</kwd>
<kwd>hiatus</kwd>
<kwd>representation</kwd>
<kwd>features</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec>
<title>1. Introduction</title>
<p>Pre-existing lexical items suggest that a glide-vowel distinction exists in near-minimally paired environments in American English:</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(1)</p></list-item>
</list>
<list list-type="sentence-gloss">
<list-item>
<list list-type="word">
<list-item><p>[CVV]:</p></list-item>
<list-item><p>[CGV]:</p></list-item>
</list>
<list list-type="word">
<list-item><p><italic>Estonia</italic></p></list-item>
<list-item><p><italic>Pneumonia</italic></p></list-item>
</list>
<list list-type="word">
<list-item><p>[&#603;sto&#769;ni&#601;],</p></list-item>
<list-item><p>[n&#650;mo&#769;nj&#601;],</p></list-item>
</list>
<list list-type="word">
<list-item><p><italic>millennia</italic></p></list-item>
<list-item><p><italic>Kenya</italic></p></list-item>
</list>
<list list-type="word">
<list-item><p>[m&#618;l&#603;&#769;ni&#601;],</p></list-item>
<list-item><p>[k&#603;&#769;nj&#601;],</p></list-item>
</list>
<list list-type="word">
<list-item><p><italic>duet</italic></p></list-item>
<list-item><p><italic>dwell</italic></p></list-item>
</list>
<list list-type="word">
<list-item><p>[du&#603;&#769;t]</p></list-item>
<list-item><p>[dw&#603;l]</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>However, the precise nature and representation of this distinction have not yet been established. Phonetic documentation, which could help in deciding between the representations proposed so far, is also lacking. While the examples in (1) suggest that a glide-vowel distinction may be apparent in both the [j]-[i] and [w]-[u] paradigms, this study focuses on the [j]-[i] distinction. Using a speech elicitation experiment, this study tests for the [j]-[i] glide-vowel distinction in American English. It also collects phonetic data along a variety of characteristics in an effort to determine the proper representation.</p>
<p>This study tests whether the [j]-[i] distinction can be elicited in pre-existing lexical item pairs like those above. It also tests whether this distinction can be productively extended to newly encountered words and elicited via a &lt;y&gt; vs. &lt;i&gt; orthographic distinction. Acoustic analysis is used to capture the most consistent characteristics of the distinction. This provides documentation of the distinction, as well as guidance for how future research may best examine it. Furthermore, three broad classes of competing phonological representations that have been previously put forth are considered in light of the current analysis. This study therefore not only tests whether such a distinction is available to American English speakers; it also compares these representations, considers the acoustic predictions they generate, and applies these predictions to the data at hand. This may help adjudicate between these competing representations, by either identifying one optimal approach or at least ruling one out.</p>
</sec>
<sec>
<title>2. Background</title>
<sec>
<title>2.1. Competing phonological representations</title>
<p>Competing accounts debate whether glide-vowel distinctions are phonologically possible and attested. One argument (e.g., <xref ref-type="bibr" rid="B68">Steriade, 1984</xref>; <xref ref-type="bibr" rid="B39">Kaye &amp; Lowenstamm, 1984</xref>; <xref ref-type="bibr" rid="B16">Durand, 1987</xref>; <xref ref-type="bibr" rid="B15">Deligiorgis, 1988</xref>) is that there is no underlying distinction between vowels and glides, positing that glides are instead always phonologically derived from underlying vowels in certain environments. Levi (<xref ref-type="bibr" rid="B45">2004</xref>, <xref ref-type="bibr" rid="B46">2008</xref>), however, provides evidence from multiple languages in which glide surface forms are not fully predictable from their surrounding environment. This unpredictability leads Levi to conclude that languages can underlyingly distinguish between glides and vowels. This study maintains the assumption (as strongly motivated by <xref ref-type="bibr" rid="B45">Levi, 2004</xref>) that glide-vowel distinctions are <italic>available</italic> to the human phonological faculty, and it tests whether such a distinction is present and productive in the phonological system of American English. However, a further debate remains open regarding how such distinctions should be phonologically <italic>represented</italic>. This study therefore considers competing representation accounts and compares them as candidates for representing the apparent distinction in American English.</p>
<p>One kind of representation account proposes that glide-vowel distinctions are attributable to a distinction in the segment&#8217;s primary articulator: A <italic>place-based representation</italic>. In an analysis of multiple languages, Levi (<xref ref-type="bibr" rid="B45">2004</xref>, <xref ref-type="bibr" rid="B46">2008</xref>) proposes that all vowels are primarily [Dorsal], while glides&#8217; primary articulators differ. Levi suggests that /w/ is primarily [Labial], while /j/ is primarily [Coronal], in accord with Halle et al.&#8217;s (<xref ref-type="bibr" rid="B30">2000</xref>) Revised Articulator Theory. As one example, Levi (<xref ref-type="bibr" rid="B46">2008</xref>) describes Pulaar as having both derived and underlyingly phonemic glides, with the derived glides being predictable by the surrounding environment while the phonemic ones are not. To argue how they are represented, Levi analyzes how the glides participate in a previously documented (<xref ref-type="bibr" rid="B57">Paradis, 1992</xref>) process of consonant gradation, alternating with more fortified counterparts. She demonstrates that the fortified versions of phonemic /j/ and /w/ have coronal and labial places of articulation ([d&#865;&#658;] and [b], respectively), while their derived (underlyingly vocalic) counterparts fortify to a dorsal place of articulation ([&#609;]). This representation approach is similar, though not identical, to other representations previously put forth, such as proposals that palatal [j] is both [Coronal] and [Dorsal] (e.g., <xref ref-type="bibr" rid="B40">Keating, 1988</xref>; <xref ref-type="bibr" rid="B54">Nevins &amp; Chitoran, 2008</xref>). In terms of how such a distinction might manifest in production, we might expect tighter constriction at these more anterior places of articulation. 
Regarding another language, Karuk, Levi (<xref ref-type="bibr" rid="B46">2008</xref>) corroborates such predictions while discussing how the /w/ glide is documented (<xref ref-type="bibr" rid="B4">Bright, 1957</xref>) to exhibit bilabial frication. Furthermore, Keating&#8217;s (<xref ref-type="bibr" rid="B40">1988</xref>) conclusion that [j] is both [Coronal] and [Dorsal] comes from X-ray analysis demonstrating coronal constriction during the production of [j]. Therefore, in the analysis at hand, the primary articulation of the [j] glide would be predicted by this account to be more anterior than that of its [i] vowel counterpart. In terms of constraints on the phonological distribution, such a representation might predict this distinction to exhibit homorganicity effects and be constrained by the place of articulation of surrounding sounds.</p>
<p>Another kind of account, henceforth referred to as a <italic>constriction-based representation</italic>, makes use of the notion of constriction degree to distinguish glides and vowels, positing that the production of glides involves tighter constriction of the vocal tract than the production of their vowel counterparts. Padgett (<xref ref-type="bibr" rid="B56">2008</xref>) proposes that in systems with a glide-vowel contrast there is no difference of articulator or frontness of articulation, but only that of the feature [&#177;vocalic]. Padgett proposes that glides, being [&#8211;vocalic], have a distinctly tighter degree of constriction than their [+vocalic] counterparts, equal along all other featural dimensions. One of the examples Padgett provides is the previously documented (<xref ref-type="bibr" rid="B72">Townsend &amp; Janda, 1996</xref>) pattern of Slavic stops mutating into palatalized, affricated counterparts. This pattern is more frequent when a stop is followed by a glide than when followed by a vowel, and more frequent when a vowel is high than when it is not. Padgett argues that this scale of likelihood is attributable to the degree of constriction of the following segment: The narrower a following segment, when coupled with the release of the stop, the more likely the release is to be perceived and reanalyzed as affrication; therefore, the glide is narrower than its high vowel counterpart. This is similar to previous proposals suggesting that glides have narrower constriction targets (e.g., <xref ref-type="bibr" rid="B71">Straka, 1964</xref>; <xref ref-type="bibr" rid="B52">Maddieson &amp; Emmorey, 1985</xref>), with arguments referring both to phonological distribution and phonetic properties such as a lower acoustic intensity of glides. 
Such proposals vary, however, in precise feature specification, such as employing [&#177;consonantal] instead of [&#177;vocalic] (e.g., <xref ref-type="bibr" rid="B36">Hyman, 1985</xref>; <xref ref-type="bibr" rid="B32">Hayes, 1989</xref>; <xref ref-type="bibr" rid="B63">Rosenthall, 1994</xref>). For the distinction of interest in this study, such an account would entail that the production of a lingual [j] glide has a tighter constriction reached by a higher lingual articulation than its [i] vowel counterpart. In terms of distribution, this representation might predict the factor of sonority or openness to play a role in constraining the distinction.</p>
<p>Another kind of account is what will be referred to as a <italic>syllabic pre-linking account</italic>. Levi, in her cross-linguistic documentation and analysis (2004), maintains that two different types of glide-vowel distinctions are typologically possible. One is the place-based representation as introduced above. The other is based on Levin&#8217;s (<xref ref-type="bibr" rid="B47">1985</xref>) notion of &#8216;pre-linking&#8217; with respect to syllabification. In this type of system, glide and vowel counterparts are identically featured with respect to both place and constriction. Instead, identical segments in the lexical representation can be anticipatorily specified in terms of how they will be syllabified. While Levi (<xref ref-type="bibr" rid="B45">2004</xref>) concludes in favor of a place-based representation for the distinctions observed in some languages, she concludes that an apparent distinction in Spanish is best analyzed as a syllabic pre-linking kind of system, with some cases of vowels exceptionally specified in the lexeme&#8217;s underlying form to surface as vocalic syllable nuclei in environments where Spanish phonology more commonly dictates them to surface as their non-nuclear glide counterparts. This is in line with other previous accounts (e.g., <xref ref-type="bibr" rid="B62">Roca, 1997</xref>; <xref ref-type="bibr" rid="B31">Harris &amp; Kaisse, 1999</xref>) suggesting that the [GV]-[VV] distinction is a result of underlyingly specified syllabification of identically featured vowel phonemes. In the case at hand, both [j] and [i] surface forms would be underlyingly /i/ and an [iV] hiatus output would be the result of the underlying /i/ having been pre-specified to surface as a syllable nucleus, therefore not being parsed into the syllable margin as might otherwise be allowed or preferred by the language&#8217;s phonotactics. 
Regarding production, due to this difference in syllabification, such a distinction might be predicted to manifest mainly via timing differences, with a glide being shorter than its vowel counterpart but not of a differently specified target or degree of constriction. Along these lines, Catford (<xref ref-type="bibr" rid="B6">1977, p. 165</xref>) argues that glides are intrinsically fast and dynamic without a &#8216;noticeable duration&#8217; like their vowel counterparts (cf. <xref ref-type="bibr" rid="B51">Maddieson, 2008</xref>), but identical in terms of &#8216;articulatory stricture.&#8217; Regarding distribution, Levi (<xref ref-type="bibr" rid="B45">2004</xref>) suggests that such a distinction is more idiosyncratic and marked, and that the vocalic option is underrepresented across the lexicon.</p>
<p>The next section discusses where glide-vowel distinctions appear available or constrained in American English phonology. However, these observations regarding the distribution do not clearly indicate which kind of representation best applies to the case at hand, which motivates an analysis of production and of how it may adjudicate between the competing accounts.</p>
</sec>
<sec>
<title>2.2. Considering the phonological distribution in American English</title>
<p>The [j] glide in American English can occur as a simplex word-initial onset, followed by a variety of vowels (e.g., <italic>yolk</italic> [jok], <italic>year</italic> [ji&#633;], <italic>Yale</italic> [jel], <italic>young</italic> [j&#652;&#331;]). However, we rarely encounter word-initial cases of [iV] hiatus, except in a few cases in which the initial segment is stressed (e.g., <italic>eon</italic> [&#237;&#593;n], <italic>Ian</italic> [&#237;&#601;n])<xref ref-type="fn" rid="n1">1</xref>; the glide-vowel distinction may therefore not be available in this environment without being confounded with stress. Following word-initial consonants, the distinction appears to be available: Pairs like <italic>fjord</italic> [fj&#596;&#633;d] + <italic>Fiona</italic> [fi&#243;n&#601;] and <italic>dwell</italic> [dw&#603;l] + <italic>duet</italic> [du&#603;&#769;t] exhibit the [j]-[i] and [w]-[u] distinctions, respectively. This study therefore limits the analysis to environments with a preceding consonant ([C_V]), where the distinction between a glide and its unstressed vowel counterpart does seem to be available. Another narrowing of the scope of this analysis regards [ju]. This study focuses only on [j] when it is its own glide segment, rather than part of a diphthongal nucleus. There is a large and varied host of [CjV] items in which the following vowel ([_V]) is [u] (e.g., <italic>fume</italic> [fjum], <italic>huge</italic> [hjud&#865;&#658;], <italic>cute</italic> [kjut]). However, previous research has shown that /ju/ appears to pattern as a monomoraic diphthong in the English phonemic inventory (<xref ref-type="bibr" rid="B38">Jensen, 1993</xref>; <xref ref-type="bibr" rid="B13">Davis &amp; Hammond, 1995</xref>), in which the high front vocoid behaves differently from the glide of interest here. Smith (<xref ref-type="bibr" rid="B64">2003</xref>) further documents that nuclear vs. onset onglides can exhibit different phonological behavior. 
This further supports treating [j] as distinct from the glide in the [ju] diphthong. Throughout this paper, the [j] glide of interest is that which is not a member of the [ju] diphthong unless otherwise noted.</p>
<p>This section discusses the apparent distribution of glide-vowel distinctions in American English, narrowing in on the [C_V] environment where a glide-vowel distinction does seem to be available. There are further constraints apparent regarding the distribution of such a distinction, which may speak to our consideration of competing phonological representations. However, it is important to note that this distinction does not appear to be robustly prevalent across the English lexicon&#8212;an aspect which (according to Levi, <xref ref-type="bibr" rid="B45">2004</xref>) would lend support to applying the syllabic pre-linking representation&#8212;and is not well documented. It is also variable across words and speakers. Therefore, the reported surface forms of different words considered here are most certainly not meant to be presented as categorical, but they have been cross-checked with the reports of other linguists and the transcriptions provided by multiple online dictionary sources (e.g., <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://Dictionary.com">Dictionary.com</ext-link> [<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://dictionary.com">dictionary.com</ext-link>], Merriam-Webster [<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://merriamwebster.com">merriamwebster.com</ext-link>], Cambridge Dictionary of American English [<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://dictionary.cambridge.org/us/dictionary/english/">http://dictionary.cambridge.org/us/dictionary/english/</ext-link>]). While some dictionary entries acknowledge potential for variation between [jV] and [iV] pronunciations, some list only one pronunciation and, across them, a majority vote can become apparent. 
A final note before proceeding is that, when considering the phonological distribution of glides and vowels in this section, the [w]-[u] paradigm is also taken into account simply to show that the distributions seem similarly constrained across the two paradigms. However, as has already been made clear, the experimental analysis conducted in this study focuses on the [j]-[i] distinction. (This is largely due to complications that would arise in trying to apply acoustic phonetic analysis alone, as pursued in this study, to aptly describe the articulation: Given the labial gesture, additional ultrasound analysis [e.g., <xref ref-type="bibr" rid="B24">Gick, 2002</xref>; <xref ref-type="bibr" rid="B70">Stone, 2005</xref>; <xref ref-type="bibr" rid="B11">Davidson, 2006</xref>] of lingual constriction and positioning would be ideal to provide a full description of how [w] and [u] may be distinctly produced.)</p>
<p>One apparent constraint is that of the place features of neighboring segments. First, regarding the [w]-[u] paradigm, there seem to be no cases of [CwV] in which the following vowel or diphthong involves a high back vocoid (e.g., *[Cwa&#650;], *[Cwu] [<xref ref-type="bibr" rid="B13">Davis &amp; Hammond, 1995</xref>]). The appearance of [w] in [CwV] sequences also seems to avoid labial preceding consonants (e.g., *[fwV] [<xref ref-type="bibr" rid="B9">Clements &amp; Keyser, 1983</xref>]).<xref ref-type="fn" rid="n2">2</xref> While [w] is considered labiovelar, [CwV] sequences with dorsal preceding consonants are not banned (e.g., <italic>quit</italic> [kw&#618;t], <italic>awkward</italic> [&#593;&#769;kw&#602;d]). Regarding the [j]-[i] paradigm, there are far fewer cases of [CjV] sequences. In fact, Davis and Hammond&#8217;s (<xref ref-type="bibr" rid="B13">1995</xref>) paper titled &#8220;On the status of onglides in American English&#8221; does not mention non-[ju] cases of [j] when discussing post-consonantal environments. However, some near-minimal pairs do show a distinction between [i] and [j] (e.g., <italic>Estonia</italic> [&#603;st&#243;ni&#601;] + <italic>pneumonia</italic> [n&#650;m&#243;nj&#601;]). Many of these cases are variable, but they do show trends analogous to those apparent regarding [w]. No such cases are ever followed by a nucleus containing a high front vocoid such as *[Cja&#618;] or *[Cji]. Regarding place of articulation of the preceding consonant, the case of <italic>fjord</italic> [fj&#596;&#633;d] suggests that preceding labial consonants do not prohibit the glide. A more established word in English, <italic>piano</italic>, also allows for [j] in surface form [pj&#509;no] (while exhibiting inter-speaker variability with [pi&#509;no]). 
In cases where we might expect [j] to appear following a dorsal consonant, such as the borrowing of placename <italic>Kyoto</italic> (Japanese source form [k&#690;o&#720;to]), the adaptation instead appears to prefer a full [i] vowel in American English ([ki&#243;&#638;o]). We also do not see word-initial [CjV] sequences in which the preceding consonant is coronal. These observed homorganicity constraints could support a place-based representation, with a possible interpretation being that both restrictions against preceding dorsal and coronal consonants are due to homorganicity and that /j/ is therefore underlyingly both coronal and dorsal (<xref ref-type="bibr" rid="B40">Keating, 1988</xref>; <xref ref-type="bibr" rid="B54">Nevins &amp; Chitoran, 2008</xref>; cf. <xref ref-type="bibr" rid="B46">Levi, 2008</xref>).</p>
<p>The sonority of the preceding segment also appears to constrain the distribution of glides. Take, for example, the loanword adaptation of French <italic>noir</italic> (source form [nwa&#641;]). While [w] is allowed after other coronal consonants (e.g., <italic>dwarf</italic> [dw&#596;&#633;f]), <italic>noir</italic> is commonly adapted to [nu&#593;&#769;&#633;]. Following Steriade (<xref ref-type="bibr" rid="B69">1988</xref>), homorganicity constraints can play a role in the interactions of segments both within the syllable margin and between the margin and the nucleus (as observed above), while sonority constraints play a role only within the syllable margin and not between the margin and the nucleus. Therefore, the avoidance of a *[nw] onset in the adaptation of <italic>noir</italic>, while [tw], [dw], and [sw] onsets are allowed, could be attributable to the constraint against the flatter sonority cline within that complex onset. A parallel pattern is apparent regarding [j], with no clear cases of a word-initial *[mj] onset. (The fact that <italic>music</italic> invariably maintains the form [mj&#250;z&#618;k] is one argument in support of the glide in [ju] belonging to a diphthongal nucleus, suggesting that the nasal and glide are not both within the syllable margin where the constraint against a flatter sonority cline would be applicable.)</p>
<p>Turning attention to word-medial position, it appears that both homorganicity and sonority constraints may be circumvented, with [j] appearing after [n], which is both coronal and a sonorant (e.g., <italic>Kenya</italic> [k&#603;&#769;nj&#601;], <italic>pneumonia</italic> [n&#650;m&#243;nj&#601;]). This seems due to the option of licitly syllabifying the preceding consonant to the coda of the preceding syllable (e.g., [k&#603;&#769;n.j&#601;], [n&#650;.m&#243;n.j&#601;]). However, this syllabification appears to be constrained by the sonority of the preceding consonant. In the adaptation of placename <italic>Tokyo</italic>, in which a glide adaptation might be expected as a more faithful replication of the Japanese source form [to&#720;k&#690;o&#720;], the homorganicity constraint regarding the preceding consonant seems to reappear, resulting in a full vocalic adaptation ([t&#243;.ki.o]). Following Gouskova (<xref ref-type="bibr" rid="B27">2004</xref>), this may be due to what would otherwise be too steep a rise in sonority across the syllable boundary (*[t&#243;k.jo]). Avoiding coda syllabification of the voiceless stop, the homorganicity constraint then applies, which disprefers the complex *[kj] onset in the *[t&#243;.kjo] adaptation candidate and leads the [t&#243;.ki.o] candidate to be the winner. The observation that sonority cline constraints may play a role in the distribution of this distinction could lend support to a constriction-based proposal that glides are underlyingly specified as less sonorous than their vowel counterparts.</p>
<p>While we can further understand the constraints on this distinction by analyzing its phonological distribution, this does not present a clear choice between the competing place and constriction/height representations proposed, as both homorganicity and sonority appear to play a role. While not adjudicating between these two representations, the observation of both of these effects could lend support to a hybrid account, like that proposed by Nevins and Chitoran (<xref ref-type="bibr" rid="B54">2008</xref>), in which [j] differs from [i] along the dimensions of both place/articulator ([Cor, Dors] vs. just [Dors]) and constriction ([&#8211;vocalic] vs. [+vocalic]). However, as also acknowledged in this section&#8217;s discussion, this distinction appears to be somewhat infrequent across the lexicon, as well as variable. This could lend support to treating the current case as one of lexically exceptional syllabic pre-linking (<xref ref-type="bibr" rid="B45">Levi, 2004</xref>). The following section discusses how each of these representations generates different predictions for the articulation and, therefore, the acoustics of the [j]-[i] distinction of interest here, motivating the acoustic analysis pursued in this study.</p>
</sec>
<sec>
<title>2.3. Acoustic characterization</title>
<p>The competing phonological representations considered above generate different predictions regarding the acoustics of the [j]-[i] glide-vowel distinction of interest in this study. There are acoustic aspects widely considered to correlate with lingual frontness, height, and constriction. There are also aspects related to timing that may play a crucial role in conveying such a distinction no matter how it is represented, though they would arguably play the only role in the syllabic pre-linking account. Therefore, analyzing the acoustics of such a distinction may speak to which representation appears more directly borne out in production, or at least rule one out.</p>
<p>Recall that a <italic>place-based representation</italic> suggests that /j/ is primarily [Coronal] (or at least includes a [Coronal] specification), while /i/ (like all vowels) is [Dorsal]. This predicts that a [j] production should have more fronted tongue mass than [i] if the primary articulator and/or target of constriction is anatomically more anterior. Acoustically, we can analyze F2 to examine a vocoid&#8217;s frontness: More anterior vocoids have a higher F2 than more posterior ones. This representation therefore predicts that [j] should reach a higher F2 than [i].</p>
<p>On the other hand, a <italic>constriction-based representation</italic> suggests that /j/ does not differ at all in place from /i/. Instead, it contends that glides have a tighter constriction than their vowel counterparts. In this case, then, [j] production should involve a tighter constriction achieved by a higher lingual articulation. Two acoustic measurements may capture this. The first is F1, which can be used to examine a vocoid&#8217;s height: Vocoids of a higher lingual articulation have a lower F1 than vocoids of a lower lingual articulation. This representation therefore predicts that [j] should reach a lower F1 than [i]. It also predicts that [j] should have a lower acoustic intensity due to its narrower constriction of the vocal tract.</p>
<p>Existing acoustic documentation suggests that such acoustic aspects may characterize the distinction at hand. In studies of intervocalic glides (i.e., [VGV] environments), a dip in intensity between the first and second vowel is observed when an intervening glide is present, as compared to [VV] hiatus (e.g., <xref ref-type="bibr" rid="B1">Aguilar, 1999</xref>; <xref ref-type="bibr" rid="B12">Davidson &amp; Erker, 2014</xref>). However, especially given Straka&#8217;s (<xref ref-type="bibr" rid="B71">1964</xref>) finding that there does not seem to be a consistent threshold across phonological contexts, this could differ in the environment of interest here ([C_V]). Additionally, in an acoustic phonetic analysis of glides and other approximants in English, Espy-Wilson (<xref ref-type="bibr" rid="B17">1992</xref>) finds [j] to correlate with a higher F2 and lower F1 than [i], therefore potentially supporting a combination of the place- and constriction-based representations (like that put forth by <xref ref-type="bibr" rid="B54">Nevins &amp; Chitoran, 2008</xref>). However, that study does not directly examine this potential underlying distinction in controlled and balanced environments, but rather the comparative acoustics of differently identified surface forms.</p>
<p>In a <italic>syllabic pre-linking account</italic>, only timing should play a role in conveying such a distinction since no difference is proposed regarding place or constriction. One way is in the duration of the entire vocalic sequence. A [jV] sequence may be shorter overall than a [iV] sequence due to the prior being one syllable instead of two. Crystal and House (<xref ref-type="bibr" rid="B10">1990</xref>) observe that the number of syllables within a stress group can influence its duration: More syllables means a longer duration. They also find that syllables with fewer segments tend to be shorter. And while a [iV] sequence has more syllables (two as opposed to one), those syllables each have fewer segments. However, the effect size observed by Crystal and House is greater for higher prosodic units. Therefore, based on the number of syllables (vs. number of phones per syllable), the prediction stands that a [jV] sequence&#8217;s duration will be shorter than that of a [iV] hiatus sequence.</p>
<p>Another timing factor that could play a role is the speed of transition. We might expect a [jV] sequence to have a faster transition than a [iV] sequence. Liberman et al. (<xref ref-type="bibr" rid="B48">1956</xref>) suggest that this is the case, at least on perceptual grounds. In a perceptual experiment using simulated (drawn formant) speech stimuli, they find that the speed of transition from the formant starting point to the target formant state of the following vowel influences listeners&#8217; perceptions: Fastest leads to a [CV] percept; slowest leads to a [VV] percept; in between leads to a [GV] percept. Studies of production have also confirmed temporal aspects regarding formant transitions to distinguish glides. In a study of English diphthongs, Gay (<xref ref-type="bibr" rid="B20">1968</xref>) finds that the rate of transition of F2 serves as a reliable distinguisher between diphthongs.</p>
<p>A final timing factor to consider is the earliness of transition. That is, the distinction may not necessarily be about how fast the transition is, but how early it starts. The phonological interpretation that [j] is in the syllable margin with [C_], while [i] is further structurally separate as a member of the syllable nucleus, would suggest a tighter gestural coordination between a glide and the preceding consonant, whether directly formalized (&#224; la <xref ref-type="bibr" rid="B19">Gafos, 2002</xref>) or a result of the gestural planning of the relative segmental syllabifications. While the studies above, as well as Chitoran&#8217;s (<xref ref-type="bibr" rid="B8">2002</xref>) acoustic analysis of Romanian diphthongs, examine transition speed, this study is intended to tease transition speed apart from transition earliness as potentially distinct characteristics. To summarize, multiple characteristics related to timing could therefore be central to distinguishing a [jV] sequence from a [iV] sequence: Duration of the entire sequence, speed of transition into [_V], or earliness of transition into [_V]. It could be a combination of these, or one might be a more consistent distinguishing factor.</p>
<p>It is important to note that all of the phonological accounts should predict the distinction to result in some kinds of timing differences like those presented above. No matter the featural representation, the glide will be part of the syllable margin, instead of the nucleus. Therefore, while factors related to timing may be found to convey this distinction, they do not necessarily rule out a place- or constriction-based representation. If we find <italic>only</italic> such timing characteristics to play a role, this may lend support to applying a syllabic pre-linking account and challenge the other representations proposed. However, if timing differences are found in tandem with the other predictions presented above, this would still lend more support to those associated accounts than to concluding with a pre-linking account. This is because, if anything, a pre-linking (which we could also think of as the timing- or structure-only) account would predict the reverse in terms of formants and intensity. If the glide were identically featured yet forced to the syllable margin, the faster articulation might predict a lessened amplitude of articulator movements, resulting in formant centralization due to not fully reaching the target (Gay, <xref ref-type="bibr" rid="B22">1981</xref>; <xref ref-type="bibr" rid="B5">Browman &amp; Goldstein, 1990</xref>; <xref ref-type="bibr" rid="B73">Turner et al., 1995</xref>).</p>
<p>To summarize the acoustic predictions generated by these competing accounts: All accounts should predict some kind of timing difference due to syllabification, with a [jV] sequence showing a shorter overall duration, an earlier transition to [_V], and/or a faster transition. The place-based representation also predicts acoustic evidence that [j] is more front than [i], and that it therefore has a higher F2. The constriction-based representation, instead, predicts that [j] has a lower acoustic intensity and/or a higher lingual articulation and therefore a lower F1. A finding of <italic>only</italic> timing differences, possibly in tandem with formant centralization, would lend stronger support to the syllabic pre-linking account.</p>
</sec>
</sec>
<sec sec-type="methods">
<title>3. Method</title>
<sec>
<title>3.1. Elicitation</title>
<sec>
<title>3.1.1. Participants</title>
<p>Nine speakers participated in this study. All were remunerated for their time. The study took about 40 minutes, including informed consent and a questionnaire eliciting demographic information. All were identified as native speakers of American English. All were white, non-Hispanic or Hispanic. Most were female (8/9). Their ages ranged from 19 to 37 years. The US region with which a participant identified was not tightly controlled, as this variable has not so far been shown to differ significantly across regional varieties of American English. No speakers reported having ever been diagnosed with any speech- or hearing-related disorder.</p>
</sec>
<sec>
<title>3.1.2. Stimuli</title>
<p>The experiment elicited utterances of both pre-existing lexical items and nonce names. All were designed with the purpose of eliciting the [j]-[i] distinction in a [C_V] environment. This environment was chosen because it is one where the distinction does seem to be available, as discussed above. Each stimulus was embedded within a unique sentence that participants read aloud. This was done in two blocks, with lexical items in the first and nonce names in the second. There were 8 lexical items considered to carry this distinction in near-minimally paired environments. These are listed in Table <xref ref-type="table" rid="T1">1</xref>, with the words categorized by their expected pronunciation and counterparts vertically paired. Word position, preceding segment, and stress placement relative to the sequence of interest were controlled across all words. As mentioned above, expected pronunciations were cross-checked, though admittedly subject to potential variation. At the time of writing, when looking across different sources&#8217; entries, an asymmetry was apparent for each pair in the expected direction, in agreement with Table <xref ref-type="table" rid="T1">1</xref>. (For example, while <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://Dictionary.com">Dictionary.com</ext-link> listed both pronunciations as options for <italic>gardenia</italic>, both Merriam-Webster and Cambridge listed only the [j&#601;] pronunciation. On the other hand, the [i&#601;] pronunciation of <italic>Armenia</italic> was the only option listed by the Cambridge Dictionary of American English, while <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://Dictionary.com">Dictionary.com</ext-link> and Merriam-Webster listed both options.) Furthermore, at the end of the experiment, participants were asked to complete a metalinguistic task of providing syllable counts, with each of the pre-existing lexical items of interest included. For each, the syllable count associated with the expected pronunciation was the one more frequently provided. There were 9 additional lexical items elicited for future follow-up analysis of this variable (e.g., <italic>fjords, Tokyo</italic>), but these did not have near-minimal pair counterparts and are not addressed further in this analysis.</p>
<table-wrap id="T1">
<label>Table 1</label>
<caption><p>Stimuli: Lexical items of interest.</p></caption>
<table>
<tr>
<th align="left" valign="bottom">expected pronunciation</th>
<th align="left"></th>
<th align="left"></th>
<th align="left"></th>
<th align="left"></th>
</tr>
<tr>
<td align="left" colspan="5"><hr/></td>
</tr>
<tr>
<td align="left">[iV]:</td>
<td align="left">Estonia</td>
<td align="left">hernia</td>
<td align="left">millennia</td>
<td align="left">Armenia</td>
</tr>
<tr>
<td align="left">[jV]:</td>
<td align="left">pneumonia</td>
<td align="left">California</td>
<td align="left">Kenya</td>
<td align="left">gardenia</td>
</tr>
</table>
</table-wrap>
<p>The 17 lexical items were each assigned to a unique sentence. These assignments were kept constant. Attention was paid to placing the stimulus in a prosodically prominent position in the early half of the sentence. The following word in every sentence began with a voiceless labial obstruent, simply as a means of controlling the place and sonority of the consonant immediately following the word-final sequence of interest. Two examples are provided in (2).</p>
<table-wrap>
<table content-type="example">
<tbody>
<tr>
<td>(2)</td>
<td>a)</td>
<td>The state of California passed a new bill.</td>
</tr>
<tr>
<td>&#160;</td>
<td>b)</td>
<td>The gardenia flower has a strong scent.</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>There were 40 nonce names designed to test if the distinction could be elicited productively. They were also designed with the intent of eliciting the distinction in a much more diverse array of phonological environments, to therefore examine what acoustic aspects convey it consistently. To elicit the distinction itself, pairs differing in orthography were made with the aim of eliciting [j] via the &lt;y&gt; grapheme and [i] via &lt;i&gt;. The preceding consonant and the position in the word were also manipulated. Three places of articulation of the preceding consonant were used&#8212;labial, coronal, and dorsal. Within each place of articulation, four manners of articulation were used&#8212;voiceless stop, voiced stop, voiceless fricative, and nasal (the latter two manners unavailable for the dorsal place in English). Finally, word position was also manipulated. Position here is defined in terms of the placement of the [Cj]/[Ci] sequence&#8212;initial vs. medial. Paired counterparts across the main condition of orthography were created, matched along the other three factors to therefore balance for potential phonological effects on the distribution of this distinction, as discussed above (Section 2.2). There were also 40 filler nonce names created, none incorporating the variable of interest.</p>
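<p>The crossing of factors just described can be sketched in a few lines of Python. This is only an illustrative enumeration of the design cells, not the procedure by which the names themselves were created; the identifiers used (CONSONANTS, design_cells, paired_counterparts) are hypothetical, and plain ASCII letters stand in for the consonant symbols.</p>
<preformat><![CDATA[
```python
from itertools import product

# Preceding consonants by place, following the layout of the nonce-name table;
# the dorsal place lacks the voiceless fricative and nasal manners.
CONSONANTS = {
    "labial": ["p", "b", "f", "m"],
    "coronal": ["t", "d", "s", "n"],
    "dorsal": ["k", "g"],
}
POSITIONS = ["initial", "medial"]
GRAPHEMES = ["i", "y"]  # <i> is meant to elicit [i]; <y> is meant to elicit [j]


def design_cells():
    """Return one (place, consonant, position, grapheme) tuple per target name."""
    cells = []
    for place, consonants in CONSONANTS.items():
        for consonant, position, grapheme in product(consonants, POSITIONS, GRAPHEMES):
            cells.append((place, consonant, position, grapheme))
    return cells


def paired_counterparts(cells):
    """Group cells into <i>/<y> pairs matched on place, consonant, and position."""
    pairs = {}
    for cell in cells:
        place, consonant, position, grapheme = cell
        pairs.setdefault((place, consonant, position), {})[grapheme] = cell
    return pairs


cells = design_cells()
pairs = paired_counterparts(cells)
print(len(cells), len(pairs))
```
]]></preformat>
<p>Enumerating the cells this way makes it easy to check that every target has exactly one counterpart differing only in orthography.</p>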
<p>Additional environmental factors within the nonce names were controlled. The following vowel (elicited via the &lt;a&gt; grapheme) was kept constant within each position: [&#593;&#769;] for initial position and [&#601;] for medial position. For the initial-position stimulus pairs, the place of the following consonant was kept identical. For the non-target syllable in each stimulus, the inventory of nuclear vowels was [&#593;, i, u, o]. This provided some diversity while keeping nucleus weight constant. A final aspect of the stimuli was the use of acute accent &lt; &#769;&gt; marks to represent stress placement. This was incorporated to keep participants from placing stress on the high front vocoid of interest (e.g., pronouncing <italic>S&#250;mia</italic> as [su.m&#237;.&#601;]). This factor also served as a distractor variable, with the 40 filler stimuli varying more unpredictably in stress placement (e.g., <italic>Sh&#243;glubo, Blit&#250;, S&#243;ga</italic>). Table <xref ref-type="table" rid="T2">2</xref> lists the nonce name stimuli along with their honorifics, which will be further explained next.</p>
<table-wrap id="T2">
<label>Table 2</label>
<caption><p>Stimuli: Nonce names (w/honorifics).</p></caption>
<table>
<tr>
<th align="left"></th>
<th align="left"></th>
<th align="left" colspan="2"><sc>INITIAL</sc></th>
<th align="left" colspan="2"><sc>MEDIAL</sc></th>
</tr>
<tr>
<th align="center" colspan="2">C_</th>
<th align="left">&lt;i&gt;</th>
<th align="left">&lt;y&gt;</th>
<th align="left">&lt;i&gt;</th>
<th align="left">&lt;y&gt;</th>
</tr>
<tr>
<td align="left" colspan="6"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="4"><sc>LABIAL</sc></td>
<td align="left">/p_/</td>
<td align="left">Dr. Pi&#225;cho</td>
<td align="left">Governor Py&#225;sha</td>
<td align="left">Coach N&#243;pia</td>
<td align="left">Officer D&#225;pya</td>
</tr>
<tr>
<td align="left">/b_/</td>
<td align="left">Mr. Bi&#225;si</td>
<td align="left">Mr. By&#225;su</td>
<td align="left">Miss Sh&#225;bia</td>
<td align="left">Mrs. Ch&#243;bya</td>
</tr>
<tr>
<td align="left">/f_/</td>
<td align="left">Mr. Fi&#225;ki</td>
<td align="left">Officer Fy&#225;ga</td>
<td align="left">Mr. G&#243;fia</td>
<td align="left">Dr. Z&#250;fya</td>
</tr>
<tr>
<td align="left">/m_/</td>
<td align="left">Sister Mi&#225;shu</td>
<td align="left">Professor My&#225;chi</td>
<td align="left">Professor S&#250;mia</td>
<td align="left">Dr. F&#237;mya</td>
</tr>
<tr>
<td align="left" colspan="6"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="4"><sc>CORONAL</sc></td>
<td align="left">/t_/</td>
<td align="left">Dr. Ti&#225;gu</td>
<td align="left">Sister Ty&#225;ko</td>
<td align="left">Governor B&#237;tia</td>
<td align="left">Mr. P&#243;tya</td>
</tr>
<tr>
<td align="left">/d_/</td>
<td align="left">Dr. Di&#225;fa</td>
<td align="left">Mr. Dy&#225;pu</td>
<td align="left">Sister M&#243;dia</td>
<td align="left">Sister V&#225;dya</td>
</tr>
<tr>
<td align="left">/s_/</td>
<td align="left">Officer Si&#225;ko</td>
<td align="left">Professor Sy&#225;gi</td>
<td align="left">Officer K&#250;sia</td>
<td align="left">Officer G&#237;sya</td>
</tr>
<tr>
<td align="left">/n_/</td>
<td align="left">Sister Ni&#225;fi</td>
<td align="left">Dr. Ny&#225;pa</td>
<td align="left">Miss V&#243;nia</td>
<td align="left">Judge B&#250;nya</td>
</tr>
<tr>
<td align="left" colspan="6"><hr/></td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2"><sc>DORSAL</sc></td>
<td align="left">/k_/</td>
<td align="left">Professor Ki&#225;sa</td>
<td align="left">Mr. Ky&#225;so</td>
<td align="left">Mrs. D&#243;kia</td>
<td align="left">Mr. P&#250;kya</td>
</tr>
<tr>
<td align="left">/&#609;_/</td>
<td align="left">Pastor Gi&#225;fu</td>
<td align="left">Dr. Gy&#225;pi</td>
<td align="left">Judge N&#225;gia</td>
<td align="left">Professor T&#237;gya</td>
</tr>
</table>
</table-wrap>
<p>Like the lexical items, each nonce stimulus was embedded in a unique sentence, with that sentence assignment remaining constant. Carrier sentences presented the nonce stimuli as surnames in sentence-initial position. The honorifics kept the stimuli away from completely phrase-initial position while still early in the sentence in a prosodically prominent position. All sentences were of the formula presented in (3) with some examples.</p>
<table-wrap>
<table content-type="example">
<tbody>
<tr>
<td>(3)</td>
<td>&#160;</td>
<td><italic>Honorific</italic></td>
<td>+</td>
<td><italic>Stimulus</italic></td>
<td>+</td>
<td><italic>Verb</italic></td>
<td>+</td>
<td><italic>Direct Object/Adjunct Modifier</italic></td>
</tr>
<tr>
<td>&#160;</td>
<td>a)</td>
<td>Mr.</td>
<td>&#160;</td>
<td>By&#225;su</td>
<td>&#160;</td>
<td>started</td>
<td>&#160;</td>
<td>a band.</td>
</tr>
<tr>
<td>&#160;</td>
<td>b)</td>
<td>Judge</td>
<td>&#160;</td>
<td>B&#250;nya</td>
<td>&#160;</td>
<td>paints</td>
<td>&#160;</td>
<td>beautifully.</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Environmental factors within the sentences were also controlled. For medial-position stimuli, the onset segment of the following word in the sentence was always a voiceless labial obstruent (the same method for controlling the following segment as that used for the real word stimuli described above). For initial-position stimuli, the preceding honorific was always [&#633;]-final.</p>
</sec>
<sec>
<title>3.1.3. Procedure</title>
<p>The study took place in the Phonetics and Experimental Phonology laboratory at New York University. Participants were seated in a sound-attenuated booth at a desk with a computer screen in front of them. Their speech was recorded with a Shure SM35-XLR head-mounted microphone connected to a Marantz PMD 660 audio recorder (44.1 kHz sampling). Sentences were presented on the computer screen one at a time. The participant would read the sentence aloud and advance to the next by pressing the down arrow on a standard keyboard. This method expressly avoided auditory repetition, so that neither the distinction nor its implementation could be auditorily primed. Previous research suggests that speakers&#8217; productions can be phonetically influenced by previous exposure (e.g., <xref ref-type="bibr" rid="B25">Goldinger, 1998</xref>; <xref ref-type="bibr" rid="B53">Namy et al., 2002</xref>; <xref ref-type="bibr" rid="B18">Fowler et al., 2003</xref>; <xref ref-type="bibr" rid="B23">Gentilucci &amp; Bernardis, 2007</xref>). Therefore, elicitation was done by orthographic presentation only, so that participants&#8217; productions could not be influenced by previous exposure in any way. The researcher was present in the sound booth to provide training for the nonce stimuli section and ask the participant to repeat any stimulus if needed. All participants were led through the same procedure with the same stimuli and presentation described below.</p>
<p>The first block consisted of the 17 sentences containing pre-existing lexical items. Sentences were randomized, and then near-minimal pairs were moved to allow substantial space between the counterparts. This was repeated to result in four cycles through the stimuli, with the spacing of near-minimal pair counterparts across cycle boundaries also manually adjusted.</p>
<p>The second block consisted of the 40 sentences containing nonce names (and 40 with filler nonce names). First, the nonce names of interest (targets) were randomized. Then, ordering was adjusted to put maximum distance between near-minimal pair counterparts&#8212;those matching in environmental factors and differing in &lt;y&gt; vs. &lt;i&gt; orthography. Then, the filler stimuli were randomly ordered and added, one after each target stimulus, so that the cycle would alternate between target and filler stimuli. This was repeated to result in four cycles through the stimuli. After this, spacing of near-minimal pair counterparts was given similar attention across cycle boundaries.</p>
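<p>The ordering steps for one cycle can be approximated programmatically. The sketch below is an assumption-laden stand-in for the manual adjustment described above: it reshuffles the targets until every pair of counterparts is at least a chosen number of positions apart (simple rejection sampling), then places one randomly ordered filler after each target. The function names, the min_gap value, and the placeholder labels are hypothetical, and spacing across cycle boundaries is not handled.</p>
<preformat><![CDATA[
```python
import random


def spaced_order(pairs, min_gap, rng, max_tries=10000):
    """Shuffle targets until each pair's counterparts are at least min_gap apart."""
    targets = [name for pair in pairs for name in pair]
    for _ in range(max_tries):
        rng.shuffle(targets)
        position = {name: index for index, name in enumerate(targets)}
        if all(abs(position[a] - position[b]) >= min_gap for a, b in pairs):
            return list(targets)
    raise RuntimeError("no ordering found; try a smaller min_gap")


def interleave(targets, fillers, rng):
    """Alternate targets and fillers, one randomly ordered filler after each target."""
    shuffled = list(fillers)
    rng.shuffle(shuffled)
    cycle = []
    for target, filler in zip(targets, shuffled):
        cycle.extend([target, filler])
    return cycle


rng = random.Random(0)  # fixed seed only so that the sketch is reproducible
# Placeholder labels, not the actual stimuli.
pairs = [("i%d" % n, "y%d" % n) for n in range(20)]
fillers = ["filler%d" % n for n in range(40)]
cycle = interleave(spaced_order(pairs, min_gap=5, rng=rng), fillers, rng)
print(len(cycle))
```
]]></preformat>
<p>Because the gap is counted among targets only, the interleaved fillers roughly double the distance between counterparts in the presented sequence.</p>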
<p>Between the two blocks, there was a short training session regarding the nonce stimuli. The researcher told participants that they would be encountering unfamiliar last names. They were told that the names use only four vowels&#8212;[&#593;], [i], [u], and [o]. They were instructed to be consistent in pronunciation, thinking one letter equals one sound (e.g., the letter &lt;g&gt; should always be pronounced as [&#609;], and never as [d&#865;&#658;]). They were then instructed that the vowel marked with an acute accent &lt; &#769;&gt; was stressed.</p>
<p>After this instruction, three cycles of training stimuli were presented. Training stimuli were all made according to the filler stimulus formula. None included &lt;y&gt; or any &lt;iV&gt; sequence; some included simplex &lt;w&gt; onsets. The first cycle was auditory and orthographic repetition. There were ten training stimuli consisting of just an honorific + name. A pre-recording of the stimulus uttered by another English speaker played automatically with each slide showing the orthography, and the participant would repeat it. In these pre-recorded utterances, when an &lt;a&gt; was final and not stressed, it was reduced to a schwa, which participants followed naturally. The second cycle removed the auditory component. There were five training stimuli consisting of just an honorific + name. In this cycle, a pre-recorded utterance was no longer played and the participant would read the orthographically presented stimulus aloud. The researcher provided feedback after any errors (which were usually regarding stress placement). The last training cycle consisted of four stimuli in full sentence form. Participants were told that they would now be reading complete sentences with these names. They were told that it was important to not pause within a sentence and that they may be asked to repeat if they paused within. However, they were informed that there was no time limit and that they could say the sentence in their head before saying it out loud.</p>
<p>Feedback was provided during the second block. However, no feedback was ever given regarding the variable of interest. The researcher did nothing if, on a &lt;y&gt; stimulus, the participant&#8217;s utterance was perceived as a [iV] sequence or, for a &lt;i&gt; stimulus, the participant&#8217;s utterance was perceived as a [jV] sequence. Both of these behaviors were perceived to occur, though, suggesting that phonological effects on the distribution did sometimes override the orthographic elicitation. No participant was perceived to categorically produce only [jV] or [iV] across the orthographic presentations. One phenomenon that did elicit feedback regarding &lt;y&gt; and &lt;i&gt; was the pronunciation of either as the [a&#618;] diphthong. This was not common, but did occur a few times with more than one participant. In such cases, the feedback was framed along the following lines, &#8220;Don&#8217;t pronounce the letter &lt;i&gt; or &lt;y&gt; as [a&#618;]. The only vowels are [&#593;], [i], [u], and [o].&#8221; Feedback never included an utterance by the researcher of a [jV] or [iV] sequence. Common errors eliciting feedback were misplacement of stress, pausing, and segmental errors not within but sometimes neighboring the sequence of interest.</p>
</sec>
</sec>
<sec>
<title>3.2. Analysis and predictions</title>
<p>A total of 1672 utterances were examined after excluding tokens that were produced in an unexpected way (e.g., the vocoid of interest pronounced as [a&#618;], a relevant neighboring segment mispronounced, stress misplaced, the sequence held out as a speech delay). Praat software (<xref ref-type="bibr" rid="B3">Boersma &amp; Weenink, 2015</xref>) was used for segmentation and analysis. The entire [jV]/[iV]-expectant vocalic sequence was segmented. The beginning (henceforth &#8216;vocalic onset&#8217;) was identified as the onset of F1 after an obstruent or the re-strengthening of formants after the release of a preceding nasal. The end (henceforth &#8216;vocalic offset&#8217;) was identified as the point of severe reduction in amplitude or complexity of the formants attributable to the following consonant. All acoustic measurements were performed over this entire vocalic sequence.</p>
<p>The following reviews what this distinction might look like acoustically. In line with a place-based representation, we would expect that [j] has more anterior raising of tongue mass than [i], and therefore a higher F2. In line with a constriction-based representation, we would expect that [j] has a higher lingual articulation and tighter constriction than [i], and therefore a lower F1 and lower acoustic intensity. As predicted by all accounts (now including that of syllabic pre-linking), timing may also play a role, with a [jV] sequence being shorter overall, having an earlier transition, and/or having a faster transition than a [iV] sequence.</p>
<p>Figure <xref ref-type="fig" rid="F1">1</xref> shows spectrograms of utterances by the same speaker of <italic>Estonia</italic> [CiV] and <italic>pneumonia</italic> [CjV] appearing to exhibit the distinction. Inspection of this example is consistent with some of the predictions above. The [jV] sequence is shorter overall than the [iV] sequence. In terms of the F2 trajectory, the [iV] sequence appears to take longer to reach its maximum before transitioning downward toward the following vowel, suggesting that the [jV] transition starts earlier. However, the F2 max is greater for [iV] than for [jV], suggesting that [j] is <italic>less</italic> front than [i]&#8212;the opposite of what is predicted by the place-based representation. The intensity tracker also shows a greater jump in the [jV] case (+6.1 dB) than the [iV] case (+3.5 dB), suggesting that [j] has a lower intensity relative to [_V] than [i] does. In addition, [jV] has a lower F1 min, suggesting a higher lingual articulation. Other than that of F2 max, these observations are all in line with the acoustic phonetic predictions discussed above.</p>
<fig id="F1">
<label>Figure 1</label>
<caption><p>Example utterance spectrograms. Spectrograms of utterances by the same speaker of a near-minimal pair expecting and appearing to exhibit the distinction of interest. The vertical red line shows the vocalic onset and the end of the spectrogram is where the vocalic offset was segmented, with the duration of the entire vocalic sequence noted. The yellow line is Praat&#8217;s intensity tracker. The F2 max and F2 min of each vocalic sequence of interest are also noted.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/6220/file/75125/"/>
</fig>
<p>Table <xref ref-type="table" rid="T3">3</xref> summarizes the measurements, predictions, and how each relates to the competing representation accounts. All measurements were performed across the entire segmented vocalic sequence using a Praat script. The maximum value of F2 was identified and recorded (F2 max) to examine frontness: A higher value for [j] would mean that it is more front than [i], therefore supporting a place-based representation. The minimum value of F1 was identified and recorded (F1 min) to examine height of lingual articulation: A lower value for [j] would mean that it is of a higher lingual articulation than [i], therefore supporting a constriction-based representation. The minimum intensity value was recorded and subtracted from the maximum intensity value (intensity range): A greater intensity range for [jV] would mean that [j] has a lower intensity relative to [_V], therefore supporting a constriction-based representation. The amount of time between the voicing onset and the timepoint at which F2 max occurred was also recorded (F2 max time) to examine the transition starting point (following <xref ref-type="bibr" rid="B8">Chitoran, 2002</xref> and <xref ref-type="bibr" rid="B61">Ren, 1986</xref>), predicting a [jV] sequence&#8217;s transition to begin earlier. (See Section 4.3 for further discussion regarding the choice to treat this measurement of earliness as absolute&#8212;milliseconds from voicing onset&#8212;rather than relative&#8212;such as percentage of the entire vocalic sequence&#8217;s duration). The timepoint and value of the F2 minimum were recorded and used to calculate the slope of F2&#8217;s transition between that point and F2 max (F2 slope), predicting a [jV] sequence to have a greater F2 slope and therefore a faster transition (as suggested by <xref ref-type="bibr" rid="B48">Liberman et al., 1956</xref>). 
Finally, the overall duration was measured between vocalic onset and offset, predicting a [jV] sequence to have an overall shorter duration than a [iV] sequence.</p>
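<p>To make the six measurements concrete, the following Python sketch computes them from sampled tracks of a single vocalic sequence. It is an illustration only and not the Praat script used in this study: the function name, the input format (parallel lists of times, formant values, and intensity values), and the example numbers are all hypothetical. Two further assumptions are made: the first sample is taken as the onset from which F2 max time is counted, and F2 slope is taken as the magnitude of the F2 change per second between the F2 maximum and the F2 minimum.</p>
<preformat><![CDATA[
```python
def measure_sequence(times, f1, f2, intensity):
    """Summarize one segmented vocalic sequence.

    times: sample times in seconds, from vocalic onset to vocalic offset
    f1, f2: formant tracks in Hz; intensity: track in dB (all the same length)
    """
    i_max = max(range(len(f2)), key=lambda i: f2[i])
    i_min = min(range(len(f2)), key=lambda i: f2[i])
    elapsed = abs(times[i_min] - times[i_max])
    return {
        "f2_max": f2[i_max],
        "f1_min": min(f1),
        "intensity_range": max(intensity) - min(intensity),
        # Time from the first sample (taken here as the onset) to the F2 peak.
        "f2_max_time": times[i_max] - times[0],
        # Magnitude of the F2 change per second between F2 max and F2 min;
        # None when the track is flat and both extremes fall on one sample.
        "f2_slope": abs(f2[i_max] - f2[i_min]) / elapsed if elapsed else None,
        "duration": times[-1] - times[0],
    }


# Invented values for illustration; not data from the study.
times = [0.00, 0.02, 0.04, 0.06, 0.08, 0.10]
f1 = [320, 300, 310, 400, 480, 500]
f2 = [2100, 2300, 2250, 1900, 1600, 1500]
intensity = [62.0, 64.0, 66.0, 68.0, 69.0, 68.5]
result = measure_sequence(times, f1, f2, intensity)
print(result)
```
]]></preformat>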
<table-wrap id="T3">
<label>Table 3</label>
<caption><p>Measurements and competing acoustic predictions.</p></caption>
<table>
<tr>
<th align="left"><sc>MEASUREMENT</sc></th>
<th align="left"><sc>PREDICTION</sc></th>
<th align="left"><sc>REASON</sc></th>
<th align="left"><sc>ACCOUNT</sc></th>
</tr>
<tr>
<td align="left" colspan="4"><hr/></td>
</tr>
<tr>
<td align="left">F2 max</td>
<td align="left">[iV] &lt; [jV]</td>
<td align="left">[j] more front than [i]</td>
<td align="left">place</td>
</tr>
<tr>
<td align="left">F1 min</td>
<td align="left">[iV] &gt; [jV]</td>
<td align="left">[j] higher in lingual articulation than [i]</td>
<td align="left">constriction</td>
</tr>
<tr>
<td align="left">intensity range</td>
<td align="left">[iV] &lt; [jV]</td>
<td align="left">[j] more constricted than [i]</td>
<td align="left"></td>
</tr>
<tr>
<td align="left">F2 max time</td>
<td align="left">[iV] &gt; [jV]</td>
<td align="left">[jV] has earlier transition than [iV]</td>
<td align="left">all accounts</td>
</tr>
<tr>
<td align="left">F2 slope</td>
<td align="left">[iV] &lt; [jV]</td>
<td align="left">[jV] has faster transition than [iV]</td>
<td align="left"></td>
</tr>
<tr>
<td align="left">duration</td>
<td align="left">[iV] &gt; [jV]</td>
<td align="left">[jV] = 1 syllable; [iV] = 2 syllables</td>
<td align="left"></td>
</tr>
</table>
</table-wrap>
<p>Of course, one possibility is that there is no significant difference along any measurement, which would not support the hypothesis that a distinction was elicited (at least as detectable by the measurements taken here). An observation in the <italic>opposite</italic> direction of one of the predictions above would motivate <italic>ruling out</italic> the associated account. For example, an observation that [j] has a <italic>lower</italic> F2 max would suggest that a place-based approach is not a good fit for representing the distinction at hand. An observation that any predictions from the first two sets are borne out would lend support to the associated account(s). For example, an observation that [j] has a <italic>higher</italic> F2 max would support applying a place-based representation. However, that observation in tandem with a lower F1 min or wider intensity range could be interpreted as inconclusive between place- and constriction-based accounts, or as support for a hybrid approach that both features are part of this distinction&#8217;s specification (e.g., <xref ref-type="bibr" rid="B54">Nevins &amp; Chitoran, 2008</xref>). Furthermore, as discussed above (Section 2.3), if we observe any predictions from the first two sets in tandem with any from the final set of timing-related predictions, this would still lend more support to those accounts than to a syllabic pre-linking account, since a difference in syllabification and therefore timing is predicted by all accounts. Strongest support for a syllabic pre-linking account would come from observing <italic>only</italic> timing-related differences.</p>
</sec>
</sec>
<sec>
<title>4. Results</title>
<p>In this section, all of the acoustic measurements (previously summarized in Table <xref ref-type="table" rid="T3">3</xref>) are submitted to statistical tests to examine whether they exhibited significant differences across the near-minimal pairs, which would suggest the distinction was successfully elicited. Utterances are only coded for expected output, not perceived production. For pre-existing lexical items (Section 4.1), this is the factor of expected pronunciation (as previously given in Table <xref ref-type="table" rid="T1">1</xref>). For nonce stimuli (Section 4.2), this is the factor of orthography, which expects a &lt;y&gt; &#8594; [j], &lt;i&gt; &#8594; [i] mapping. Any significant result would suggest that the distinction was successfully elicited, with paired counterparts distinguishable along any acoustic measurement(s) with a significant effect.</p>
<sec>
<title>4.1. Pre-existing lexical items</title>
<p>A linear mixed-effects model was fitted for each of the acoustic measurements using the lme() function from the nlme package (<xref ref-type="bibr" rid="B59">Pinheiro et al., 2017</xref>) for the R statistical programming environment (<xref ref-type="bibr" rid="B60">R Core Team, 2015</xref>). Each model tested expected pronunciation as the independent variable for its effect on the acoustic measurement as the dependent variable. To capture the paired nature of the experimental design, a random effect (intercept and slope) was included for each combination of speaker and near-minimal pair. This structure accommodates several sources of variation. For example, anatomical differences might lead formants to have a higher average for one speaker than another, and that could also affect the magnitude of the difference in any formant used by that speaker to convey the distinction. Alternatively, one speaker might not exhibit the distinction for one word pair but exhibit it for another. Timing-related factors might also be used differently for one word pair than another, such as the distinction being realized differently between the <italic>Kenya</italic> + <italic>millennia</italic> pair and the <italic>California</italic> + <italic>hernia</italic> pair due to differing syllable counts, given that overall word length can affect the duration of syllables within the word (<xref ref-type="bibr" rid="B50">Lindblom, 1968</xref>).</p>
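<p>The model structure just described can be written out schematically. The following is a sketch under stated assumptions rather than the exact specification passed to lme(): the symbols are introduced here for illustration only, and the [i]-expectant condition is assumed to be the reference level of expected pronunciation.</p>
<disp-formula><tex-math><![CDATA[
y_{gk} = (\beta_0 + b_{0g}) + (\beta_1 + b_{1g})\, x_{gk} + \varepsilon_{gk}
]]></tex-math></disp-formula>
<p>Here <italic>g</italic> indexes each combination of speaker and near-minimal pair, <italic>k</italic> indexes the utterances within that combination, <italic>y</italic> is the acoustic measurement, and <italic>x</italic> is 1 for a [j]-expectant utterance and 0 for an [i]-expectant one. Under these assumptions, &#946;<sub>0</sub> is the fixed intercept (the [i]-expectant prediction), &#946;<sub>1</sub> is the fixed effect of expected pronunciation, <italic>b</italic><sub>0<italic>g</italic></sub> and <italic>b</italic><sub>1<italic>g</italic></sub> are the random intercept and random slope for combination <italic>g</italic>, and &#949; is the residual error.</p>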
<p>The results of this analysis are presented in Table <xref ref-type="table" rid="T4">4</xref>. The first three columns are descriptive statistics. &#8216;Percent [i] &gt; [j]&#8217; describes the proportion of near-minimal utterance pairs that exhibit a difference in the [i] &gt; [j] direction; the farther this number is from 50%, the less chance-like the acoustic difference is in its patterning and the more consistently it conveys this distinction. The means of both categories are also provided. The final two columns report the results of the statistical models. &#8216;Coefficient: [j]-expectant&#8217; indicates by how much and in what direction the model predicts an acoustic measure to differ from the [i]-expectant prediction (the intercept) when the utterance is of a [j]-expectant stimulus.</p>
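<p>To make the descriptive columns concrete, the short Python sketch below shows one way the &#8216;Percent [i] &gt; [j]&#8217; statistic and its distance from 50% could be computed from paired measurements. It is an illustration only, not the procedure used in this study: the data values and the function names percent_i_greater and distance_from_chance are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
from statistics import mean

# Hypothetical paired measurements (ms): one (i_value, j_value) tuple per
# combination of speaker and near-minimal pair. The numbers are invented
# for illustration and are not taken from the study's data.
pairs = [(36.0, 19.0), (34.5, 21.0), (30.0, 31.5), (38.0, 18.5)]


def percent_i_greater(pairs):
    """Percentage of pairs whose [i]-expectant value exceeds the [j]-expectant one."""
    if not pairs:
        raise ValueError("at least one pair is required")
    wins = sum(1 for i_val, j_val in pairs if i_val > j_val)
    return 100 * wins / len(pairs)


def distance_from_chance(percent):
    """Distance from 50%, in either direction; larger means more consistent."""
    return abs(percent - 50)


pct = percent_i_greater(pairs)              # share of pairs with [i] > [j]
consistency = distance_from_chance(pct)     # used here as an ordering criterion
mean_i = mean(i_val for i_val, _ in pairs)  # mean of the [i]-expectant values
mean_j = mean(j_val for _, j_val in pairs)  # mean of the [j]-expectant values
```
]]></preformat>
<p>In this sketch, a value of 25% and a value of 75% are equally far from chance, which is why distance from 50% in either direction, rather than the raw percentage, serves as the measure of consistency.</p>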
<table-wrap id="T4">
<label>Table 4</label>
<caption><p>Results: Pre-existing lexical items. Descriptive statistics and results of linear mixed-effects models per measurement across the factor of expected pronunciation. Measurements are ordered by their consistency of conveying the distinction&#8212;how far, in either direction, Percent [i] &gt; [j] is from 50%.</p></caption>
<table>
<tr>
<th valign="top" align="left">Measurement</th>
<th valign="top" align="right">Percent [i] &gt; [j]</th>
<th valign="top" align="right">Mean: [i]-expectant</th>
<th valign="top" align="right">Mean: [j]-expectant</th>
<th valign="top" align="right">Coefficient: [j]-expectant</th>
<th valign="top" align="right"><italic>p</italic></th>
<th align="left"></th>
</tr>
<tr>
<td align="left" colspan="7"><hr/></td>
</tr>
<tr>
<td align="left">F2 max time</td>
<td align="right">94%</td>
<td align="right">35.67 ms</td>
<td align="right">19.43 ms</td>
<td align="right">&#8211;16.49 ms</td>
<td align="right">9.99<italic>e</italic>&#8211;16</td>
<td align="left" valign="top">***</td>
</tr>
<tr>
<td align="left">duration</td>
<td align="right">83%</td>
<td align="right">167.03 ms</td>
<td align="right">130.23 ms</td>
<td align="right">&#8211;37.09 ms</td>
<td align="right">2.49<italic>e</italic>&#8211;09</td>
<td align="left" valign="top">***</td>
</tr>
<tr>
<td align="left">F2 max</td>
<td align="right">75%</td>
<td align="right">2609 Hz</td>
<td align="right">2543 Hz</td>
<td align="right">&#8211;67 Hz</td>
<td align="right">0.00066</td>
<td align="left" valign="top">***</td>
</tr>
<tr>
<td align="left">F2 slope</td>
<td align="right">25%</td>
<td align="right">10.078 Hz/ms</td>
<td align="right">10.791 Hz/ms</td>
<td align="right">+0.769 Hz/ms</td>
<td align="right">0.04965</td>
<td align="left" valign="top">*</td>
</tr>
<tr>
<td align="left">intensity range</td>
<td align="right">31%</td>
<td align="right">5.57 dB</td>
<td align="right">6.57 dB</td>
<td align="right">+0.992 dB</td>
<td align="right">0.03172</td>
<td align="left" valign="top">*</td>
</tr>
<tr>
<td align="left">F1 min</td>
<td align="right">56%</td>
<td align="right">431 Hz</td>
<td align="right">423 Hz</td>
<td align="right">&#8211;6 Hz</td>
<td align="right">0.44118</td>
<td align="right"></td>
</tr>
</table>
</table-wrap>
<p>The expected glide-vowel distinction across the near-minimal word pairs appears borne out in the data along multiple acoustic dimensions. The [j]-expectant counterparts have significantly earlier transitions into the following vowel, as represented by the earliness of F2 max, and significantly shorter durations of the entire vocalic sequence. They also have significantly wider intensity ranges, suggesting that [j] has a lower intensity relative to that of the following vowel. A difference in frontness, as represented by F2 max, is also significant but in the <italic>opposite</italic> direction from that predicted by a place-based representation: [j] has a significantly lower F2 max, and thus a less anterior constriction, than [i]. The measurement of F2 slope also suggests that [j] has a significantly faster transition into the following vowel. F1 min shows no significant effect. Figure <xref ref-type="fig" rid="F2">2</xref> provides visualizations of each measurement&#8217;s results, grouped by expected pronunciation.</p>
<fig id="F2">
<label>Figure 2</label>
<caption><p>Pre-existing lexical items. Visualizations provide box plots across the two expected pronunciation conditions. Points represent the measurement for each speaker&#8217;s utterance of each word (averaged across repetitions and grouped by expected pronunciation alongside the respective box plot). Lines connect each pair&#8217;s counterparts, with a green line representing that [i] &gt; [j] and a red line representing that [i] &lt; [j].</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/6220/file/75126/"/>
</fig>
<p>These results suggest that a [j]-[i] distinction is present between near-minimal pre-existing word pairs and that timing may be the most reliable distinguisher between [j] and [i]: [jV] sequences are shorter than [iV] sequences, seemingly brought about by an earlier and faster transition into [_V]. The results also challenge applying a place-based representation, with a difference in frontness found to be significant but in the reverse direction of that predicted by this representation: [j] is <italic>less</italic> anterior than [i]. This could be a by-product of timing, as discussed earlier (Section 2.3): [j] being faster than [i] could result in more reduction and formant centralization, therefore keeping [j] from reaching its front target (however anterior that target may be) and in this case resulting in a significantly less anterior realization than that of [i]. Results speak in support of applying a constriction-based representation. The intensity range effect suggests that [j] has a lower intensity, relative to that of the following vowel. Furthermore, the lack of a significant reverse effect for F1 min like that observed for F2 max could suggest that the target is actually a higher articulation than [i] but production is reduced due to the faster articulation, though not to a realization lower than (only insignificantly different from) that of [i]. In summary, these results challenge applying a place-based representation and support applying a constriction-based representation, at least regarding the production of pre-existing near-minimal word pairs in American English.</p>
</sec>
<sec>
<title>4.2. Nonce stimuli</title>
<p>This section extends the same analysis to the nonce stimulus data, with analogous linear mixed-effects models of each measurement across the condition of stimulus orthography, where an &lt;i&gt; orthography expects an [i] output and a &lt;y&gt; orthography expects a [j] output. Again, a random effect was specified for each combination of speaker and near-minimal pair. The results are presented in Table <xref ref-type="table" rid="T5">5</xref>, which follows the same format as Table <xref ref-type="table" rid="T4">4</xref> for the pre-existing lexical item results.</p>
<table-wrap id="T5">
<label>Table 5</label>
<caption><p>Results: Nonce stimuli. Descriptive statistics and results of linear mixed-effects models per measurement across the factor of expected pronunciation (in this case, &lt;i&gt; vs. &lt;y&gt; stimulus orthography). Measurements are ordered by their consistency of conveying the distinction&#8212;how far, in either direction, Percent [i] &gt; [j] is from 50%.</p></caption>
<table>
<tr>
<th align="left" valign="top">Measurement</th>
<th align="center" valign="top">Percent [i] &gt; [j]</th>
<th align="center" valign="top">Mean: [i]-expectant</th>
<th align="center" valign="top">Mean: [j]-expectant</th>
<th align="center" valign="top">Coefficient: [j]-expectant</th>
<th align="center" valign="top"><italic>p</italic></th>
<th align="left" valign="top"></th>
</tr>
<tr>
<td align="left" colspan="7"><hr/></td>
</tr>
<tr>
<td align="left">duration</td>
<td align="right">66%</td>
<td align="right">199.89 ms</td>
<td align="right">186.39 ms</td>
<td align="right">&#8211;13.21 ms</td>
<td align="right">6.44<italic>e</italic>&#8211;08</td>
<td align="left" valign="top">***</td>
</tr>
<tr>
<td align="left">F2 max time</td>
<td align="right">66%</td>
<td align="right">23.93 ms</td>
<td align="right">19.59 ms</td>
<td align="right">&#8211;4.38 ms</td>
<td align="right">3.76<italic>e</italic>&#8211;06</td>
<td align="left" valign="top">***</td>
</tr>
<tr>
<td align="left">intensity range</td>
<td align="right">65%</td>
<td align="right">10.06 dB</td>
<td align="right">9.28 dB</td>
<td align="right">&#8211;0.763 dB</td>
<td align="right">0.00014</td>
<td align="left" valign="top">***</td>
</tr>
<tr>
<td align="left">F2 max</td>
<td align="right">59%</td>
<td align="right">2632 Hz</td>
<td align="right">2619 Hz</td>
<td align="right">&#8211;12 Hz</td>
<td align="right">0.08380</td>
<td align="left" valign="top"><sup>&#8226;</sup></td>
</tr>
<tr>
<td align="left">F2 slope</td>
<td align="right">45%</td>
<td align="right">9.249 Hz/ms</td>
<td align="right">9.563 Hz/ms</td>
<td align="right">&#8211;0.278 Hz/ms</td>
<td align="right">0.11019</td>
<td align="right"></td>
</tr>
<tr>
<td align="left">F1 min</td>
<td align="right">51%</td>
<td align="right">371 Hz</td>
<td align="right">386 Hz</td>
<td align="right">&#8211;3 Hz</td>
<td align="right">0.36107</td>
<td align="right"></td>
</tr>
</table>
</table-wrap>
<p>Again, the expected glide-vowel distinction across the near-minimal nonce stimulus pairs appears borne out in the data along multiple acoustic dimensions. The [j]-expectant counterparts have significantly earlier transitions into the following vowel and significantly shorter durations of the entire vocalic sequence. There is again a significant effect on the intensity range; however, this is in the reverse direction ([i]-expectant stimulus utterances show a greater intensity range across the vocalic sequence than [j]-expectant stimulus utterances). It is possible that this is a task effect. Recall that stress placement was explicitly marked in the nonce stimuli and used as a distractor variable during the experiment. Subjects may have been hyperarticulating stress by using a wider-than-normal intensity range to distinguish stressed syllables from unstressed syllables. The hyper-differentiation of intensity between syllables may be overriding any observably lower intensity of [j]. However, this potential for reversal does suggest that intensity may not be the most reliable characteristic of this distinction. The remaining measurements pattern in parallel with the results of the pre-existing lexical items discussed above. F2 max again patterns counter to what would be predicted by a place-based representation, with [j] having a lower F2 max (this time approaching, while not reaching, significance) and therefore a less anterior articulation. Figure <xref ref-type="fig" rid="F3">3</xref> provides visualizations of each measurement&#8217;s results, grouped by expected pronunciation (in this case, stimulus orthography).</p>
<fig id="F3">
<label>Figure 3</label>
<caption><p>Nonce stimuli. Visualizations provide box plots across the two orthography conditions. Points represent the measurement for each speaker&#8217;s utterance of each nonce stimulus (averaged across repetitions and grouped by orthography alongside the respective box plot). Lines connect each pair&#8217;s counterparts, with a green line representing that [i] &gt; [j] and a red line representing that [i] &lt; [j].</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/6220/file/75127/"/>
</fig>
<p>These results suggest that the [j]-[i] distinction observed between pre-existing near-minimally paired lexical items (Section 4.1) is also productively extended to new words, as elicited via the &lt;y&gt; vs. &lt;i&gt; orthographic distinction, and across a wider variety of surrounding environments. They further suggest that transition earliness and overall vocalic sequence duration are the more consistent acoustic dimensions that convey this distinction ([jV] sequences are shorter than [iV] sequences, with the transition into [_V] coming earlier after [j] than after [i]). The nonce stimulus results continue to challenge applying a place-based representation to this case, with [j] again found to have a lower F2 max and therefore a less anterior articulation&#8212;the reverse of that predicted by this representation. However, these results speak less strongly in favor of a constriction-based representation, with [j] now appearing to have a greater intensity with respect to that of the following vowel. How to conclude or proceed based on these observations will be further discussed below (Section 5).</p>
<p>While the central pursuit of the nonce stimulus part of this study is to examine what acoustic characteristics consistently convey this distinction across a more diversified array of surrounding environments, the data may also speak to how those environmental factors constrain the distinction&#8217;s availability. Table <xref ref-type="table" rid="T6">6</xref> reports measurements of what appears to be the most consistent characteristic, transition earliness as measured by F2 max time, across the different environmental conditions of word position and place of articulation of the preceding consonant that were manipulated in the nonce stimuli. Though linear regression modeling of each environmental condition&#8217;s effect (testing for a significant interaction between condition and orthography as a predictor of F2 max time) finds no effect to be significant, the distinction&#8217;s availability does appear to pattern in some expected ways by being realized more consistently in certain conditions. It is apparently more available when the preceding consonant is in medial position rather than initial position. This, as discussed above (Section 2.2), could be due to consonants that would otherwise disprefer sharing a complex onset with the glide being more readily parsed as the coda of the preceding syllable when the consonant is not word-initial. The distinction is also apparently more available when the preceding consonant is coronal or labial and less available when the preceding consonant is dorsal (the strongest factor appearing to limit the distinction&#8217;s availability). This suggests a homorganicity constraint banning [Cj] sequences when the preceding consonant is dorsal (discussed further in Section 5).</p>
<table-wrap id="T6">
<label>Table 6</label>
<caption><p>Descriptive statistics of F2 max time (representing transition earliness) across manipulated conditions of the surrounding environment. Within each factor, conditions are ordered by how consistently the measurement of F2 max time exhibits the distinction&#8212;how far, in either direction, Percent [i] &gt; [j] is from 50%.</p></caption>
<table>
<tr>
<th align="left" valign="bottom">Factor</th>
<th align="left" valign="bottom">Condition</th>
<th align="center" valign="bottom">Percent [i] &gt; [j]</th>
<th align="center" valign="bottom">Mean: [i]-expectant</th>
<th align="center" valign="bottom">Mean: [j]-expectant</th>
<th align="center" valign="bottom">Mean of Differences</th>
</tr>
<tr>
<td align="left" colspan="6"><hr/></td>
</tr>
<tr>
<td align="left">position</td>
<td align="left">medial</td>
<td align="right">71%</td>
<td align="right">22.36 ms</td>
<td align="right">16.62 ms</td>
<td align="right">5.75 ms</td>
</tr>
<tr>
<td align="left"></td>
<td align="left">initial</td>
<td align="right">62%</td>
<td align="right">25.47 ms</td>
<td align="right">22.53 ms</td>
<td align="right">2.94 ms</td>
</tr>
<tr>
<td align="left">C_ place</td>
<td align="left">coronal</td>
<td align="right">71%</td>
<td align="right">26.75 ms</td>
<td align="right">21.18 ms</td>
<td align="right">5.58 ms</td>
</tr>
<tr>
<td align="left"></td>
<td align="left">labial</td>
<td align="right">69%</td>
<td align="right">27.37 ms</td>
<td align="right">22.76 ms</td>
<td align="right">4.61 ms</td>
</tr>
<tr>
<td align="left"></td>
<td align="left">dorsal</td>
<td align="right">53%</td>
<td align="right">11.47 ms</td>
<td align="right">10.17 ms</td>
<td align="right">1.30 ms</td>
</tr>
</table>
</table-wrap>
</sec>
<sec>
<title>4.3. Transition earliness: Absolute vs. relative</title>
<p>Given that transition earliness appears to be the most consistent differentiating characteristic of this distinction, it receives some more nuanced attention here. In the above analyses, transition earliness is treated as an absolute measurement: How many milliseconds after the onset of a [jV]/[iV] sequence is the maximum F2 reached before its descending transition into the following vowel begins? However, we know that the duration of a segment can be influenced by segment-extrinsic factors like speech rate (<xref ref-type="bibr" rid="B26">Goldman-Eisler, 1968</xref>; <xref ref-type="bibr" rid="B28">Grosjean &amp; Lane, 1976</xref>; <xref ref-type="bibr" rid="B21">Gay, 1978</xref>; <xref ref-type="bibr" rid="B35">Hirata, 2004</xref>), whether a segment appears in a stressed syllable (<xref ref-type="bibr" rid="B49">Lindblom, 1963</xref>; <xref ref-type="bibr" rid="B42">Klatt, 1975</xref>), and the crowding of its prosodic environment, such as the number of syllables in a stress group (<xref ref-type="bibr" rid="B10">Crystal &amp; House, 1990</xref>). This consideration might, therefore, motivate examining transition earliness as a relative, proportional measurement rather than an absolute measurement: How far percentage-wise into a [jV]/[iV] sequence (no matter its entire duration) does the transition to [_V] begin? A [jV] sequence may show an average F2 max occurrence time of 20 ms and an average total duration of 185 ms, but the F2 max might be hypothesized to come later than 20 ms after the onset in a longer [jV] sequence of, say, 200 ms total. Therefore, a relativized measurement may capture the distinction better across a diverse array of environments by accounting for this potential variation.</p>
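<p>The arithmetic behind the relative measure can be sketched in a few lines of Python, using the illustrative figures from the paragraph above (an F2 max at 20 ms in a 185 ms sequence, compared against a 200 ms sequence). This is a minimal sketch rather than the analysis code used here; the function names relative_earliness and scaled_absolute_time are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def relative_earliness(f2_max_time_ms, total_duration_ms):
    """F2 max time as a percentage of the whole vocalic sequence duration."""
    if total_duration_ms <= 0:
        raise ValueError("total duration must be positive")
    return 100 * f2_max_time_ms / total_duration_ms


def scaled_absolute_time(relative_percent, total_duration_ms):
    """Absolute F2 max time (ms) implied by a fixed relative position."""
    return relative_percent / 100 * total_duration_ms


# Illustrative figures from the text: F2 max at 20 ms in a 185 ms sequence.
rel = relative_earliness(20, 185)            # roughly 10.8 percent

# If earliness scaled purely proportionally, a 200 ms sequence would be
# expected to reach its F2 max slightly later than 20 ms after onset.
predicted = scaled_absolute_time(rel, 200)   # roughly 21.6 ms
```
]]></preformat>
<p>Under a purely proportional view, the expected absolute F2 max time shifts with total duration, whereas under a purely absolute view it would stay at 20 ms regardless of how long the sequence is.</p>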
<p>On the other hand, effects on segmental duration are not entirely absolute or consistent. Studies examining the effects of speech rate have observed that pauses (<xref ref-type="bibr" rid="B26">Goldman-Eisler, 1968</xref>) and vowels (<xref ref-type="bibr" rid="B44">Kozchevnikov &amp; Chistovich, 1965</xref>; <xref ref-type="bibr" rid="B21">Gay, 1978</xref>) are the main loci of duration changes across different speech rates. Furthermore, Klatt (<xref ref-type="bibr" rid="B41">1973</xref>) argues that segments may have intrinsic absolute durations that become apparent when considering segment-extrinsic influences. When testing a model of segment duration incorporating multiple factors that have been found to influence it, Klatt observes that vowel categories seem to have respective floors of compressibility at which point the model&#8217;s predictions of shorter durations fail. Absolute duration may therefore play an important role in the characterization of a glide as well. As Klatt observes regarding vowels, there could be a floor of compressibility for glides, albeit a shorter one. A duration shorter than that floor might, instead, resemble the very fast formant transition after a consonantal constriction. This is supported at least on perceptual grounds by Ohala&#8217;s (<xref ref-type="bibr" rid="B55">1978</xref>) analysis that a change in Southern Bantu in which palatalized labial stops /p&#690;/ changed to coronal stops /t/ is due to the reanalysis of [&#690;] as a formant transition. (The coronal, as opposed to dorsal, end result may be explained by the labial [p]: The high starting point of the F2 transition that would result from palatalization, and therefore resemble a dorsal transition, could have been mitigated by the low starting point after the labial constriction and therefore result in something in between the two.) 
We might also imagine a ceiling of expandability at which point a glide is no longer glide-like but vowel-like, no matter how long surrounding segments may be. This double-sided bounding of the duration of a glide is also at least perceptually supported by Liberman et al.&#8217;s (<xref ref-type="bibr" rid="B48">1956</xref>) finding that when increasing the speed of formant transition from a high F2 starting point, listeners&#8217; percepts change from [iV] to [jV] and then to [&#609;V].</p>
<p>The data below provide a fuller description of transition earliness. In Table <xref ref-type="table" rid="T7">7</xref>, the timepoint at which F2 max occurs is represented both in Relative (percentage of entire vocalic sequence duration) and Absolute (milliseconds after vocalic onset) terms. Figure <xref ref-type="fig" rid="F4">4</xref> provides a series of Smoothing Spline ANOVA plots (<xref ref-type="bibr" rid="B29">Gu, 2002</xref>) of the F2 contours of two pre-existing lexical item pairs: one pair from among those most similar in overall form (<italic>Estonia</italic> + <italic>pneumonia</italic>), and another pair that is less similar (<italic>millennia</italic> + <italic>Kenya</italic>). These plots are based on 50 equidistant measurements of F2 across the entire vocalic sequence, fitting curves to the datasets being compared and providing Bayesian confidence intervals to determine areas of significant difference between the formant contours.<xref ref-type="fn" rid="n3">3</xref> The Relative versions show the F2 contour in terms proportional to the duration of the entire vocalic sequence, where the x-axis is the ordinal number for each of the 50 equidistant measurements (&#8216;timepoint&#8217;). For the Absolute versions, the x-axis converts these measurement points to absolute time, the number of milliseconds after vocalic sequence onset (&#8216;raw timepoint&#8217;).</p>
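<p>The conversion from ordinal timepoints to raw timepoints can be illustrated with a brief Python sketch. It rests on an assumption not stated in the text, namely that the first of the 50 measurements falls at the onset of the vocalic sequence and the last at its offset; the function name raw_timepoints and the two example durations are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
N_POINTS = 50  # number of equidistant F2 measurements per vocalic sequence


def raw_timepoints(total_duration_ms, n_points=N_POINTS):
    """Convert ordinal timepoints to milliseconds after vocalic onset.

    Assumption (not confirmed by the source): the first measurement sits at
    the onset and the last at the offset of the vocalic sequence.
    """
    if n_points < 2:
        raise ValueError("at least two measurement points are required")
    step = total_duration_ms / (n_points - 1)
    return [k * step for k in range(n_points)]


# Two hypothetical utterances of different total duration.
long_utterance = raw_timepoints(196.0)   # points fall 4 ms apart
short_utterance = raw_timepoints(98.0)   # points fall 2 ms apart

# The same ordinal timepoint (the 25th, index 24) maps to different
# absolute times depending on the utterance's total duration.
same_ordinal = (long_utterance[24], short_utterance[24])
```
]]></preformat>
<p>Because shorter utterances end earlier in absolute time, fewer utterances contribute measurements at late raw timepoints, while every utterance contributes to every ordinal timepoint.</p>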
<table-wrap id="T7">
<label>Table 7</label>
<caption><p>Results: Transition earliness (absolute vs. relative). Descriptive statistics and results of linear mixed-effects models per measurement across the factor of expected pronunciation.</p></caption>
<table>
<tr>
<th align="left"></th>
<th align="center" valign="bottom">Percent [i] &gt; [j]</th>
<th align="center" valign="bottom">Mean: [i]-expectant</th>
<th align="center" valign="bottom">Mean: [j]-expectant</th>
<th align="center" valign="bottom">Coefficient: [j]-expectant</th>
<th align="center" valign="bottom"><italic>p</italic></th>
<th align="left"></th>
</tr>
<tr>
<td align="left" colspan="7"><hr/></td>
</tr>
<tr>
<td align="left">Pre-existing</td>
<td align="right"></td>
<td align="right"></td>
<td align="right"></td>
<td align="right"></td>
<td align="right"></td>
<td align="right"></td>
</tr>
<tr>
<td align="left" colspan="7"><hr/></td>
</tr>
<tr>
<td align="left">absolute</td>
<td align="right">94%</td>
<td align="right">35.67 ms</td>
<td align="right">19.43 ms</td>
<td align="right">&#8211;16.49 ms</td>
<td align="right">9.99<italic>e</italic>&#8211;16</td>
<td align="left" valign="top">***</td>
</tr>
<tr>
<td align="left">relative</td>
<td align="right">89%</td>
<td align="right">20.9%</td>
<td align="right">14.6%</td>
<td align="right">&#8211;6.3%</td>
<td align="right">4.69<italic>e</italic>&#8211;08</td>
<td align="left" valign="top">***</td>
</tr>
<tr>
<td align="left">Nonce</td>
<td align="right"></td>
<td align="right"></td>
<td align="right"></td>
<td align="right"></td>
<td align="right"></td>
<td align="right"></td>
</tr>
<tr>
<td align="left" colspan="7"><hr/></td>
</tr>
<tr>
<td align="left">absolute</td>
<td align="right">66%</td>
<td align="right">23.93 ms</td>
<td align="right">19.59 ms</td>
<td align="right">&#8211;4.38 ms</td>
<td align="right">3.76<italic>e</italic>&#8211;06</td>
<td align="left" valign="top">***</td>
</tr>
<tr>
<td align="left">relative</td>
<td align="right">61%</td>
<td align="right">11.4%</td>
<td align="right">9.8%</td>
<td align="right">&#8211;1.6%</td>
<td align="right">0.00073</td>
<td align="left" valign="top">***</td>
</tr>
</table>
</table-wrap>
<fig id="F4">
<label>Figure 4</label>
<caption><p>Smoothing Spline ANOVA plots. Plots on the left-hand side represent the x-axis in Relative terms, with 50 timepoints (evenly spaced across the entire vocalic sequence duration) expressed ordinally. Plots on the right-hand side are in Absolute terms, with the x-axis converted to the amount of time (ms) between each point and the onset of the vocalic sequence. In each plot, the earlier half is that containing the high front vocoid of primary interest; the latter half represents the following vowel and also some indication of the transition into the initial segment of the following word.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/6220/file/75128/"/>
</fig>
<p>These results demonstrate that both the absolute and relative approaches to the measurement of transition earliness significantly reveal the distinction. However, they also suggest that an absolute approach to measuring transition earliness may be more consistent at capturing it. The results in Table <xref ref-type="table" rid="T7">7</xref> suggest that looking at how many milliseconds after the vocalic sequence&#8217;s onset the F2 max is reached is a more consistent identifier of this distinction than looking at how far proportionally into the entire vocalic sequence&#8217;s duration it&#8217;s reached. Additionally, a more holistic analysis of the F2 contours in Figure <xref ref-type="fig" rid="F4">4</xref> suggests that the contours are not always identified as significantly distinct when relativized to the entire sequence&#8217;s duration, rather than anchored to real time. As discussed above, the relative approach is meant to account for other effects, such as a less exact environmental pairing. However, the absolute approach does seem to still capture the distinction across cases and fares better at doing so in a case of more similar environmental pairing.</p>
<p>Furthermore, when examining the Absolute plots, it is apparent that the confidence intervals become wider toward the end of the contour (the right side). This is due to variation in the duration of the entire sequence: When some utterances have shorter durations, fewer measurement points are available at later times for calculating the confidence interval. So the entire sequence duration seems to vary, but the [j]-[i] distinction is still apparent when examined in absolute terms. The combination of these observations suggests that this variation of the entire sequence&#8217;s duration may be more attributable to varying duration of the following vowel (corroborating findings mentioned above [e.g., <xref ref-type="bibr" rid="B44">Kozchevnikov &amp; Chistovich, 1965</xref>; <xref ref-type="bibr" rid="B21">Gay, 1978</xref>]). This somewhat orthogonal behavior lends support to the notion that, in [jV] and [iV] sequences, the first and second segments are truly separate concatenated segments rather than parts of a single complex unit.</p>
</sec>
</sec>
<sec>
<title>5. Discussion and conclusions</title>
<p>The results of this study suggest that there is a distinction between the [j] glide and [i] vowel available to native speakers of American English. This is elicited in utterances of near-minimally paired lexical items. It is also extended productively to nonce stimuli, elicited solely by orthography. The analysis identifies which acoustic aspects play a consistent role in the production of this distinction. The glide appears to be most consistently characterized by an earlier transition to the following vowel and, likely as a result, a shorter overall duration of the vocalic sequence. Results from the pre-existing word pairs also suggest that the glide has a lower acoustic intensity, though this effect was reversed in the nonce stimulus production task (possibly as a task effect due to increased focus on stress placement). And while [j] is not shown to have a significantly higher and tighter lingual articulation (i.e., there is a non-significant difference in F1 min), neither is it shown to have a significantly lower and more open articulation. On the other hand, acoustic measurements do show a significant difference in articulatory frontness (measured by F2 max), suggesting that [j] is significantly less anterior (with a lower F2 max) than [i].</p>
<p>This further understanding of the acoustic character of this distinction serves us in multiple ways. It documents the distinction and aids in future approaches to identifying and segmenting it, which may help improve and increase its future documentation and allow for more robust analysis of its distribution and variability. The acoustic characterization can also contribute to the choice between different phonological representations considered. The finding that [j] has a significantly less anterior articulation supports ruling out a place-based representation (such as that proposed by Levi [<xref ref-type="bibr" rid="B45">2004</xref>, <xref ref-type="bibr" rid="B46">2008</xref>] or a hybrid of it like that proposed by Nevins and Chitoran [<xref ref-type="bibr" rid="B54">2008</xref>]), which would have predicted [j] to be significantly <italic>more</italic> anterior than [i]. While the timing-related aspect of transition earliness was identified as the most consistent, this does not necessarily support applying a pre-linking account and ruling out a constriction-based representation, as all accounts considered generate such a timing-related difference as a by-product of syllabification. The observation that [j] is significantly less anterior could be attributable to reduction and centralization caused by the faster articulation of the glide. This could motivate us to update our predictions regarding the other formant measure of F1, leading us to consider only the stronger threshold of a parallel reverse effect (a significantly higher F1 min and therefore lower, more open lingual articulation) as support for ruling out a constriction-based representation and concluding with a pre-linking account. 
That is, if the significantly less anterior articulation of [j] is attributable to reduction due to its faster articulation, the lack of a significantly lower articulation could suggest a higher underlying target (or at least stronger resistance to reduction along the dimension of constriction height than anteriority). This is coupled with the fact that, at least in the pre-existing word pairs, [j] is found to have a lower acoustic intensity. The results of this study are therefore still consistent with a constriction-based representation of the distinction.</p>
<p>Furthermore, characterizing the acoustics of this distinction may further our understanding of its phonological distribution. As discussed at the beginning of this paper (Section 2.2), and suggested by the results (Table <xref ref-type="table" rid="T6">6</xref>), aspects of the surrounding environment appear to constrain the availability of this distinction. For example, the place of articulation of the preceding consonant appears to constrain the glide&#8217;s appearance ([j] seems dispreferred when following a dorsal consonant). This constraint on the distribution of this distinction may be explained as a result of the distinction&#8217;s acoustic realization and its apparent reliance on transition earliness. As discussed in Section 4.3, while the distinction requires [j] to have an earlier transition to the following vowel, if it is too early and fast, there is potential for [j] to be misperceived or reanalyzed as the formant transition cue in a [CV] sequence with a dorsal consonant. This explanation would also hold for the apparent dispreference of [w] after labial constrictions (<xref ref-type="bibr" rid="B9">Clements &amp; Keyser, 1983</xref>).</p>
<p>This study motivates several further directions of inquiry. One is to examine the perception of this distinction, both in terms of cueing and contrast. The analysis above examines acoustic measurements as characteristics of this distinction: What details of the acoustic signal exhibit significant differences across productions of these apparently distinct categories? Some characteristics (e.g., transition earliness and duration) appear more consistent and reliable than others (e.g., intensity). It would be helpful to know whether this characterization of the distinction&#8217;s <italic>production</italic> reasonably reflects how it is cued in the <italic>perception</italic> of human listeners: Do the same characteristics play the same roles as cues in the listener&#8217;s perceptual distinction of glides from vowels? A perception experiment cross-manipulating these acoustic dimensions and testing them as predictors of participants&#8217; responses could provide a useful comparison to the observations made here. A finding that intensity and F1 play a strong role in perceptually cueing the distinction could further strengthen our confidence in selecting a constriction-based approach as the optimal representation. Furthermore, manipulating the duration of surrounding segments to test for boundary shifts of this distinction (e.g., <xref ref-type="bibr" rid="B2">Ainsworth, 1974</xref>; <xref ref-type="bibr" rid="B34">Hirata, 1990</xref>) could speak further to the question of how absolute or relative the cue of transition earliness is (Section 4.3): Is there a duration beyond which a high front vocoid is categorically considered a vowel rather than a glide, irrespective of how long the following vowel may be? Also, while this study suggests a <italic>distinction</italic> that can be produced by speakers of American English, it does not necessarily demonstrate a <italic>contrastive</italic> function of it in the grammar. 
That is, none of the lexical item or nonce stimulus pairs tested here are exact minimal pairs (only near-minimal). Does this distinction have the potential to bear a contrastive load? Further experimentation could test if American English speakers can use this distinction to recoverably contrast minimally paired nonce words.</p>
<p>Another extension of this study would be to analyze the acoustic character of glide-vowel distinctions in other languages, such as those documented by the many studies cited throughout this paper. This study&#8217;s results are only intended to shed light on which representation may be most plausible (or at least to rule some candidates out) for the distinction apparent in the American English phonological system under consideration. It is possible that languages previously argued on more phonological grounds to be best represented with the other approaches considered do in fact cue the distinction differently, with acoustic characterizations in line with those predicted by the respective representations. This approach of acoustic characterization is further applicable to the analysis of any distinction for which there is a diverse suite of potential acoustic cues. And, as employed here, that acoustic characterization may be useful in comparing the acoustic predictions generated by competing phonological representations of such a distinction and therefore in adjudicating between them. Further such analysis will contribute to the ongoing broader question of how interwoven or disconnected phonological representation and phonetic realization can be (e.g., <xref ref-type="bibr" rid="B58">Pierrehumbert, 1990</xref>; <xref ref-type="bibr" rid="B33">Hayes et al., 2004</xref>; <xref ref-type="bibr" rid="B65">Smith, 2005</xref>).</p>
</sec>
</body>
<back>
<fn-group>
<fn id="n1"><p>While the word-initial patterning of [jV] vs. [iV] appears to be conflated with stress, one possible exception, with [iV] hiatus in which the following vowel rather than the initial vowel is stressed, is the name <italic>Iago</italic> [i&#593;&#769;&#609;o]. However, this name is not highly frequent in English and could be assigned something akin to a loanword status, possibly allowing for phonological exceptionality (<xref ref-type="bibr" rid="B37">It&#244; and Mester, 1999</xref>; <xref ref-type="bibr" rid="B66">Smith, 2006</xref>, <xref ref-type="bibr" rid="B67">2009</xref>).</p></fn>
<fn id="n2"><p>The pattern of [w] being dispreferred after labial consonants is not exceptionless. Some Spanish loanwords such as <italic>Buena Vista</italic> [bw&#603;n&#601;v&#618;&#769;st&#601;] maintain [w] after a labial consonant in their adaptations, though other loan adaptations do still exhibit this constraint, such as <italic>Puerto Rico</italic> [p&#596;&#633;&#638;&#601;&#633;i&#769;ko] (cf. [pw&#8230;]) or the variable adaptation of French <italic>voila</italic> as [w&#593;l&#593;&#769;] (cf. [vw&#8230;]).</p></fn>
<fn id="n3"><p>Smoothing Spline ANOVA analysis was first used in linguistics by Davidson (<xref ref-type="bibr" rid="B11">2006</xref>) in the analysis of tongue shapes imaged by ultrasound to examine differences in coarticulation. De Decker and Nycz (<xref ref-type="bibr" rid="B14">2006</xref>) extended the use of this analytical tool to the study of vowel formants, finding that analyzing temporal formant contours in this way reveals differences between vowel categories and dialectal category variants that single-point or timespan-averaged analyses might otherwise miss. Further work has employed this method in the analysis of vowels and diphthongs (<xref ref-type="bibr" rid="B43">Koops, 2010</xref>; <xref ref-type="bibr" rid="B7">Chanethom, 2011</xref>).</p></fn>
</fn-group>
<ack>
<title>Acknowledgements</title>
<p>I would like to thank Lisa Davidson, Maria Gouskova, and Frans Adriaans for their valuable feedback at many stages of this research. Many additional thanks go to Susannah Levi, Suzy Ahn, Sean Martin, Becky Laturnus, members of the NYU Phonetics and Experimental Phonology Lab, and audiences at the 170th meeting of the Acoustical Society of America and the 2017 annual meeting of the Linguistic Society of America for their feedback and discussion. I am grateful to the anonymous reviewers, whose comments regarding this paper&#8217;s earlier manuscript led to significant improvement. I would also like to thank the participants who provided their time and their voices for analysis.</p>
</ack>
<sec>
<title>Competing Interests</title>
<p>The author has no competing interests to declare.</p>
</sec>
<ref-list>
<ref id="B1">
<label>1</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aguilar</surname>
<given-names>L.</given-names>
</name>
</person-group>
<article-title>Hiatus and diphthong: Acoustic cues and speech situation differences</article-title>
<source>Speech Communication</source>
<year iso-8601-date="1999">1999</year>
<volume>28</volume>
<issue>1</issue>
<fpage>57</fpage>
<lpage>74</lpage>
<pub-id pub-id-type="doi">10.1016/S0167-6393(99)00003-5</pub-id>
</element-citation>
</ref>
<ref id="B2">
<label>2</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ainsworth</surname>
<given-names>W.</given-names>
</name>
</person-group>
<article-title>The influence of precursive sequences on the perception of synthesized vowels</article-title>
<source>Language and Speech</source>
<year iso-8601-date="1974">1974</year>
<volume>17</volume>
<issue>2</issue>
<fpage>103</fpage>
<lpage>109</lpage>
<pub-id pub-id-type="doi">10.1177/002383097401700201</pub-id>
</element-citation>
</ref>
<ref id="B3">
<label>3</label>
<element-citation publication-type="webpage">
<person-group person-group-type="author">
<name>
<surname>Boersma</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Weenink</surname>
<given-names>D.</given-names>
</name>
</person-group>
<article-title>Praat: Doing phonetics by computer [computer program]</article-title>
<year iso-8601-date="2015">2015</year>
<comment>version 5.3.77. URL: <uri>http://www.fon.hum.uva.nl/praat/</uri></comment>
</element-citation>
</ref>
<ref id="B4">
<label>4</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Bright</surname>
<given-names>W.</given-names>
</name>
</person-group>
<source>The Karok language</source>
<year iso-8601-date="1957">1957</year>
<publisher-loc>Berkeley and Los Angeles</publisher-loc>
<publisher-name>University of California Press</publisher-name>
</element-citation>
</ref>
<ref id="B5">
<label>5</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Browman</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Goldstein</surname>
<given-names>L.</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Kingston</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Beckman</surname>
<given-names>M.</given-names>
</name>
</person-group>
<chapter-title>Tiers in articulatory phonology, with some implications for casual speech</chapter-title>
<source>Papers in Laboratory Phonology I: Between the Grammar and Physics of Speech</source>
<year iso-8601-date="1990">1990</year>
<publisher-name>Cambridge University Press</publisher-name>
<fpage>341</fpage>
<lpage>376</lpage>
<pub-id pub-id-type="doi">10.1017/CBO9780511627736.019</pub-id>
</element-citation>
</ref>
<ref id="B6">
<label>6</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Catford</surname>
<given-names>J.</given-names>
</name>
</person-group>
<source>Fundamental problems in phonetics</source>
<year iso-8601-date="1977">1977</year>
<publisher-loc>Bloomington</publisher-loc>
<publisher-name>Indiana University Press</publisher-name>
</element-citation>
</ref>
<ref id="B7">
<label>7</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chanethom</surname>
<given-names>V.</given-names>
</name>
</person-group>
<article-title>Dynamic differences in the production of diphthongs by French-English bilingual children</article-title>
<source>The Journal of the Acoustical Society of America</source>
<year iso-8601-date="2011">2011</year>
<volume>130</volume>
<issue>4</issue>
<fpage>2522</fpage>
<lpage>2522</lpage>
<pub-id pub-id-type="doi">10.1121/1.3655063</pub-id>
</element-citation>
</ref>
<ref id="B8">
<label>8</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chitoran</surname>
<given-names>I.</given-names>
</name>
</person-group>
<article-title>A perception-production study of Romanian diphthongs and glide-vowel sequences</article-title>
<source>Journal of the International Phonetic Association</source>
<year iso-8601-date="2002">2002</year>
<volume>32</volume>
<issue>2</issue>
<fpage>203</fpage>
<lpage>222</lpage>
<pub-id pub-id-type="doi">10.1017/S0025100302001044</pub-id>
</element-citation>
</ref>
<ref id="B9">
<label>9</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Clements</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Keyser</surname>
<given-names>S.</given-names>
</name>
</person-group>
<source>CV phonology: A generative theory of the syllable</source>
<year iso-8601-date="1983">1983</year>
<publisher-loc>Cambridge, MA</publisher-loc>
<publisher-name>MIT Press</publisher-name>
</element-citation>
</ref>
<ref id="B10">
<label>10</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Crystal</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>House</surname>
<given-names>A.</given-names>
</name>
</person-group>
<article-title>Articulation rate and the duration of syllables and stress groups in connected speech</article-title>
<source>The Journal of the Acoustical Society of America</source>
<year iso-8601-date="1990">1990</year>
<volume>88</volume>
<issue>1</issue>
<fpage>101</fpage>
<lpage>112</lpage>
<pub-id pub-id-type="doi">10.1121/1.399955</pub-id>
</element-citation>
</ref>
<ref id="B11">
<label>11</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Davidson</surname>
<given-names>L.</given-names>
</name>
</person-group>
<article-title>Comparing tongue shapes from ultrasound imaging using Smoothing Spline Analysis of Variance</article-title>
<source>The Journal of the Acoustical Society of America</source>
<year iso-8601-date="2006">2006</year>
<volume>120</volume>
<issue>1</issue>
<fpage>407</fpage>
<lpage>415</lpage>
<pub-id pub-id-type="doi">10.1121/1.2205133</pub-id>
</element-citation>
</ref>
<ref id="B12">
<label>12</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Davidson</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Erker</surname>
<given-names>D.</given-names>
</name>
</person-group>
<article-title>Hiatus resolution in American English: The case against glide insertion</article-title>
<source>Language</source>
<year iso-8601-date="2014">2014</year>
<volume>90</volume>
<issue>2</issue>
<fpage>482</fpage>
<lpage>514</lpage>
<pub-id pub-id-type="doi">10.1353/lan.2014.0028</pub-id>
</element-citation>
</ref>
<ref id="B13">
<label>13</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Davis</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Hammond</surname>
<given-names>M.</given-names>
</name>
</person-group>
<article-title>On the status of onglides in American English</article-title>
<source>Phonology</source>
<year iso-8601-date="1995">1995</year>
<volume>12</volume>
<fpage>159</fpage>
<lpage>182</lpage>
<pub-id pub-id-type="doi">10.1017/S0952675700002463</pub-id>
</element-citation>
</ref>
<ref id="B14">
<label>14</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>De Decker</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Nycz</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>A new way of analyzing vowels: Comparing formant contours using Smoothing Spline ANOVA</article-title>
<conf-name>Poster presented at New Ways of Analyzing Variation (NWAV)</conf-name>
<conf-date>9&#8211;12 November</conf-date>
<year iso-8601-date="2006">2006</year>
<conf-loc>Columbus, OH</conf-loc>
<volume>35</volume>
</element-citation>
</ref>
<ref id="B15">
<label>15</label>
<element-citation publication-type="thesis">
<person-group person-group-type="author">
<name>
<surname>Deligiorgis</surname>
<given-names>I.</given-names>
</name>
</person-group>
<source>Glides and syllables (Doctoral dissertation)</source>
<year iso-8601-date="1988">1988</year>
<publisher-name>University of Iowa</publisher-name>
</element-citation>
</ref>
<ref id="B16">
<label>16</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Durand</surname>
<given-names>J.</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Anderson</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Durand</surname>
<given-names>J.</given-names>
</name>
</person-group>
<chapter-title>On the phonological status of glides: The evidence from Malay</chapter-title>
<source>Explorations in Dependency Phonology</source>
<year iso-8601-date="1987">1987</year>
<publisher-loc>Dordrecht, Holland</publisher-loc>
<publisher-name>Foris Publications</publisher-name>
<fpage>79</fpage>
<lpage>107</lpage>
</element-citation>
</ref>
<ref id="B17">
<label>17</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Espy-Wilson</surname>
<given-names>C.</given-names>
</name>
</person-group>
<article-title>Acoustic measures for linguistic features distinguishing the semivowels /wjrl/ in American English</article-title>
<source>The Journal of the Acoustical Society of America</source>
<year iso-8601-date="1992">1992</year>
<volume>92</volume>
<issue>2</issue>
<fpage>736</fpage>
<lpage>757</lpage>
<pub-id pub-id-type="doi">10.1121/1.403998</pub-id>
</element-citation>
</ref>
<ref id="B18">
<label>18</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fowler</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Brown</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Sabadini</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Weihing</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Rapid access to speech gestures in perception: Evidence from choice and simple response time tasks</article-title>
<source>Journal of Memory and Language</source>
<year iso-8601-date="2003">2003</year>
<volume>49</volume>
<issue>3</issue>
<fpage>396</fpage>
<lpage>413</lpage>
<pub-id pub-id-type="doi">10.1016/S0749-596X(03)00072-X</pub-id>
</element-citation>
</ref>
<ref id="B19">
<label>19</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gafos</surname>
<given-names>A.</given-names>
</name>
</person-group>
<article-title>A grammar of gestural coordination</article-title>
<source>Natural Language &amp; Linguistic Theory</source>
<year iso-8601-date="2002">2002</year>
<volume>20</volume>
<issue>2</issue>
<fpage>269</fpage>
<lpage>337</lpage>
<pub-id pub-id-type="doi">10.1023/A:1014942312445</pub-id>
</element-citation>
</ref>
<ref id="B20">
<label>20</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gay</surname>
<given-names>T.</given-names>
</name>
</person-group>
<article-title>Effect of speaking rate on diphthong formant movements</article-title>
<source>The Journal of the Acoustical Society of America</source>
<year iso-8601-date="1968">1968</year>
<volume>44</volume>
<issue>6</issue>
<fpage>1570</fpage>
<lpage>1573</lpage>
<pub-id pub-id-type="doi">10.1121/1.1911298</pub-id>
</element-citation>
</ref>
<ref id="B21">
<label>21</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gay</surname>
<given-names>T.</given-names>
</name>
</person-group>
<article-title>Effect of speaking rate on vowel formant movements</article-title>
<source>The Journal of the Acoustical Society of America</source>
<year iso-8601-date="1978">1978</year>
<volume>63</volume>
<issue>1</issue>
<fpage>223</fpage>
<lpage>230</lpage>
<pub-id pub-id-type="doi">10.1121/1.381717</pub-id>
</element-citation>
</ref>
<ref id="B22">
<label>22</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gay</surname>
<given-names>T.</given-names>
</name>
</person-group>
<article-title>Mechanisms in the control of speech rate</article-title>
<source>Phonetica</source>
<year iso-8601-date="1981">1981</year>
<volume>38</volume>
<issue>1&#8211;3</issue>
<fpage>148</fpage>
<lpage>158</lpage>
<pub-id pub-id-type="doi">10.1159/000260020</pub-id>
</element-citation>
</ref>
<ref id="B23">
<label>23</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gentilucci</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Bernardis</surname>
<given-names>P.</given-names>
</name>
</person-group>
<article-title>Imitation during phoneme production</article-title>
<source>Neuropsychologia</source>
<year iso-8601-date="2007">2007</year>
<volume>45</volume>
<issue>3</issue>
<fpage>608</fpage>
<lpage>615</lpage>
<pub-id pub-id-type="doi">10.1016/j.neuropsychologia.2006.04.004</pub-id>
</element-citation>
</ref>
<ref id="B24">
<label>24</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gick</surname>
<given-names>B.</given-names>
</name>
</person-group>
<article-title>The use of ultrasound for linguistic phonetic fieldwork</article-title>
<source>Journal of the International Phonetic Association</source>
<year iso-8601-date="2002">2002</year>
<volume>32</volume>
<issue>2</issue>
<fpage>113</fpage>
<lpage>121</lpage>
<pub-id pub-id-type="doi">10.1017/S0025100302001007</pub-id>
</element-citation>
</ref>
<ref id="B25">
<label>25</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Goldinger</surname>
<given-names>S.</given-names>
</name>
</person-group>
<article-title>Echoes of echoes? An episodic theory of lexical access</article-title>
<source>Psychological Review</source>
<year iso-8601-date="1998">1998</year>
<volume>105</volume>
<issue>2</issue>
<fpage>251</fpage>
<lpage>279</lpage>
<pub-id pub-id-type="doi">10.1037/0033-295X.105.2.251</pub-id>
</element-citation>
</ref>
<ref id="B26">
<label>26</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Goldman-Eisler</surname>
<given-names>F.</given-names>
</name>
</person-group>
<source>Psycholinguistics: Experiments in spontaneous speech</source>
<year iso-8601-date="1968">1968</year>
<publisher-loc>London</publisher-loc>
<publisher-name>Academic Press</publisher-name>
</element-citation>
</ref>
<ref id="B27">
<label>27</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gouskova</surname>
<given-names>M.</given-names>
</name>
</person-group>
<article-title>Relational hierarchies in Optimality Theory: The case of syllable contact</article-title>
<source>Phonology</source>
<year iso-8601-date="2004">2004</year>
<volume>21</volume>
<issue>2</issue>
<fpage>201</fpage>
<lpage>250</lpage>
<pub-id pub-id-type="doi">10.1017/S095267570400020X</pub-id>
</element-citation>
</ref>
<ref id="B28">
<label>28</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grosjean</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Lane</surname>
<given-names>H.</given-names>
</name>
</person-group>
<article-title>How the listener integrates the components of speaking rate</article-title>
<source>Journal of Experimental Psychology: Human Perception and Performance</source>
<year iso-8601-date="1976">1976</year>
<volume>2</volume>
<issue>4</issue>
<fpage>538</fpage>
<lpage>543</lpage>
<pub-id pub-id-type="doi">10.1037/0096-1523.2.4.538</pub-id>
</element-citation>
</ref>
<ref id="B29">
<label>29</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Gu</surname>
<given-names>C.</given-names>
</name>
</person-group>
<source>Smoothing Spline ANOVA Models</source>
<year iso-8601-date="2002">2002</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>Springer</publisher-name>
<pub-id pub-id-type="doi">10.1007/978-1-4614-5369-7</pub-id>
</element-citation>
</ref>
<ref id="B30">
<label>30</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Halle</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Vaux</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Wolfe</surname>
<given-names>A.</given-names>
</name>
</person-group>
<article-title>On feature spreading and the representation of place of articulation</article-title>
<source>Linguistic Inquiry</source>
<year iso-8601-date="2000">2000</year>
<volume>31</volume>
<issue>3</issue>
<fpage>387</fpage>
<lpage>444</lpage>
<pub-id pub-id-type="doi">10.1162/002438900554398</pub-id>
</element-citation>
</ref>
<ref id="B31">
<label>31</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Harris</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Kaisse</surname>
<given-names>E.</given-names>
</name>
</person-group>
<article-title>Palatal vowels, glides, and obstruents in Argentinean Spanish</article-title>
<source>Phonology</source>
<year iso-8601-date="1999">1999</year>
<volume>16</volume>
<fpage>117</fpage>
<lpage>190</lpage>
<pub-id pub-id-type="doi">10.1017/S0952675799003735</pub-id>
</element-citation>
</ref>
<ref id="B32">
<label>32</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hayes</surname>
<given-names>B.</given-names>
</name>
</person-group>
<article-title>Compensatory lengthening in moraic phonology</article-title>
<source>Linguistic Inquiry</source>
<year iso-8601-date="1989">1989</year>
<volume>20</volume>
<issue>2</issue>
<fpage>253</fpage>
<lpage>306</lpage>
</element-citation>
</ref>
<ref id="B33">
<label>33</label>
<element-citation publication-type="book">
<person-group person-group-type="editor">
<name>
<surname>Hayes</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Kirchner</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Steriade</surname>
<given-names>D.</given-names>
</name>
</person-group>
<source>Phonetically based phonology</source>
<year iso-8601-date="2004">2004</year>
<publisher-name>Cambridge University Press</publisher-name>
<pub-id pub-id-type="doi">10.1017/CBO9780511486401</pub-id>
</element-citation>
</ref>
<ref id="B34">
<label>34</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hirata</surname>
<given-names>Y.</given-names>
</name>
</person-group>
<article-title>Perception of geminated stops in Japanese word and sentence levels</article-title>
<source>The Bulletin of the Phonetic Society of Japan</source>
<year iso-8601-date="1990">1990</year>
<volume>194</volume>
<fpage>23</fpage>
<lpage>28</lpage>
</element-citation>
</ref>
<ref id="B35">
<label>35</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hirata</surname>
<given-names>Y.</given-names>
</name>
</person-group>
<article-title>Effects of speaking rate on the vowel length distinction in Japanese</article-title>
<source>Journal of Phonetics</source>
<year iso-8601-date="2004">2004</year>
<volume>32</volume>
<issue>4</issue>
<fpage>565</fpage>
<lpage>589</lpage>
<pub-id pub-id-type="doi">10.1016/j.wocn.2004.02.004</pub-id>
</element-citation>
</ref>
<ref id="B36">
<label>36</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Hyman</surname>
<given-names>L.</given-names>
</name>
</person-group>
<source>A theory of phonological weight</source>
<year iso-8601-date="1985">1985</year>
<publisher-loc>Dordrecht, Holland</publisher-loc>
<publisher-name>Foris Publications</publisher-name>
</element-citation>
</ref>
<ref id="B37">
<label>37</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>It&#244;</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Mester</surname>
<given-names>A.</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Tsujimura</surname>
<given-names>N.</given-names>
</name>
</person-group>
<chapter-title>The phonological lexicon</chapter-title>
<source>Handbook of Japanese Linguistics</source>
<year iso-8601-date="1999">1999</year>
<publisher-loc>Oxford</publisher-loc>
<publisher-name>Blackwell</publisher-name>
<fpage>62</fpage>
<lpage>100</lpage>
</element-citation>
</ref>
<ref id="B38">
<label>38</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Jensen</surname>
<given-names>J.</given-names>
</name>
</person-group>
<chapter-title>Segmental Phonology</chapter-title>
<source>English Phonology</source>
<year iso-8601-date="1993">1993</year>
<publisher-loc>Amsterdam</publisher-loc>
<publisher-name>John Benjamins</publisher-name>
<fpage>25</fpage>
<lpage>45</lpage>
<pub-id pub-id-type="doi">10.1075/cilt.99</pub-id>
</element-citation>
</ref>
<ref id="B39">
<label>39</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Kaye</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Lowenstamm</surname>
<given-names>J.</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Dell</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Hirst</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Vergnaud</surname>
<given-names>J.-R.</given-names>
</name>
</person-group>
<chapter-title>De la syllabicit&#233;</chapter-title>
<source>Forme sonore du langage</source>
<year iso-8601-date="1984">1984</year>
<publisher-loc>Paris</publisher-loc>
<publisher-name>Hermann</publisher-name>
<fpage>123</fpage>
<lpage>159</lpage>
</element-citation>
</ref>
<ref id="B40">
<label>40</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Keating</surname>
<given-names>P.</given-names>
</name>
</person-group>
<article-title>Palatals as complex segments: X-ray evidence</article-title>
<source>UCLA Working Papers in Phonetics</source>
<year iso-8601-date="1988">1988</year>
<volume>69</volume>
<fpage>77</fpage>
<lpage>91</lpage>
</element-citation>
</ref>
<ref id="B41">
<label>41</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klatt</surname>
<given-names>D.</given-names>
</name>
</person-group>
<article-title>Interaction between two factors that influence vowel duration</article-title>
<source>The Journal of the Acoustical Society of America</source>
<year iso-8601-date="1973">1973</year>
<volume>54</volume>
<issue>4</issue>
<fpage>1102</fpage>
<lpage>1104</lpage>
<pub-id pub-id-type="doi">10.1121/1.1914322</pub-id>
</element-citation>
</ref>
<ref id="B42">
<label>42</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klatt</surname>
<given-names>D.</given-names>
</name>
</person-group>
<article-title>Vowel lengthening is syntactically determined in a connected discourse</article-title>
<source>Journal of Phonetics</source>
<year iso-8601-date="1975">1975</year>
<volume>3</volume>
<issue>3</issue>
<fpage>129</fpage>
<lpage>140</lpage>
</element-citation>
</ref>
<ref id="B43">
<label>43</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koops</surname>
<given-names>C.</given-names>
</name>
</person-group>
<article-title>/u/-fronting is not monolithic: Two types of fronted /u/ in Houston Anglos</article-title>
<source>University of Pennsylvania Working Papers in Linguistics</source>
<year iso-8601-date="2010">2010</year>
<volume>16</volume>
<issue>2</issue>
<fpage>14</fpage>
</element-citation>
</ref>
<ref id="B44">
<label>44</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Kozhevnikov</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Chistovich</surname>
<given-names>L.</given-names>
</name>
</person-group>
<source>Speech: Articulation and perception [translated]</source>
<year iso-8601-date="1965">1965</year>
<publisher-loc>Washington, DC</publisher-loc>
<publisher-name>Joint Publications Research Service</publisher-name>
</element-citation>
</ref>
<ref id="B45">
<label>45</label>
<element-citation publication-type="thesis">
<person-group person-group-type="author">
<name>
<surname>Levi</surname>
<given-names>S.</given-names>
</name>
</person-group>
<source>The representation of underlying glides: A cross-linguistic study (Doctoral dissertation)</source>
<year iso-8601-date="2004">2004</year>
<publisher-name>University of Washington</publisher-name>
</element-citation>
</ref>
<ref id="B46">
<label>46</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Levi</surname>
<given-names>S.</given-names>
</name>
</person-group>
<article-title>Phonemic vs. derived glides</article-title>
<source>Lingua</source>
<year iso-8601-date="2008">2008</year>
<volume>118</volume>
<issue>12</issue>
<fpage>1956</fpage>
<lpage>1978</lpage>
<pub-id pub-id-type="doi">10.1016/j.lingua.2007.10.003</pub-id>
</element-citation>
</ref>
<ref id="B47">
<label>47</label>
<element-citation publication-type="thesis">
<person-group person-group-type="author">
<name>
<surname>Levin</surname>
<given-names>J.</given-names>
</name>
</person-group>
<source>A metrical theory of syllabicity (Doctoral dissertation)</source>
<year iso-8601-date="1985">1985</year>
<publisher-name>Massachusetts Institute of Technology</publisher-name>
</element-citation>
</ref>
<ref id="B48">
<label>48</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liberman</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Delattre</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Gerstman</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>F.</given-names>
</name>
</person-group>
<article-title>Tempo of frequency change as a cue for distinguishing classes of speech sounds</article-title>
<source>Journal of Experimental Psychology</source>
<year iso-8601-date="1956">1956</year>
<volume>52</volume>
<issue>2</issue>
<fpage>127</fpage>
<lpage>137</lpage>
<pub-id pub-id-type="doi">10.1037/h0041240</pub-id>
</element-citation>
</ref>
<ref id="B49">
<label>49</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lindblom</surname>
<given-names>B.</given-names>
</name>
</person-group>
<article-title>Spectrographic study of vowel reduction</article-title>
<source>The Journal of the Acoustical Society of America</source>
<year iso-8601-date="1963">1963</year>
<volume>35</volume>
<issue>11</issue>
<fpage>1773</fpage>
<lpage>1781</lpage>
<pub-id pub-id-type="doi">10.1121/1.1918816</pub-id>
</element-citation>
</ref>
<ref id="B50">
<label>50</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lindblom</surname>
<given-names>B.</given-names>
</name>
</person-group>
<article-title>Temporal organization of syllable production</article-title>
<source>Speech Transmission Laboratory Quarterly Progress and Status Report</source>
<year iso-8601-date="1968">1968</year>
<volume>2</volume>
<issue>3</issue>
</element-citation>
</ref>
<ref id="B51">
<label>51</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Maddieson</surname>
<given-names>I.</given-names>
</name>
</person-group>
<article-title>Glides and gemination</article-title>
<source>Lingua</source>
<year iso-8601-date="2008">2008</year>
<volume>118</volume>
<issue>12</issue>
<fpage>1926</fpage>
<lpage>1936</lpage>
<pub-id pub-id-type="doi">10.1016/j.lingua.2007.10.005</pub-id>
</element-citation>
</ref>
<ref id="B52">
<label>52</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Maddieson</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Emmorey</surname>
<given-names>K.</given-names>
</name>
</person-group>
<article-title>Relationship between semivowels and vowels: Cross-linguistic investigations of acoustic difference and coarticulation</article-title>
<source>Phonetica</source>
<year iso-8601-date="1985">1985</year>
<volume>42</volume>
<issue>4</issue>
<fpage>163</fpage>
<lpage>174</lpage>
<pub-id pub-id-type="doi">10.1159/000261748</pub-id>
</element-citation>
</ref>
<ref id="B53">
<label>53</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Namy</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Nygaard</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Sauerteig</surname>
<given-names>D.</given-names>
</name>
</person-group>
<article-title>Gender differences in vocal accommodation: The role of perception</article-title>
<source>Journal of Language and Social Psychology</source>
<year iso-8601-date="2002">2002</year>
<volume>21</volume>
<issue>4</issue>
<fpage>422</fpage>
<lpage>432</lpage>
<pub-id pub-id-type="doi">10.1177/026192702237958</pub-id>
</element-citation>
</ref>
<ref id="B54">
<label>54</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nevins</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Chitoran</surname>
<given-names>I.</given-names>
</name>
</person-group>
<article-title>Phonological representations and the variable patterning of glides</article-title>
<source>Lingua</source>
<year iso-8601-date="2008">2008</year>
<volume>118</volume>
<issue>12</issue>
<fpage>1979</fpage>
<lpage>1997</lpage>
<pub-id pub-id-type="doi">10.1016/j.lingua.2007.10.006</pub-id>
</element-citation>
</ref>
<ref id="B55">
<label>55</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Ohala</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Southern Bantu vs. the world: The case of palatalization of labials</article-title>
<conf-name>Proceedings of the Annual Meeting of the Berkeley Linguistics Society</conf-name>
<year iso-8601-date="1978">1978</year>
<volume>4</volume>
<fpage>370</fpage>
<lpage>386</lpage>
<pub-id pub-id-type="doi">10.3765/bls.v4i0.2218</pub-id>
</element-citation>
</ref>
<ref id="B56">
<label>56</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Padgett</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Glides, vowels, and features</article-title>
<source>Lingua</source>
<year iso-8601-date="2008">2008</year>
<volume>118</volume>
<issue>12</issue>
<fpage>1937</fpage>
<lpage>1955</lpage>
<pub-id pub-id-type="doi">10.1016/j.lingua.2007.10.002</pub-id>
</element-citation>
</ref>
<ref id="B57">
<label>57</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Paradis</surname>
<given-names>C.</given-names>
</name>
</person-group>
<source>Lexical phonology and morphology: The nominal classes in Fula</source>
<year iso-8601-date="1992">1992</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>Garland Publishing Inc</publisher-name>
</element-citation>
</ref>
<ref id="B58">
<label>58</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pierrehumbert</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Phonological and phonetic representation</article-title>
<source>Journal of Phonetics</source>
<year iso-8601-date="1990">1990</year>
<volume>18</volume>
<issue>3</issue>
<fpage>375</fpage>
<lpage>394</lpage>
</element-citation>
</ref>
<ref id="B59">
<label>59</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pinheiro</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Bates</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>DebRoy</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sarkar</surname>
<given-names>D.</given-names>
</name>
<collab>R Core Team</collab>
</person-group>
<source>nlme: Linear and Nonlinear Mixed Effects Models</source>
<year iso-8601-date="2017">2017</year>
<comment>R package version 3.1-131</comment>
</element-citation>
</ref>
<ref id="B60">
<label>60</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<collab>R Core Team</collab>
</person-group>
<source>R: A Language and Environment for Statistical Computing</source>
<year iso-8601-date="2015">2015</year>
<publisher-loc>Vienna, Austria</publisher-loc>
<publisher-name>R Foundation for Statistical Computing</publisher-name>
</element-citation>
</ref>
<ref id="B61">
<label>61</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Ren</surname>
<given-names>H.</given-names>
</name>
</person-group>
<chapter-title>On the acoustic structure of diphthongal syllables</chapter-title>
<source>UCLA Working Papers in Phonetics</source>
<year iso-8601-date="1986">1986</year>
<publisher-name>UCLA</publisher-name>
<volume>65</volume>
</element-citation>
</ref>
<ref id="B62">
<label>62</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roca</surname>
<given-names>I.</given-names>
</name>
</person-group>
<article-title>There are no &#8216;glides&#8217;, at least in Spanish: An optimality account</article-title>
<source>Probus</source>
<year iso-8601-date="1997">1997</year>
<volume>9</volume>
<issue>3</issue>
<fpage>233</fpage>
<lpage>265</lpage>
<pub-id pub-id-type="doi">10.1515/prbs.1997.9.3.233</pub-id>
</element-citation>
</ref>
<ref id="B63">
<label>63</label>
<element-citation publication-type="thesis">
<person-group person-group-type="author">
<name>
<surname>Rosenthall</surname>
<given-names>S.</given-names>
</name>
</person-group>
<source>Vowel/glide alternation in a theory of constraint interaction (Doctoral dissertation)</source>
<year iso-8601-date="1994">1994</year>
<publisher-name>University of Massachusetts-Amherst</publisher-name>
</element-citation>
</ref>
<ref id="B64">
<label>64</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Smith</surname>
<given-names>J.</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Rennison</surname>
<given-names>J. R.</given-names>
</name>
<name>
<surname>P&#246;chtrager</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Neubarth</surname>
<given-names>F.</given-names>
</name>
</person-group>
<chapter-title>Onset sonority constraints and subsyllabic structure</chapter-title>
<source>Phonologica 2002</source>
<year iso-8601-date="2003">2003</year>
<publisher-loc>Berlin</publisher-loc>
<publisher-name>de Gruyter</publisher-name>
</element-citation>
</ref>
<ref id="B65">
<label>65</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Smith</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Phonological constraints are not directly phonetic</article-title>
<conf-name>Proceedings from the Annual Meeting of the Chicago Linguistic Society</conf-name>
<year iso-8601-date="2005">2005</year>
<conf-sponsor>Chicago Linguistic Society</conf-sponsor>
<volume>41</volume>
<fpage>457</fpage>
<lpage>471</lpage>
</element-citation>
</ref>
<ref id="B66">
<label>66</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Smith</surname>
<given-names>J.</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Vance</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>K.</given-names>
</name>
</person-group>
<article-title>Loan phonology is not all perception: Evidence from Japanese loan doublets</article-title>
<source>Japanese/Korean Linguistics</source>
<year iso-8601-date="2006">2006</year>
<volume>14</volume>
<fpage>63</fpage>
<lpage>74</lpage>
</element-citation>
</ref>
<ref id="B67">
<label>67</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Smith</surname>
<given-names>J.</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Parker</surname>
<given-names>S.</given-names>
</name>
</person-group>
<chapter-title>Source similarity in loanword adaptation: Correspondence Theory and the posited source-language representation</chapter-title>
<source>Phonological Argumentation: Essays on Evidence and Motivation</source>
<year iso-8601-date="2009">2009</year>
<publisher-loc>London</publisher-loc>
<publisher-name>Equinox</publisher-name>
<fpage>155</fpage>
<lpage>177</lpage>
</element-citation>
</ref>
<ref id="B68">
<label>68</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Steriade</surname>
<given-names>D.</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Brugman</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Macaulay</surname>
<given-names>M.</given-names>
</name>
</person-group>
<article-title>Glides and vowels in Romanian</article-title>
<conf-name>Proceedings of the 10th Annual Meeting of the Berkeley Linguistics Society</conf-name>
<year iso-8601-date="1984">1984</year>
<fpage>47</fpage>
<lpage>64</lpage>
<pub-id pub-id-type="doi">10.3765/bls.v10i0.1935</pub-id>
</element-citation>
</ref>
<ref id="B69">
<label>69</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Steriade</surname>
<given-names>D.</given-names>
</name>
</person-group>
<article-title>Review of Clements &amp; Keyser (1983)</article-title>
<source>Language</source>
<year iso-8601-date="1988">1988</year>
<volume>64</volume>
<fpage>118</fpage>
<lpage>129</lpage>
<pub-id pub-id-type="doi">10.2307/414790</pub-id>
</element-citation>
</ref>
<ref id="B70">
<label>70</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stone</surname>
<given-names>M.</given-names>
</name>
</person-group>
<article-title>A guide to analyzing tongue motion from ultrasound images</article-title>
<source>Clinical Linguistics and Phonetics</source>
<year iso-8601-date="2005">2005</year>
<volume>19</volume>
<fpage>455</fpage>
<lpage>502</lpage>
<pub-id pub-id-type="doi">10.1080/02699200500113558</pub-id>
</element-citation>
</ref>
<ref id="B71">
<label>71</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Straka</surname>
<given-names>G.</given-names>
</name>
</person-group>
<article-title>A propos de la question des semi-voyelles</article-title>
<source>STUF &#8211; Language Typology and Universals</source>
<year iso-8601-date="1964">1964</year>
<volume>17</volume>
<fpage>301</fpage>
<lpage>323</lpage>
<pub-id pub-id-type="doi">10.1524/stuf.1964.17.16.301</pub-id>
</element-citation>
</ref>
<ref id="B72">
<label>72</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Townsend</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Janda</surname>
<given-names>L.</given-names>
</name>
</person-group>
<source>Common and comparative Slavic: Phonology and inflection</source>
<year iso-8601-date="1996">1996</year>
<publisher-loc>Columbus, OH</publisher-loc>
<publisher-name>Slavica</publisher-name>
</element-citation>
</ref>
<ref id="B73">
<label>73</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Turner</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Tjaden</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Weismer</surname>
<given-names>G.</given-names>
</name>
</person-group>
<article-title>The influence of speaking rate on vowel space and speech intelligibility for individuals with amyotrophic lateral sclerosis</article-title>
<source>Journal of Speech, Language, and Hearing Research</source>
<year iso-8601-date="1995">1995</year>
<volume>38</volume>
<issue>5</issue>
<fpage>1001</fpage>
<lpage>1013</lpage>
<pub-id pub-id-type="doi">10.1044/jshr.3805.1001</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</article>