<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<!--<?xml-stylesheet type="text/xsl" href="article.xsl"?>-->
<article article-type="research-article" dtd-version="1.2" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id journal-id-type="issn">1868-6354</journal-id>
<journal-title-group>
<journal-title>Laboratory Phonology: Journal of the Association for Laboratory Phonology</journal-title>
</journal-title-group>
<issn pub-type="epub">1868-6354</issn>
<publisher>
<publisher-name>Open Library of Humanities</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.16995/labphon.8753</article-id>
<article-categories>
<subj-group>
<subject>Journal article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Vowel-initial glottalization as a prominence cue in speech perception and online processing</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Steffman</surname>
<given-names>Jeremy</given-names>
</name>
<email>jeremysteffman@gmail.com</email>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
</contrib-group>
<aff id="aff-1"><label>1</label>Department of Linguistics, Northwestern University, Evanston, IL, USA</aff>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2023-01-11">
<day>11</day>
<month>01</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>14</volume>
<issue>1</issue>
<fpage>1</fpage>
<lpage>46</lpage>
<permissions>
<copyright-statement>Copyright: &#x00A9; 2023 The Author(s)</copyright-statement>
<copyright-year>2023</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.journal-labphon.org/articles/10.16995/labphon.8753/"/>
<abstract>
<p>Three experiments examined the relevance of vowel-initial glottalization in the perception of vowel contrasts in American English, in light of the claimed prominence-marking function of glottalization in word-initial vowels. Experiment 1 showed that the presence of a preceding glottal stop leads listeners to re-calibrate their perception of a vowel contrast in line with the prominence-driven modulation of vowel formants. Experiment 2 manipulated cues to glottalization along a continuum and found that subtler cues generate the same effect, with bigger perceptual shifts as glottalization cues increase in strength. Experiment 3 examined the timecourse of this effect in a visual world eyetracking task, finding a rapid influence of glottalization which is simultaneous with the influence of formant cues in online processing. Results are discussed in terms of the importance of phonetically detailed prominence marking in speech perception, and implications for models of processing which consider segmental and prosodic information jointly.</p>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>1. Background</title>
<p>One important question in prosody research is the following: How do speakers make syllables and words prominent in speech, and how do listeners make use of this information? The answer to this question is complex, entailing a consideration of a language&#8217;s various cues to prominence, and the listener&#8217;s incorporation of prominence information in different domains of perception and processing.</p>
<p>In speech production, the literature has documented various ways in which speech articulations and acoustics are modulated by prosodic prominence, referred to here under the umbrella term of &#8220;prominence strengthening.&#8221; These effects generally help enhance a given segment&#8217;s perceptual salience, and/or enhance acoustic (or featural) properties relevant for the contrast system of a given language (e.g., <xref ref-type="bibr" rid="B5">Beckman, Edwards, &amp; Fletcher, 1992</xref>; <xref ref-type="bibr" rid="B12">Cho, 2005</xref>; <xref ref-type="bibr" rid="B17">Cole, Kim, Choi, &amp; Hasegawa-Johnson, 2007</xref>; <xref ref-type="bibr" rid="B20">de Jong, 1995</xref>; <xref ref-type="bibr" rid="B26">Garellek, 2014</xref>; <xref ref-type="bibr" rid="B43">Kim, Kim, &amp; Cho, 2018</xref>).</p>
<p>In comparison, relatively little work has been carried out examining the perceptual component of the above question. The present study thus addresses one part of this line of inquiry from the perspective of the listener. In three experiments, this study tests how glottalized voice quality and production of a glottal stop impact the perception of vowels in American English, in line with the hypothesized function of glottalization as prominence marking. The perception of /&#603;/ versus /&#230;/ is adopted as a test case. A visual-world eyetracking experiment further tests how the influence of glottalization plays out in online speech processing, and compares this data to that of a previous study (<xref ref-type="bibr" rid="B83">Steffman, 2021a</xref>), informing our understanding of how prominence cues are integrated as speech unfolds.</p>
<p>The introduction proceeds with a working definition of prosodic prominence (1.1), the role that vowel-initial glottalization has been shown to play in speech production (1.2), and finally the role of prosodic information in perception (1.3), motivating the test of vowel-initial glottalization as a prominence cue.</p>
<sec>
<title>1.1. Defining prominence</title>
<p>As suggested by continuing and recent reviews (<xref ref-type="bibr" rid="B3">Baumann &amp; Cangemi, 2020</xref>; <xref ref-type="bibr" rid="B47">Ladd &amp; Arvaniti, 2023</xref>), defining prominence is not an entirely straightforward enterprise. For the purpose of the present study, prominence is considered in two regards.</p>
<p>Firstly, following commonly used terminology from Jun (<xref ref-type="bibr" rid="B38">2005</xref>, <xref ref-type="bibr" rid="B39">2014</xref>), a language&#8217;s prosodic system can be described as having &#8220;head prominence&#8221; and/or &#8220;edge prominence.&#8221; In the former, the expression of prominence is linked to a prosodic head. Relevant to the present study, in American English this head is a metrically prominent syllable in a phrase. Metrically prominent syllables may be marked as phrasally prominent, and produced with a prominence-lending pitch movement (a pitch accent). This sort of prominence will henceforth be described as &#8220;phrasal prominence.&#8221; Of note, in languages which are described as lacking head prominence, the notion of a prosodic head is not relevant and intonational F0 events demarcate domain (phrase) edges; Ladd and Arvaniti (<xref ref-type="bibr" rid="B47">2023</xref>) raise the question of whether, in languages of this sort, the concept of prosodic prominence is a useful one at all.</p>
<p>Another definition of prominence can be made without reference to metrical structure, or the prosodic features of a particular language. This is a language-general notion of &#8220;standing out.&#8221; Two definitions are as follows:</p>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(1)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>&#8220;Prosodic prominence [is] the strength of a spoken word relative to the words surrounding it in the utterance&#8221; (<xref ref-type="bibr" rid="B18">Cole, Mo, &amp; Hasegawa-Johnson, 2010, p. 425</xref>).</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(2)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p>&#8220;Loosely defined, &#8216;perceptual prominence&#8217; refers to any aspect of speech that somehow &#8216;stands out&#8217; to the listener&#8221; (<xref ref-type="bibr" rid="B3">Baumann &amp; Cangemi, 2020, p. 20</xref>).</p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>The definition in (1) uses the concept of a word, though the same definition could also apply to sub-word units. Both this definition and the perceptual definition in (2) are evidently related to phrasal prominence: a phrasally prominent, pitch-accented syllable/word will be prominent in the sense of both (1) and (2).<xref ref-type="fn" rid="n1">1</xref> However, the definitions are broader in that other properties besides phrasal prominence may also impact the (perceived) &#8220;strength&#8221; of a word in relation to surrounding material (including, e.g., word frequency as in <xref ref-type="bibr" rid="B4">Baumann &amp; Winter, 2018</xref>). One relevant example of this is domain-initial strengthening (e.g., <xref ref-type="bibr" rid="B13">Cho &amp; McQueen, 2005</xref>; <xref ref-type="bibr" rid="B40">Keating, 2006</xref>; <xref ref-type="bibr" rid="B41">Keating, Cho, Fougeron, &amp; Hsu, 2004</xref>). Here the phonetic properties of segments are strengthened in phrase-initial positions, though not necessarily in an analogous fashion to strengthening under phrasal prominence (<xref ref-type="bibr" rid="B43">Kim, Kim, &amp; Cho, 2018</xref>). These strengthening effects can be seen as enhancing the acoustic/phonetic prominence of a given segment, if prominence is defined as in (1) and (2) above.</p>
<p>As will be described in Section 1.2, vowel-initial glottalization in American English can be related both to phrasal prominence, and to the more general definitions given in (1) and (2). On the one hand, it is probabilistically predicted by phrasal prominence: Phrasally prominent vowel-initial words are more likely to be preceded by glottalization, discussed below. On the other hand, vowel-initial glottalization is also predicted by phrasing, and can be seen as an instance of general acoustic/phonetic prominence strengthening for the following vowel. Both of these views of prominence are thus relevant when considering vowel-initial glottalization effects.</p>
</sec>
<sec>
<title>1.2. Vowel-initial glottalization in speech production</title>
<p>&#8220;Glottalization&#8221; is used here as a cover term to refer to the production of a sustained closure of the vocal folds, i.e., a glottal stop [&#660;], and localized voice quality changes that are associated with constriction of the vocal folds during voicing (<xref ref-type="bibr" rid="B25">Garellek, 2013</xref>; <xref ref-type="bibr" rid="B35">Huffman, 2005</xref>). The cover term is useful if we consider the latter of these to be an &#8220;incomplete&#8221; or lenited glottal stop realization, as is common in the literature (<xref ref-type="bibr" rid="B23">Dilley, Shattuck-Hufnagel, &amp; Ostendorf, 1996</xref>; <xref ref-type="bibr" rid="B69">Pierrehumbert &amp; Talkin, 1992</xref>).</p>
<p>Of the languages described in the UPSID database (<xref ref-type="bibr" rid="B49">Maddieson &amp; Precoda, 1989</xref>), about half use glottalization contrastively (often represented as /&#660;/). However, in many languages that do not use glottalization contrastively, it is well documented that glottalization is nevertheless pervasive in speech, for example in English, Dutch, and Spanish (<xref ref-type="bibr" rid="B23">Dilley et al., 1996</xref>; <xref ref-type="bibr" rid="B26">Garellek, 2014</xref>; <xref ref-type="bibr" rid="B37">Jongenburger &amp; van Heuven, 1991</xref>). An important task for speech research is thus accounting for the prevalence and distribution of glottalization in spoken language.</p>
<p>One clear predictor of glottalization in American English (among other languages) is prosodic organization, both related to prosodic boundaries and prosodic prominence, as noted above. Glottal stops are described as being &#8220;inserted&#8221; at the beginning of vowel-initial words in prosodically strong positions, where prosodically strong positions include the beginning of a prosodic phrase (<xref ref-type="bibr" rid="B23">Dilley et al., 1996</xref>; <xref ref-type="bibr" rid="B69">Pierrehumbert &amp; Talkin, 1992</xref>), and in words which bear phrasal prominence (<xref ref-type="bibr" rid="B23">Dilley et al., 1996</xref>; <xref ref-type="bibr" rid="B25">Garellek, 2013</xref>). Dilley et al. (<xref ref-type="bibr" rid="B23">1996</xref>) in particular show that phrase-medial, word-initial vowels in pitch-accented (phrasally prominent) syllables are glottalized at higher rates as compared to non-prominent equivalents. Notably, however, not all pitch-accented word-initial vowels are glottalized, and vowels in words which lack a pitch accent but do not have a reduced vowel are more likely to be glottalized than reduced vowels. Speakers also vary widely in their overall rate of glottalization and the extent to which prominence impacts their rate of glottalization. In this sense, glottalization in word-initial vowels is only probabilistically related to phrasal prominence marking, though with a clear tendency to co-occur with phrasal prominence. Redi and Shattuck-Hufnagel (<xref ref-type="bibr" rid="B73">2001</xref>) document similar patterns, and consistent inter-speaker variation, and state: &#8220;It is clear from these results and from earlier studies that phrase-level glottalization is not obligatory [&#8230;] glottalization may serve as a marker of &#8216;degree of finality&#8217; (when it occurs at phrase boundaries) or &#8216;degree of prominence&#8217; (when it occurs at pitch-accented syllables). 
Perceptual experiments will be necessary to evaluate the hypothesis that glottalization unrelated to segmental allophony is interpreted by listeners as evidence for a boundary or a prominence, and to determine whether it is interpreted along a continuum or as a contrastive binary feature&#8221; (p. 427). The present study addresses both of these perceptual questions.</p>
<p>Garellek (<xref ref-type="bibr" rid="B25">2013</xref>, <xref ref-type="bibr" rid="B26">2014</xref>) further suggests a functional motivation for vowel-initial glottalization in American English, using electroglottography (EGG) to examine voicing in vowel-initial words. Garellek (<xref ref-type="bibr" rid="B26">2014</xref>) found that phrase-initial vowels, particularly non-prominent vowels, were generally produced with less vocal fold contact during voicing, corresponding to breathy voicing (not glottalization). This suggests that, for Garellek&#8217;s data at least, glottalization does not have a systematic effect on non-prominent vowels phrase-initially and is more closely related to prominence marking (cf. <xref ref-type="bibr" rid="B23">Dilley et al., 1996</xref>). This effect also became larger at higher-level phrasal domains. Breathier phrase-initial voicing can be attributed to phrase-initial pitch reset, where falling pitch (immediately after reset) results in relaxation of the cricothyroid and thyroarytenoid muscles, and vocal fold abduction (<xref ref-type="bibr" rid="B56">Mendelsohn &amp; Zhang, 2011</xref>; <xref ref-type="bibr" rid="B96">Zhang, 2011</xref>). Breathier voicing generally leads to decreased intensity and weaker formant energy (<xref ref-type="bibr" rid="B27">Garellek &amp; Keating, 2011</xref>; <xref ref-type="bibr" rid="B30">Gordon &amp; Ladefoged, 2001</xref>), and Garellek (<xref ref-type="bibr" rid="B26">2014</xref>) accordingly proposes that phrase-initial glottalization, most evident in his data in prominent phrase-initial vowels, occurs as a countervailing influence which mitigates the effects of pitch-reset-induced breathiness on voice quality. 
Glottalization in prominent phrase-initial vowels &#8220;strengthens&#8221; these vowels, as described by Garellek, in the sense that it engenders more high frequency energy and overall intensity, and boosts frequency information that will be useful in vowel perception (<xref ref-type="bibr" rid="B46">Kreiman &amp; Sidtis, 2011</xref>; cf. <xref ref-type="bibr" rid="B25">Garellek, 2013</xref> who found a boost of harmonic energy between 1500&#8211;2500 Hz). Glottalization may also be functionally useful in prominence-marking in separating prominent vowel-initial words from surrounding material, and modulating the amplitude envelope in the vicinity of prominent vowels to make them stand out. Preceding silence from a glottal stop will likewise give a boost to listeners&#8217; auditory system at the onset of the vowel (<xref ref-type="bibr" rid="B21">Delgutte, 1980</xref>; <xref ref-type="bibr" rid="B22">Delgutte &amp; Kiang, 1984</xref>). This view of phrase-initial (and phrase-medial) glottalization implicates (acoustic/phonetic) prominence as a driving force behind it, in that vowels which are preceded by glottalization are enhanced (though this may be either at prosodic domain edges to mitigate phrase-initial breathiness, or at phrasally prominent prosodic heads). In this sense, glottalization in word-initial vowels in American English can be seen as an example of phonetic prominence strengthening, which is related probabilistically to phrasal prominence.</p>
<p>In addition to prosodic prominence, other factors have been shown to influence the rate and distribution of glottalization preceding a vowel in various languages. These include speech rate (<xref ref-type="bibr" rid="B71">Pompino-Marschall &amp; &#379;ygis, 2010</xref>; <xref ref-type="bibr" rid="B90">Umeda, 1978</xref>) and vowel height (<xref ref-type="bibr" rid="B31">Groves, Groves, Jacobs, et al., 1985</xref>; <xref ref-type="bibr" rid="B57">Michnowicz &amp; Kagan, 2016</xref>; <xref ref-type="bibr" rid="B71">Pompino-Marschall &amp; &#379;ygis, 2010</xref>; <xref ref-type="bibr" rid="B88">Thompson, Thompson, &amp; Efrat, 1974</xref>). As documented in German and Spanish, the relative openness of vowels in a vowel hiatus environment predicts the production of glottalization between them: Lower (more open) vowels are more likely to be preceded by a glottal stop (<xref ref-type="bibr" rid="B54">Mckinnon, 2018</xref>; <xref ref-type="bibr" rid="B71">Pompino-Marschall &amp; &#379;ygis, 2010</xref>). However, relevant to the present study, in American English this is not systematic. Umeda (<xref ref-type="bibr" rid="B90">1978</xref>) found no relationship between relative differences in vowel height and production of a glottal stop in a hiatus environment, and Garellek (<xref ref-type="bibr" rid="B25">2013</xref>) found that the rate of production of a glottal stop in a vowel-initial word was not related to vowel height. Given this, it appears that vowel-initial glottalization is not as well predicted by vowel height in American English as it is in, e.g., German. This point will be returned to in Section 3.3 in light of the results.</p>
</sec>
<sec>
<title>1.3. Prosody and prominence in perception</title>
<p>Given the aforementioned patterns attested in the speech production literature, we can now consider some ways in which prosodic information impacts speech perception, and how these prior findings relate to the objectives of the current study.</p>
<p>In some studies, prosodic information (e.g., an intonational tune) has been shown to exert a predictive, or anticipatory, influence on speech processing. For example, Weber, Grice, and Crocker (<xref ref-type="bibr" rid="B92">2006</xref>) found that German intonational tunes are used by listeners to disambiguate temporarily ambiguous sentences as S(ubject) V(erb) O(bject) or OVS, prior to critical case information which disambiguated the constituent order. Similar anticipatory effects of pitch accent type were shown by Ito and Speer (<xref ref-type="bibr" rid="B36">2008</xref>), whereby a prominent (L+H*) pitch accent was interpreted as conveying contrastive focus on one element in adjective-noun pairs, generating anticipatory looks to a referent (e.g., as participants decorated a Christmas tree: &#8220;Hang the blue ball, now hang the GREEN &#8230;&#8221; generates anticipatory looks to a green ball). Results such as these in Weber et al. (<xref ref-type="bibr" rid="B92">2006</xref>) and Ito and Speer (<xref ref-type="bibr" rid="B36">2008</xref>) (among others, e.g., <xref ref-type="bibr" rid="B19">Dahan, Tanenhaus, &amp; Chambers, 2002</xref>; <xref ref-type="bibr" rid="B65">Nakamura, Harris, &amp; Jun, 2022</xref>; <xref ref-type="bibr" rid="B78">Snedeker &amp; Trueswell, 2003</xref>) indicate that prosodic cues, especially intonational tunes, can be used to anticipate upcoming speech in terms of syntactic, discourse, and information structure.</p>
<p>Complementing this research, the role of prosodic features such as prominence in the perception of speech segments (and relatedly in pre-lexical and lexical processing) has been a recent topic of interest in the literature (<xref ref-type="bibr" rid="B44">Kim, Mitterer, &amp; Cho, 2018</xref>; <xref ref-type="bibr" rid="B55">McQueen &amp; Dilley, 2020</xref>; <xref ref-type="bibr" rid="B58">Mitterer, Cho, &amp; Kim, 2016</xref>; <xref ref-type="bibr" rid="B59">Mitterer, Kim, &amp; Cho, 2019</xref>). In comparison to the results described in the preceding paragraph, data in this line of research offers a different view of the way in which listeners use prosodic information in their perception of fine-grained phonetic detail, and their integration of prosody in perception of cues to segmental contrasts. As alluded to above, it is well documented in the speech production literature that prosodic organization modulates cues that are relevant in the perception of segmental contrasts (see e.g., <xref ref-type="bibr" rid="B40">Keating, 2006 for an overview</xref>). For example, voice onset time (VOT) in aspirated stops, an important cue for voicing contrasts, varies systematically as a function of prosodic factors. VOT is longer at the beginning of prosodic domains and in phrasally prominent positions (<xref ref-type="bibr" rid="B17">Cole et al., 2007</xref>; <xref ref-type="bibr" rid="B41">Keating et al., 2004</xref>; <xref ref-type="bibr" rid="B44">Kim, Mitterer, &amp; Cho, 2018</xref>). Another example of prosodically modulated cues to segmental contrasts, described in more detail in Section 2, is that of vowel formants. To the extent that phrasal prosody impacts segmental realization along these lines, the listener is hypothesized to benefit from integrating prosodic information in their perception of segmental cues (<xref ref-type="bibr" rid="B42">Kim &amp; Cho, 2013</xref>; <xref ref-type="bibr" rid="B58">Mitterer et al., 2016</xref>).</p>
<p>A model which has framed this line of inquiry and received empirical support is that of <italic>Prosodic Analysis</italic> (<xref ref-type="bibr" rid="B14">Cho, McQueen, &amp; Cox, 2007</xref>; <xref ref-type="bibr" rid="B55">McQueen &amp; Dilley, 2020</xref>). The model&#8217;s architecture stipulates simultaneous parses of segmental information and prosodic information from the speech signal, though the role of each of these in processing is different. Adopting an activation-competition view of word recognition, the model postulates that segmental information activates entries in the lexicon, while phrasal prosodic information is used to select among possible lexical candidates. In the original formulation of the model this entails the reconciliation of prosodic boundaries and word boundaries to determine lexical selection (cf. <xref ref-type="bibr" rid="B16">Christophe, Peperkamp, Pallier, Block, &amp; Mehler, 2004</xref>). Empirical support for the model comes from studies showing a delayed influence of prosodic boundary information in processing (<xref ref-type="bibr" rid="B44">Kim, Mitterer, &amp; Cho, 2018</xref>; <xref ref-type="bibr" rid="B59">Mitterer et al., 2019</xref>), consistent with a post-lexical influence in word recognition. This framing of the role of prosody in processing departs somewhat from the anticipatory effects described above, and this follows from the fact that prosodic characteristics are good predictors of sentence and discourse structure as in Weber et al. (<xref ref-type="bibr" rid="B92">2006</xref>) and Ito and Speer (<xref ref-type="bibr" rid="B36">2008</xref>); however, they are not good predictors of particular lexical items themselves (i.e., generally speaking, a given word can be produced with a range of prosodic expressions, phrase-medially, phrase-initially, and so on). 
In this sense, the Prosodic Analysis model (and existing data) suggests that prosodic information is not used to anticipate a given word, but is instead integrated with bottom-up cues in lexical processing with a relative delay, consistent with modulation of activated lexical hypotheses. In other words, if the listener&#8217;s task is to identify a lexical item (in the absence of other good predictive information), prosodic cues may be integrated in this process but not used to anticipate what word will be said before acoustic information about that word is perceived. What the Prosodic Analysis model and available data show more generally is the importance of considering both prosodic and segmental factors as being processed in parallel in speech recognition, with many outstanding questions (see <xref ref-type="bibr" rid="B55">McQueen &amp; Dilley, 2020</xref> for a recent overview).</p>
<p>With respect to glottalization specifically, recent perception and processing studies in Maltese, a language in which /&#660;/ is contrastive, suggest that listeners are sensitive to its prosodic patterning in the language (<xref ref-type="bibr" rid="B59">Mitterer et al., 2019</xref>; <xref ref-type="bibr" rid="B60">Mitterer, Kim, &amp; Cho, 2021a</xref>, <xref ref-type="bibr" rid="B61">2021b</xref>). In addition to marking a phonemic contrast in Maltese, glottalization can occur on vowel-initial words when they are at the beginning of a prosodic phrase, as a form of phrase-initial strengthening. Glottalization thus serves a sort of dual function: It is phonemic and conveys contrast, and also patterns based on prosodic organization. Mitterer et al. (<xref ref-type="bibr" rid="B59">2019</xref>) show that listeners are aware of this dual patterning. When a word is phrase-initial, the listener is more likely to attribute the presence of glottalization to prosody, thus inferring a phonemically vowel-initial word. In contrast, when glottalization precedes a vowel phrase-medially, the listener is more likely to infer that the word is phonemically/contrastively glottalized. Consistent with the prediction of the Prosodic Analysis model, these effects were delayed in time, as assessed in a visual world eyetracking study. Mitterer et al. (<xref ref-type="bibr" rid="B60">2021a</xref>) show that glottal stops differ from other stops (e.g., /t/) in that they do not strongly constrain lexical access, suggesting that listeners&#8217; interpretation of glottalization is intimately linked to prosodic features in a way that differs from other stops. Mitterer et al. 
(<xref ref-type="bibr" rid="B61">2021b</xref>) further show that glottalization is clearly interpreted as a prosodic feature in that it impacts syntactic parsing decisions in the resolution of attachment ambiguity: The presence of word-initial glottalization leads listeners to posit a preceding prosodic boundary, and thus the presence of a syntactic boundary. Together, these results suggest that vowel-initial glottalization can be treated as a prosodic cue in perception by listeners, even when glottalization is contrastive.</p>
<p>Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) offers another relevant comparison for the present study. Steffman examined the influence of prosodic prominence on listeners&#8217; perception of vowel contrasts, as cued by the intonational tune and durational patterns of a phrase. Vowels are strengthened phonetically by formant modulations described in Section 2. Steffman thus tested how phrase-level prominence impacted the perception of vowel formants, and further examined the timecourse of its influence. As noted above, in American English, the expression of prominence is related to the placement of pitch accents in a phrase, which are linked to metrically prominent syllables and (in the autosegmental-metrical model of American English intonation, e.g., <xref ref-type="bibr" rid="B67">Pierrehumbert, 1980</xref>) are implemented as F0 targets in an intonation contour. Steffman manipulated F0, duration and intensity in a phrase to shift perceived pitch accentuation, and the perceived prominence of a target word, in the stimuli. In one condition, the target word (which was categorized by listeners) was relatively prominent, interpretable as having an (H*) pitch accent in the phrase &#8220;I&#8217;ll say [TARGET] now&#8221; (where [TARGET] indicates the target word; this could be uttered in a broad focus context). In the other condition, the target word was preceded by focus on the verb &#8220;say&#8221;: &#8220;I&#8217;ll SAY [target] now,&#8221; where &#8220;say&#8221; bore a prominent L+H* pitch accent (this could be uttered in a contrastive focus context, e.g., A: &#8220;Will you write [target] now?&#8221;, B: &#8220;I&#8217;ll SAY [target] now&#8221;). In this condition the target is post-focus and non-prominent (more details on the stimuli in <xref ref-type="bibr" rid="B83">Steffman, 2021a</xref> are given in Section 4.5.2, which compares that data to the results of this study). 
This prominence manipulation is one of phrasal/global prominence cues, and was found to impact listeners&#8217; perception of the target vowel in line with the patterns which will be described in Section 2.</p>
<p>Using eyetracking data, Steffman additionally found that, in contrast to the strictly delayed influence of prosodic boundaries documented in previous studies (<xref ref-type="bibr" rid="B43">Kim, Kim, &amp; Cho, 2018</xref>; <xref ref-type="bibr" rid="B59">Mitterer et al., 2019</xref>), phrasal prominence showed subtle earlier influences in vowel perception, though these effects were quite small, and strengthened over time to be more robust later in processing. The presence of the earlier effect was discussed in Steffman (<xref ref-type="bibr" rid="B82">2020</xref>, 2021a) as reflecting prominence processing at multiple stages, described in terms of the Multistage Assessment of Prominence in Processing (MAPP) model. This model proposes that prosodic information need not show a strictly delayed (post-lexical) influence in processing as in the Prosodic Analysis model. Instead, early effects reflect &#8220;phonetic prominence&#8221;: The relative acoustic/phonetic salience of a word (signaled by whatever cues lend prominence in this sense). The fact that the effect was strongest later in time was interpreted as the result of a more abstract/phonological prominence percept (e.g., the presence or absence of pitch-accentuation), which is reconciled with lexical candidates, under the hypothesis that the lexicon contains information about prosodically conditioned pronunciation variants along the lines of Brand and Ernestus (<xref ref-type="bibr" rid="B8">2018</xref>), Mitterer et al. (<xref ref-type="bibr" rid="B60">2021a</xref>), and Pitt (<xref ref-type="bibr" rid="B70">2009</xref>). Notably, this multi-stage effect was generated from stimuli that varied both in terms of phonological prominence (pitch accent structure within the phrase), and necessarily, the relative phonetic prominence of the target word. 
One prediction from the MAPP model is thus that cues which convey only &#8220;phonetic prominence,&#8221; i.e., vowel initial glottalization, without varying a more global prominence in terms of pitch accent structure etc., should show a clear early effect, and a different online processing pattern than the effect in Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>). The present data thus address this prediction from the model directly as a test of how different cues to prominence may be processed differently.</p>
</sec>
</sec>
<sec>
<title>2. The present study</title>
<p>Given these recent studies on the role of prominence in vowel perception and the processing of vowel-initial glottalization, the present experiments test whether prominence cued by glottalization should be considered a mediating factor in vowel perception in American English, a language where glottalization is not contrastive. To the extent that vowel-initial glottalization is a relevant prominence cue, we can examine the timecourse of its influence in relation to the general prediction from the Prosodic Analysis model that prosody shows a delayed influence in processing, and compare these data to those in Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>).</p>
<p>Relevant to the present study, the literature documents a variety of ways in which vowel articulations may be modulated under prominence. Prosodic prominence is typically considered here in terms of phrase-level prominence marking: The presence or absence of a pitch accent. A well-documented pattern of prominence strengthening in vowels has been termed <italic>sonority expansion</italic>, where sonority is defined as &#8220;the overall openness of the vocal tract or the impedance looking forward from the glottis&#8221; (<xref ref-type="bibr" rid="B77">Silverman &amp; Pierrehumbert, 1990, p. 75</xref>). In this sense, a more sonorous vowel articulation is one which is produced with increased amplitude of jaw movement and other articulatory adjustments that allow more energy to radiate from the mouth. Sonority-expanding gestures make a vowel articulation more acoustically prominent (louder, longer etc.), and have been described as enhancing its &#8220;sonority features&#8221; (<xref ref-type="bibr" rid="B20">de Jong, 1995</xref>). Other effects, not consistent with sonority expansion, have also been documented in the literature, for example, the production of more extreme high vowel articulations (as with /i/), which are not more open but instead reflect hyperarticulation of the vowel target under prominence (<xref ref-type="bibr" rid="B12">Cho, 2005</xref>; <xref ref-type="bibr" rid="B20">de Jong, 1995</xref>; <xref ref-type="bibr" rid="B24">Erickson, 2002</xref>). In this sense, patterns of prominence strengthening are dependent on the vowels under consideration, and the system of contrasts in the language (e.g., <xref ref-type="bibr" rid="B12">Cho, 2005</xref>; <xref ref-type="bibr" rid="B28">Garellek &amp; White, 2015</xref>), and listeners&#8217; perception of vowels is likewise a function of prominence (<xref ref-type="bibr" rid="B82">Steffman, 2020</xref>).</p>
<p>Vowels which <italic>do</italic> undergo sonority expansion are realized as acoustically lower and backer in the vowel space, with higher F1 and lower F2 (<xref ref-type="bibr" rid="B12">Cho, 2005</xref>), and listeners&#8217; perception of prominence in a prominence transcription task reflects this formant variation as well (<xref ref-type="bibr" rid="B63">Mo, Cole, &amp; Hasegawa-Johnson, 2009</xref>). This pattern will form the basis of the test case adopted in the present study as we ask if listeners expect a more prominent variant of a vowel (specifically with higher F1 and lower F2) to be realized in a prominent context.</p>
<p>The questions raised in Section 1 are addressed by testing if a glottal stop modulates vowel perception in line with sonority expansion effects on vowel formants (Experiment 1), using the contrast between /&#603;/ and /&#230;/ as a test case (vowels which undergo sonority expansion). This study further tests if fine-grained glottalization cues that do not entail a sustained stop generate the same effect (Experiment 2), and if glottalization mediates online processing of vowel information in the ways predicted by the current model of Prosodic Analysis (Experiment 3). For this purpose two methods are used: A two-alternative forced choice task, and a visual world eyetracking task. In both, listeners categorized stimuli on an /&#603;/-/&#230;/ continuum with various contextual manipulations of glottalization. The stimuli used in the present experiments, the data for each experiment, and the scripts used to analyze the data are included in full in the open-access repository for the paper hosted via the Open Science Framework at <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://osf.io/v4cdz/">https://osf.io/v4cdz/</ext-link>.</p>
<sec>
<title>2.1. Predictions</title>
<p>In order to help explain the creation of the stimuli, let us first consider several empirical predictions. If a vowel preceded by glottalization is perceived as prominent, a more prominent acoustic realization of that vowel may be expected by listeners. In this case, it would mean a lower and backer realization of the vowel (with higher F1 and lower F2), with a prominent /&#603;/ essentially becoming acoustically more like /&#230;/. The corresponding perceptual response would thus be a shift in categorization of the F1/F2 continuum, with more sonorous (lower, backer) F1/F2 values categorized as /&#603;/ in a prominent context (when preceded by glottalization), as compared to a non-prominent one. Empirically, this predicts increased /&#603;/ responses under prominence. Such an effect would constitute perceptual re-calibration for a prominent vowel realization. It is worth noting here that Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) found this effect with the same contrast, when prominence was cued by global/phrasal context as described above.</p>
</sec>
<sec>
<title>2.2. Materials</title>
<p>The materials used in all experiments reported here were created by re-synthesizing the speech of a male American English speaker. The speech used in making the stimuli was recorded in a sound-attenuated booth in the UCLA Phonetics Lab, using an SM10A Shure<sup>TM</sup> microphone and headset. Recordings were digitized at 32 bit with a 44.1 kHz sampling rate.</p>
<sec>
<title>2.2.1. A full glottal stop: Experiments 1 and 3</title>
<p>The method for creating the stimuli was to design a continuum that varied in F1 and F2, ranging between two vowels, and manipulate the presence or absence of preceding glottalization. The two words used as endpoints of the continuum were &#8220;ebb&#8221; /&#603;/, and &#8220;ab&#8221; /&#230;/. F1 and F2 were manipulated by LPC decomposition and resynthesis using the Burg method (<xref ref-type="bibr" rid="B93">Winn, 2016</xref>) in Praat (<xref ref-type="bibr" rid="B6">Boersma &amp; Weenink, 2020</xref>). The formant values for each endpoint were based on model sound productions of &#8220;ebb&#8221; and &#8220;ab,&#8221; with measures across the entire vowel (i.e., time-series measurements that included the dynamics of F1 and F2). The resynthesis process estimated the source and filter for the starting model sound from the &#8220;ebb&#8221; model. The filter model&#8217;s F1 and F2 were then adjusted to match those of a model &#8220;ab&#8221; production. From these two filter models, eight intermediate filter steps were created by interpolating between these model endpoint values in Bark space (<xref ref-type="bibr" rid="B89">Traunm&#252;ller, 1990</xref>). Phase-locked higher frequencies from the starting base /&#603;/ model were restored to all continuum steps, improving the naturalness of the continuum. The result was a 10-step continuum ranging from /&#603;/ to /&#230;/ values in F1 and F2. Intensity and pitch were invariant across the continuum.</p>
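<p>The Bark-space interpolation between the two endpoint filter models can be illustrated with a minimal sketch. It is not the resynthesis script used for the stimuli: it interpolates single static formant values where the actual continuum was built from time-series formant tracks, the endpoint values in Hz are hypothetical placeholders, and the function names are introduced here for illustration only. The Hz-to-Bark conversion follows the 1990 formula cited above.</p>
<preformat>
```python
def hz_to_bark(f_hz):
    """Hz-to-Bark conversion (the 1990 formula cited in the text)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z_bark):
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (z_bark + 0.53) / (26.28 - z_bark)


def bark_continuum(start_hz, end_hz, n_steps=10):
    """Return n_steps formant values equally spaced in Bark, endpoints included."""
    z0, z1 = hz_to_bark(start_hz), hz_to_bark(end_hz)
    return [bark_to_hz(z0 + (z1 - z0) * i / (n_steps - 1)) for i in range(n_steps)]


# Hypothetical static endpoint values in Hz (NOT the values used in the study).
f1_steps = bark_continuum(580.0, 700.0)    # F1 rises from step 1 to step 10
f2_steps = bark_continuum(1800.0, 1650.0)  # F2 falls from step 1 to step 10
for step, (f1, f2) in enumerate(zip(f1_steps, f2_steps), start=1):
    print(step, round(f1), round(f2))
```
</preformat>
<p>With two endpoints and eight interpolated values, the function returns the ten steps of the continuum; equal spacing in Bark means that the steps are not equally spaced in Hz.</p>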
<p>The starting point for stimulus creation was a production of the sentence &#8220;say the ebb now,&#8221; with &#8220;the&#8221; produced as [&#240;&#477;], which was how the model speaker produced it without explicit instruction (as compared to the alternative pronunciation [&#240;i]).<xref ref-type="fn" rid="n2">2</xref> The sentence was produced with an H* pitch accent on the word &#8220;ebb,&#8221; such that the word with the target vowel bore the final (nuclear) pitch accent in the phrase (this was systematic in the model speaker&#8217;s other productions of the sentence, including those which were not used in stimulus creation, and was a natural way for them to produce the sentence).</p>
<p>The file from which the continuum was created was one produced without a glottal stop preceding the target word. The model speaker (a trained phonetician) reported that it was most natural for them to produce a glottal stop between the two vowels, though renditions with and without a glottal stop were both easy to produce. The speaker was prompted to record multiple productions of both target words with and without a preceding glottal stop. The base files for stimulus creation were selected as those which had the clearest production of the target vowels, sounded natural in terms of tempo etc., and which were either very clearly produced with, or without, a glottal stop. The creation of the continuum only altered F1 and F2 in the target word as described above, creating a [&#240;&#477;&#603;b] to [&#240;&#477;&#230;b] continuum, with continuous formant transitions from the precursor vowel to the target (as there was no intervening glottal stop). Formant tracks for the 10-step continuum and the preceding vowel are shown in <xref ref-type="fig" rid="F1">Figure 1</xref> panel B. This constitutes what will be referred to as the &#8220;no glottal stop condition,&#8221; where no glottal stop preceded the target sound in the vowel hiatus environment. The formants in the precursor vowel [&#477;] were also slightly lowered and backed in the vowel space (F1 raised, F2 lowered) so that these manipulations did not introduce a confound related to spectral contrast effects.<xref ref-type="fn" rid="n3">3</xref> This manipulation made the precursor vowel sound slightly lower than a canonical [&#477;], though it was clearly intelligible and judged to sound natural.</p>
<fig id="F1">
<label>Figure 1</label>
<caption>
<p>Visualizations of the stimuli used in all Experiments. Panel A: Waveforms and spectrograms showing the glottal stop manipulation (y axis 0&#8211;4000 Hz and ticks on the x axis are placed at 100 ms intervals, in this example the target vowel is at step 10, the most /&#230;/-like). The intensity profile is additionally overlaid on the spectrograms as a dashed line. Panel B: Formant tracks showing the 10-step continuum created from the VV sequence (the target and the preceding vowel). Panel C: Waveforms showing the four steps of the glottalization continuum from Experiment 2, with just the target vowel and preceding vowel shown. The two vertical lines show the beginning and end of [&#477;] respectively (the rightmost line being the same as the vertical line in Panel B).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-14-8753-g1.png"/>
</fig>
<p>The method for creating the &#8220;glottal stop condition&#8221; was to cross-splice [&#660;] from a different production of the carrier phrase in which it preceded the target. The portion of the glottal stop that was inserted was the silent closure (approximately 100 ms in duration), and the short aperiodic burst that accompanied the release of the stop (approximately 15 ms). The stop duration was based on several repetitions from the model speaker (in a careful speech style), and was judged to sound appropriate for the speech rate of the stimuli. This duration is fairly long, though not outside of the norm: Byrd (<xref ref-type="bibr" rid="B11">1993</xref>) describes the durational characteristics of glottal stops in the TIMIT database of American English and finds a mean duration of 76 ms for glottal stop closures between two vowels, with 100 ms falling within one standard deviation of that mean (cf. <xref ref-type="bibr" rid="B32">Henton, Ladefoged, &amp; Maddieson, 1992</xref>).<xref ref-type="fn" rid="n4">4</xref></p>
<p>The production from which [&#660;] was cross-spliced was [&#240;&#477;&#660;&#230;b]. In the case that any information about the following vowel is contained in the release of the stop (though none was perceived), it would bias listeners towards /&#230;/ when a glottal stop precedes the target, which is the opposite of the predicted prominence effect described in Section 2.1. The point at which the glottal stop was inserted was where formant trajectories began to shift to the target vowel, indicated by the dashed vertical line in <xref ref-type="fig" rid="F1">Figure 1</xref> panel B. The insertion of [&#660;] resulted in a sudden end to the vowel in the precursor. To render the precursor more natural, several periods from [&#477;] in the production of [&#240;&#477;&#660;&#230;b] were cross-spliced and appended to the precursor vowel at a zero crossing in the waveform. This cross-spliced material replaced the six pitch periods that immediately preceded formant variation along the continuum in the no glottal stop condition (with approximately 60 ms of voicing replaced). The cross-spliced material introduced a dip in amplitude and irregular voicing going into the glottal stop, which was judged to improve the naturalness of the stimuli substantially. This modified precursor vowel and following [&#660;] were cross-spliced to precede all steps on the continuum, resulting in a [&#240;&#477;&#660;&#603;b] to [&#240;&#477;&#660;&#230;b] continuum, one endpoint of which is shown in <xref ref-type="fig" rid="F1">Figure 1</xref> panel A. Note that the appended periods were identical for all stimuli, as the precursor did not vary across the formant continuum. All stimuli underwent formant resynthesis; however, the glottal stop condition was created by cross-splicing, while there was no cross-splicing manipulation in the no glottal stop condition. This was done in order to keep the continuum acoustically identical across conditions. 
As a consequence the glottal stop condition is, in a sense, less natural than the no glottal stop condition, though the manipulation was found to sound very similar to naturally produced glottal stops (produced by the speaker in recording for the stimuli). The sudden onset of the target vowel in the glottal stop condition was additionally found to match the acoustic profile of these naturally produced stops and thus deemed to be an adequate manipulation of glottalization cues.</p>
</sec>
<sec>
<title>2.2.2. A glottalization continuum: Experiment 2</title>
<p>As is well documented in the speech production literature, and noted above, the way in which glottalization is realized phonetically is notoriously variable, and needn&#8217;t entail the production of a sustained stop at the glottis (<xref ref-type="bibr" rid="B23">Dilley et al., 1996</xref>; <xref ref-type="bibr" rid="B25">Garellek, 2013</xref>; <xref ref-type="bibr" rid="B73">Redi &amp; Shattuck-Hufnagel, 2001</xref>). As such, an important question is if different realizations of a glottal stop produce similar perceptual effects. Various studies have shown that glottalization may be cued perceptually by a decrease in pitch and intensity (<xref ref-type="bibr" rid="B29">Gerfen &amp; Baker, 2005</xref>; <xref ref-type="bibr" rid="B68">Pierrehumbert &amp; Frisch, 1997</xref>). Accordingly, Experiment 2 was designed to create a continuum that varied in glottalization strength. Step 1 in the glottalization continuum in Experiment 2 was the same as the &#8220;no glottal stop condition&#8221; in Experiment 1. Three additional glottalization conditions were created (labeled step 2&#8211;4 in <xref ref-type="fig" rid="F1">Figure 1</xref> panel C). In each, pitch and intensity cues were varied to signal an increase in the strength of glottalization between the pre-target and target vowels. The endpoint of the continuum is not a complete stop (unlike the glottal stop condition in Experiment 1).</p>
<p>This manipulation was implemented by decreasing the f0 and intensity at the juncture of the two vowels, indicated by the dashed vertical line in <xref ref-type="fig" rid="F1">Figure 1</xref> panel B. The seven f0 periods at and surrounding this point were manipulated. Intensity was decreased by 2 dB per glottalization continuum step for these seven periods, which were then cross-spliced into the original unmodified production at zero crossings in the waveform. The pitch manipulation, which was implemented with the PSOLA method in Praat (<xref ref-type="bibr" rid="B64">Moulines &amp; Charpentier, 1990</xref>), took the f0 period at the juncture and decreased it linearly by 25 Hz at each step. An original f0 of approximately 115 Hz at Step 1 thus became 90, 65, and 40 Hz at Steps 2, 3 and 4 respectively. f0 was interpolated linearly from this low point across the surrounding three periods on either side to the f0 values surrounding them. The result was a four-step continuum in strength of glottalization, shown in <xref ref-type="fig" rid="F1">Figure 1</xref> panel C.</p>
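<p>The target values implied by this description can be worked through in a short sketch. It only computes numbers and does not perform any PSOLA resynthesis. Two assumptions are made for illustration: the intensity decrease is treated as cumulative relative to Step 1 (0, 2, 4 and 6 dB), and f0 on both sides of the seven manipulated periods is set to the same flanking value. The function and constant names are hypothetical.</p>
<preformat>
```python
BASE_F0_HZ = 115.0        # approximate f0 at the juncture in Step 1 (from the text)
F0_DROP_PER_STEP = 25.0   # Hz lowered per glottalization step (from the text)
DB_DROP_PER_STEP = 2.0    # dB lowered per glottalization step (from the text)


def juncture_targets(step):
    """Return (f0 in Hz, intensity change in dB) at the juncture for steps 1-4."""
    k = step - 1
    return BASE_F0_HZ - F0_DROP_PER_STEP * k, -DB_DROP_PER_STEP * k


def f0_contour(step, flank_hz=115.0, n_flank=3):
    """Seven-period f0 contour: linear ramps from the flanking f0 to the low point.

    Assumption for illustration: f0 just outside the manipulated region is
    flank_hz on both sides.
    """
    low, _ = juncture_targets(step)
    ramp = [flank_hz + (low - flank_hz) * i / (n_flank + 1)
            for i in range(1, n_flank + 1)]
    return ramp + [low] + ramp[::-1]


for step in range(1, 5):
    f0, db = juncture_targets(step)
    print(step, f0, db, [round(v, 2) for v in f0_contour(step)])
```
</preformat>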
<p>Experiment 2 used a subset of the formant continuum steps from Experiment 1, as it was observed that listeners in Experiment 1 were essentially at ceiling in their categorization responses for steps 1&#8211;3. For this reason only steps 3&#8211;10 from Experiment 1 were used.</p>
</sec>
</sec>
</sec>
<sec>
<title>3. Experiments 1 and 2</title>
<p>Experiments 1 and 2 are described and presented together here, given their similarity. In addition to the general prediction of increased /&#603;/ responses under prominence, in Experiment 2 we can further predict that increasing strength of glottalization should entail increasing strength of this effect, where we see additive shifts in categorization from Steps 1&#8211;4 along the glottalization continuum.</p>
<sec>
<title>3.1. Participants and procedure</title>
<sec>
<title>3.1.1. Experiment 1</title>
<p>Thirty participants were recruited for Experiment 1. All participants were self-reported native American English speakers with normal hearing, and were recruited from the student population at the University of California, Los Angeles. Each participant completed a language background questionnaire and provided informed consent to participate. Participants received course credit for their participation. The online platform that was used to control stimulus presentation was Appsobabble (<xref ref-type="bibr" rid="B87">Tehrani, 2020</xref>).</p>
<p>The procedure was a simple two-alternative forced choice (2AFC) task in which participants heard a stimulus and categorized it as one of two words, &#8220;ebb&#8221; or &#8220;ab.&#8221; Participants completed testing seated in front of a desktop computer monitor, in a sound-attenuated room in the UCLA Phonetics Lab. Stimuli were presented binaurally via a PELTOR<sup>TM</sup> 3M<sup>TM</sup> listen-only headset. The target words were represented orthographically, each target word centered in each half of the monitor. The side of the screen on which the target words appeared was counterbalanced across participants, such that for half of the participants &#8220;ebb&#8221; was on the left, and for the other half &#8220;ebb&#8221; was on the right.</p>
<p>Participants were instructed that their task was to identify which word they heard by key press, where a &#8220;j&#8221; key press indicated the word on the right side of the screen, and an &#8220;f&#8221; key press indicated the word on the left. Prior to the test trials, participants completed four training trials. In these trials, the continuum endpoints were presented once in each glottalization condition. In the subsequent test trials, each unique stimulus was presented 10 times, in random order, for a total of 200 test trials during the experiment (20 unique stimuli &#215; 10 repetitions). Halfway through the test trials, participants were prompted to take a short self-paced break. The experiment took approximately 15&#8211;20 minutes to complete in total.</p>
</sec>
<sec>
<title>3.1.2. Experiment 2</title>
<p>Thirty-four participants, none of whom had taken part in Experiment 1, were recruited from the same population for Experiment 2. Data collection and recruitment took place remotely due to COVID-19. Participants were asked to complete the experiment in a quiet location while using headphones. There were a total of 32 unique stimuli used in the experiment (8 formant continuum steps &#215; 4 glottalization continuum steps), each of which was repeated 7 times for a total of 224 trials in the experiment. The four training trials in Experiment 2 presented the endpoints of the glottalization continuum (step 1 and step 4), with the endpoints of the formant continuum, such that listeners heard the endpoints of both continua. The experimental procedure was otherwise the same as in Experiment 1.</p>
</sec>
</sec>
<sec>
<title>3.2. Analysis</title>
<p>The analysis of categorization data in all experiments reported here was carried out using a Bayesian logistic mixed-effects regression model, implemented with the R package <italic>brms</italic> (<xref ref-type="bibr" rid="B10">B&#252;rkner, 2017</xref>). The models were run using R version 4.1.2 (<xref ref-type="bibr" rid="B72">R Core Team, 2021</xref>) in the RStudio environment (<xref ref-type="bibr" rid="B76">RStudio Team, 2021</xref>). Weakly informative normally distributed priors were employed for both the intercept and fixed effects (mean = 0, standard deviation = 1.5 in log-odds space).<xref ref-type="fn" rid="n5">5</xref>,<xref ref-type="fn" rid="n6">6</xref></p>
<p>In reporting effects two measures are given, both characterizing the estimated posterior distribution for a given fixed effect. First we report the estimate and 95% credible intervals (CrI) for an estimate. This gives the effect size (in log-odds), and characterizes the distribution/certainty around the estimate. When 95% credible intervals exclude 0, this suggests a consistently estimated directionality, and accordingly a robust influence. In comparison, 95% credible intervals which <italic>include</italic> 0 would indicate substantial variability in the estimated direction of an effect, and therefore a non-reliable impact on categorization. An additional metric is reported: The &#8220;probability of direction,&#8221; henceforth pd, computed with the <italic>bayestestR</italic> package (<xref ref-type="bibr" rid="B50">Makowski, Ben-Shachar, &amp; L&#252;decke, 2019</xref>). This metric is conceptually similar to reporting CrI, but is useful in that it corresponds more intuitively to a frequentist model&#8217;s p-value: The pd indexes the percentage of a posterior distribution which shows a given sign, with values ranging between 50 and 100 percent. A posterior centered precisely on zero (i.e., no evidence for an effect), will have a pd of 50, while a posterior with a skewed negative or positive distribution will have a pd that approaches 100. Convincing evidence for an effect would come from pd values that are greater than 97.5 (the pd value that corresponds to 95% CrI excluding zero; a pd value of 100 would indicate that all of the distribution for an estimate excludes the value of zero, which would be very strong evidence for an effect). Tables showing all fixed effects estimates for each model are included in the Appendix.</p>
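<p>The logic of the pd can be made concrete with a minimal sketch that operates on a list of posterior draws. It is a plain reimplementation of the definition given above, not the <italic>bayestestR</italic> code, and the draws are simulated stand-ins rather than estimates from the models reported here. The interval function uses simple equal-tailed quantiles, which is one common choice and an assumption of this sketch.</p>
<preformat>
```python
import random


def probability_of_direction(draws):
    """Percentage of posterior draws sharing the majority sign (50 to 100)."""
    n_pos = sum(1 for d in draws if d > 0)
    n_neg = sum(1 for d in draws if 0 > d)
    return 100.0 * max(n_pos, n_neg) / len(draws)


def credible_interval(draws, mass=0.95):
    """Equal-tailed interval from sorted draws (simple index-based quantiles)."""
    s = sorted(draws)
    lo = s[int(round((1 - mass) / 2 * (len(s) - 1)))]
    hi = s[int(round((1 + mass) / 2 * (len(s) - 1)))]
    return lo, hi


rng = random.Random(1)
# Simulated draws standing in for a fitted posterior; NOT the study's estimates.
null_like = [rng.gauss(0.0, 1.0) for _ in range(4000)]
clear_effect = [rng.gauss(1.7, 0.2) for _ in range(4000)]

print(probability_of_direction(null_like))     # close to 50: no consistent sign
print(probability_of_direction(clear_effect))  # at or near 100: consistent sign
print(credible_interval(clear_effect))         # interval that excludes zero
```
</preformat>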
<p>Models were coded to predict categorization responses, with an /&#603;/ response mapped to 1, and an /&#230;/ response mapped to 0. The formant continuum was coded as a continuous variable, and scaled and centered. In Experiment 1, glottalization was contrast coded with the presence of a glottal stop mapped to 0.5, and the absence of a glottal stop mapped to &#8211;0.5. Categorization responses were predicted as a function of continuum step, glottalization, and the interaction of these two fixed effects. In Experiment 2, the glottalization continuum was treated as a continuous variable, and was scaled and centered. Categorization responses were predicted as a function of glottalization continuum, formant continuum, and their interaction. As a control variable, stimulus repetition was also included as a fixed effect, referring to the repetition of a given unique stimulus over the course of the experiment, to account for the possibility of listener bias in categorizing repeated stimuli. This variable was centered and scaled, and ranged from 1&#8211;10 in Experiment 1, 1&#8211;7 in Experiment 2 and 1&#8211;8 in Experiment 3 (due to the different number of repetitions of each unique stimulus). Random effects in each model included random intercepts for participant and random slopes for all fixed effects and the interaction between glottalization and continuum step.</p>
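<p>The predictor coding for Experiment 1 can be sketched as follows. This is an illustration of the coding scheme only, not the analysis script from the repository; it assumes that scaling divides by the sample standard deviation (as the R function <italic>scale</italic> does by default), and the dictionary keys and function names are hypothetical.</p>
<preformat>
```python
from statistics import mean, stdev


def center_and_scale(values):
    """z-score using the sample standard deviation (n - 1 in the denominator)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def contrast_code(has_glottal_stop):
    """Glottal stop present maps to 0.5, absent maps to -0.5."""
    return 0.5 if has_glottal_stop else -0.5


# One repetition of the Experiment 1 design: 10 formant steps x 2 conditions.
design = [(step, stop) for step in range(1, 11) for stop in (False, True)]
scaled_steps = center_and_scale([step for step, _ in design])
rows = [
    {
        "step_scaled": z,
        "glottal": contrast_code(stop),
        "interaction": z * contrast_code(stop),
    }
    for z, (_, stop) in zip(scaled_steps, design)
]
print(rows[0])
print(rows[-1])
```
</preformat>
<p>Because glottalization is coded as &#177;0.5 and the step variable is centered, the step coefficient describes the slope averaged over the two conditions, and the glottalization coefficient describes the full difference between conditions at the continuum midpoint.</p>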
</sec>
<sec>
<title>3.3. Results and discussion</title>
<p>The results of Experiments 1 and 2 are shown together in <xref ref-type="fig" rid="F2">Figure 2</xref>. In both Experiment 1 and Experiment 2 changing formant values along the continuum shifted categorization in the expected way; increasing (scaled) step values along the continuum decreased the log-odds of an /&#603;/ response (Experiment 1: &#946; = &#8211;3.42, 95%CrI = [&#8211;3.76, &#8211;3.10]; pd = 100; Experiment 2: &#946; = &#8211;3.06, 95%CrI = [&#8211;3.41, &#8211;2.73]; pd = 100). Neither experiment showed a credible effect of the stimulus repetition variable (pd = 72 in Experiment 1, pd = 83 in Experiment 2), indicating that there was not a categorization bias introduced by repetitions of unique stimuli.</p>
<fig id="F2">
<label>Figure 2</label>
<caption>
<p>Categorization results in Experiment 1 (panel A) and Experiment 2 (panels B and C). In panels A and B, the x axis shows the formant continuum and the y axis shows listeners&#8217; proportion of /&#603;/ responses at each step, split by glottalization condition. Lines in panels A and B show a logistic fit to the data with points showing empirical means. Error bars show one SE from the data (not model estimates). Panel C shows the effect of the glottalization continuum on the x axis, pooled across formant continuum steps. Step numbering for the formant continuum refers to the values from the original 10 step continuum, with Experiment 2 ranging from step 3 to step 10.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-14-8753-g2.png"/>
</fig>
<p>In Experiment 1, the glottal stop condition showed a credible effect in shifting categorization (&#946; = 1.74, 95%CrI = [1.30,2.17]; pd = 100). As shown in <xref ref-type="fig" rid="F2">Figure 2A</xref>, a preceding glottal stop increased /&#603;/ responses. This result lines up with the predictions outlined in Section 2.1, suggesting that listeners do indeed adjust their perception of the contrast in line with sonority expansion: A vowel preceded by a glottal stop is expected to be realized as a more prominent variant, i.e., lower and backer in the vowel space.</p>
<p>In Experiment 2, the glottalization continuum additionally showed a credible effect in shifting categorization responses (&#946; = 0.40, 95%CrI = [0.30,0.50]; pd = 100). This is evident in <xref ref-type="fig" rid="F2">Figure 2B</xref> as increasing rightward shifts along the glottalization continuum, with the strongest glottalization cues (step 4) showing the largest difference from step 1 (no glottalization). The results are further shown in <xref ref-type="fig" rid="F2">Figure 2C</xref>, which collapses across all steps of the formant continuum, showing a graded increase in /&#603;/ responses as glottalization cues increase in strength. The effect size (in log odds) is smaller than in Experiment 1, though direct comparisons are not straightforward because of the way that the variables were coded. In Experiment 2, the estimate is for a one-unit change in the scaled value of glottalization continuum step. Relating the scaled and centered values to actual continuum values and comparing the difference between step 1 and step 4 (weakest to strongest glottalization cues) yields an estimated log-odds change of approximately 1.05, suggesting a slightly smaller effect than the full stop in Experiment 1. This may be expected because glottalization cues, even at their strongest in Experiment 2, are in a sense &#8220;weaker&#8221; than the full stop in Experiment 1. This effect size estimate is in agreement with an alternative parameterization of the model in which glottalization continuum was treated as a four level categorical variable, included in the open access repository.<xref ref-type="fn" rid="n7">7</xref></p>
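<p>The conversion from a per-unit estimate on the scaled variable to a step 1 versus step 4 difference can be sketched as follows. This is an illustrative reconstruction rather than the calculation from the repository: it assumes a balanced design, uses the population standard deviation of the four step values as a large-sample approximation to the scaling, and starts from the rounded estimate of 0.40. Under these assumptions the result is about 1.07, close to the approximately 1.05 reported above; the small gap may reflect rounding of &#946; and the exact scaling used.</p>
<preformat>
```python
from statistics import pstdev

BETA_PER_SCALED_UNIT = 0.40  # Experiment 2 estimate reported in the text (rounded)

steps = [1, 2, 3, 4]
# Assumption: balanced design, so the SD of the step variable is that of 1..4;
# the population SD is used as a large-sample approximation.
sd_steps = pstdev(steps)

# Distance between step 1 and step 4 on the scaled (z-scored) variable.
scaled_distance = (steps[-1] - steps[0]) / sd_steps

# Estimated log-odds change from the weakest to the strongest glottalization.
log_odds_change = BETA_PER_SCALED_UNIT * scaled_distance
print(round(sd_steps, 3), round(scaled_distance, 3), round(log_odds_change, 2))
```
</preformat>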
<p>There was additionally a credible interaction between continuum and glottal stop condition in both experiments (Experiment 1: &#946; = &#8211;0.77, 95%CrI = [&#8211;1.15, &#8211;0.41]; pd = 100; Experiment 2: &#946; = &#8211;0.26, 95%CrI = [&#8211;0.42, &#8211;0.12]; pd = 100). The interaction was inspected using the <italic>estimate_slopes</italic> function from the <italic>modelbased</italic> package (<xref ref-type="bibr" rid="B51">Makowski, Ben-Shachar, Patil, &amp; L&#252;decke, 2020</xref>), which estimated the marginal effect of the formant continuum across glottalization conditions; see the online repository for code implementing this assessment of the interactions. In Experiment 1, this assessment showed a larger effect of formant continuum step for the glottal stop condition (&#946; = &#8211;3.80, 95%CrI = [&#8211;4.23, &#8211;3.41]) as compared to the no glottal stop condition (&#946; = &#8211;3.03, 95%CrI = [&#8211;3.39, &#8211;2.70]). The same trend was observed for an increase along the glottalization continuum in Experiment 2, where the most glottalized endpoint showed the largest effect of formant continuum step (&#946; = &#8211;3.40, 95%CrI = [&#8211;3.86, &#8211;3.00]) as compared to the not glottalized endpoint (&#946; = &#8211;2.71, 95%CrI = [&#8211;3.07, &#8211;2.38]), with an increase in the effect along the glottalization continuum. 
A larger effect of formant continuum step is analogous to a steeper categorization slope, and in this sense the presence of the interaction can be taken to suggest that glottalization leads to sharper categorization of differences in vowel formants.<xref ref-type="fn" rid="n8">8</xref> This makes sense if we consider glottalization as rendering the target vowel more perceptually prominent, though glottalization also simply acoustically sets the target apart from preceding context, which enhances auditory processing as noted above (<xref ref-type="bibr" rid="B21">Delgutte, 1980</xref>; <xref ref-type="bibr" rid="B22">Delgutte &amp; Kiang, 1984</xref>).</p>
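<p>The relation between the interaction term and the condition-specific slopes in Experiment 1 can be checked with simple arithmetic on the reported point estimates. The sketch below is an approximation: it combines rounded posterior means, whereas the slopes reported above were estimated from the full posterior, so agreement is expected only to about two decimal places. The function name is hypothetical.</p>
<preformat>
```python
BETA_STEP = -3.42         # Experiment 1: effect of (scaled) formant step
BETA_INTERACTION = -0.77  # Experiment 1: step x glottalization interaction


def conditional_slope(glottal_code):
    """Slope of formant step at a given contrast-coded glottalization value."""
    return BETA_STEP + BETA_INTERACTION * glottal_code


print(round(conditional_slope(0.5), 3))   # glottal stop condition
print(round(conditional_slope(-0.5), 3))  # no glottal stop condition
```
</preformat>
<p>The two values, &#8211;3.805 and &#8211;3.035, correspond closely to the reported marginal slopes of &#8211;3.80 and &#8211;3.03.</p>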
<p>We can now consider the results of Experiments 1 and 2 in light of the aforementioned relationship between vowel height and vowel-initial glottalization, whereby a general cross-linguistic pattern is that lower vowels favor glottalization (e.g., <xref ref-type="bibr" rid="B9">Brunner &amp; Zygis, 2011</xref>). On the one hand, this relationship could be treated as a statistical pattern by listeners: Glottalization could lead to the expectation of a lower vowel phoneme (in the present study, /&#230;/). The results indicate that this is clearly not the case, as glottalization favors perception of /&#603;/. The fact that a lower vowel percept is <italic>not</italic> favored by preceding glottalization comports with the findings that there is not a predictive relationship between phonological/categorical vowel height and the production of vowel-initial glottalization (<xref ref-type="bibr" rid="B25">Garellek, 2013</xref>; <xref ref-type="bibr" rid="B90">Umeda, 1978</xref>), such that listeners do not use preceding glottalization to identify the vowel as being the lower vowel category /&#230;/. What the results indicate instead is that vowel-initial glottalization leads listeners to re-calibrate such that the acoustic space which is mapped to a given vowel category is lower and backer (in F1/F2), in line with sonority expansion. This relation to (acoustic) vowel height is a restatement of the predicted prominence effect, though future work will benefit from looking at other vowels, including those which are <italic>not</italic> realized as acoustically lower/backer under prominence (e.g., American English /i/, <xref ref-type="bibr" rid="B12">Cho, 2005</xref>).</p>
<p>The data from Experiments 1 and 2 thus supports the prediction that vowel-initial glottalization serves a prominence-marking function for listeners. Notably, we can see that different realizations of glottalization engender similar perceptual effects, with a clear relationship between strength of glottalization and the magnitude of the perceptual shifts evidenced by listeners. The effect seems to change fairly continuously as a function of the glottalization continuum, addressing &#8220;whether [glottalization] is interpreted along a continuum or as a contrastive binary feature&#8221; (<xref ref-type="bibr" rid="B73">Redi &amp; Shattuck-Hufnagel, 2001, p. 427</xref>).</p>
</sec>
</sec>
<sec>
<title>4. Experiment 3</title>
<p>Given the effect of glottalization on categorization in both Experiments 1 and 2, Experiment 3 examined the timecourse of its influence in online processing in a visual world eyetracking task.</p>
<sec>
<title>4.1. Materials</title>
<p>Experiment 3 made use of the same materials as Experiment 1, though it used a subset of the 10 step continuum. The method by which the Experiment 3 stimuli were selected was the same as that used in Mitterer and Reinisch (<xref ref-type="bibr" rid="B62">2013</xref>). The overall interpolated categorization function for Experiment 1 was inspected. The point at which the interpolated function crossed 50% (i.e., the most ambiguous region in the continuum) was identified. The three steps on each side of this crossover point were used in Experiment 3. This led to the selection of steps 4&#8211;9 from Experiment 1. There were accordingly 12 unique stimuli used (6 continuum steps &#215; 2 prominence conditions).</p>
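<p>As a minimal sketch of this selection procedure (not the code used for the experiment), the Python below linearly interpolates a categorization function, locates the 50% crossover, and keeps the three continuum steps on either side of it. The response proportions and the function name <monospace>select_steps</monospace> are hypothetical, chosen only so that the crossover falls between steps 6 and 7 and the selected steps are 4&#8211;9, as in the text.</p>
<code language="python">
```python
def select_steps(props, n_each_side=3):
    """Return the 50% crossover and the steps flanking it.

    props maps continuum step (1-based) to the proportion of one response
    category and is assumed to decrease monotonically across the continuum.
    """
    steps = sorted(props)
    for lo, hi in zip(steps, steps[1:]):
        # The crossover lies between two adjacent steps that straddle 0.5.
        if props[lo] >= 0.5 > props[hi]:
            # Linear interpolation gives the fractional crossover location.
            frac = (props[lo] - 0.5) / (props[lo] - props[hi])
            crossover = lo + frac * (hi - lo)
            below = [s for s in steps if crossover >= s][-n_each_side:]
            above = [s for s in steps if s > crossover][:n_each_side]
            return crossover, below + above
    raise ValueError("function never crosses 50%")


# Hypothetical proportions for steps 1-10 (illustration only, not the reported data).
example = {1: 0.97, 2: 0.95, 3: 0.91, 4: 0.85, 5: 0.74,
           6: 0.58, 7: 0.41, 8: 0.27, 9: 0.15, 10: 0.08}
crossover, selected = select_steps(example)
print(round(crossover, 2), selected)
```
</code>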
</sec>
<sec>
<title>4.2. Participants and procedure</title>
<p>Forty participants, none of whom had taken part in Experiment 1 or 2, were recruited from the same population as previous experiments to participate in Experiment 3. Testing was carried out in a sound-attenuated room in the UCLA Phonetics Lab.</p>
<p>Participants were seated in front of an arm-mounted SR Eyelink 1000 (SR Research, Mississauga, Canada) set to track the left eye<xref ref-type="fn" rid="n9">9</xref> using pupil tracking and corneal reflection at a sampling rate of 500 Hz, and set to record remotely (i.e., without a head mount) at a distance of approximately 550 mm. At the start of the experiment, participants&#8217; gaze was calibrated with a 5-point calibration procedure.</p>
<p>Stimuli were presented binaurally via a PELTOR<sup>TM</sup> 3M<sup>TM</sup> listen-only headset. The visual display was presented on a 1920 &#215; 1080 ASUS HDMI monitor. In each trial, participants were presented with a black fixation cross (60px by 60px) in the center of the monitor. The target words themselves were displayed in 60pt black Arial font, with one word centered in the left half of the monitor, and the other in the right half of the monitor. The side of the screen on which the words appeared was counterbalanced across participants, though for a given participant the same word always appeared on the same side of the screen, as in Kingston, Levy, Rysling, and Staub (<xref ref-type="bibr" rid="B45">2016</xref>) and Reinisch and Sjerps (<xref ref-type="bibr" rid="B74">2013</xref>). Two interest areas (300px by 150px) were defined around the target words. These were slightly larger than the printed words, to ensure that looks in the vicinity of the target words were also recorded, following Chong and Garellek (<xref ref-type="bibr" rid="B15">2018</xref>) and Kingston et al. (<xref ref-type="bibr" rid="B45">2016</xref>).</p>
<p>The onset of the audio stimulus was look-contingent, such that stimuli did not begin to play until a look to the fixation cross had been registered. This was done to ensure that participants were not already looking at a target word at the onset of the stimulus. As soon as a look to the fixation cross was registered, the audio stimulus began, and the target words appeared simultaneously with the onset of the audio. The trial ended after the participant provided a click response. The next trial began automatically after a click response was registered. At the start of each new trial, the cursor position was re-centered on the computer screen, following Kingston et al. (<xref ref-type="bibr" rid="B45">2016</xref>). Trials were separated by an interval of one second. Eye movements were recorded from the first appearance of the fixation cross until the participants provided a click response and the next trial began.</p>
<p>There were four practice trials, with each continuum endpoint being presented in each prominence condition once. Following this, there were a total of 96 test trials; each of 12 unique stimuli was presented a total of eight times, with stimulus presentation completely randomized. The experiment, including calibration, took approximately 20 minutes to complete.</p>
</sec>
<sec>
<title>4.3. Timecourse predictions</title>
<p>Given the variables under consideration and the previous accounts of prosody and prominence in processing described in Section 1.3, we can operationalize some predictions for Experiment 3, which will motivate the analyses described below. First, a general expectation is that vowel-internal formant cues should exhibit a rapid influence in online processing as shown, for example, by Reinisch and Sjerps (<xref ref-type="bibr" rid="B74">2013</xref>). It takes approximately 200 milliseconds to program a saccadic eye movement (e.g., <xref ref-type="bibr" rid="B52">Matin, Shao, &amp; Boff, 1993</xref>), meaning that we expect (at least) a 200 ms lag between the time that a given stimulus dimension is presented to listeners and the time it influences their looking behavior. Given this, we can expect to see an influence of vowel acoustics (modeled with the continuum variable) in online processing as early as 200 ms from the onset of the target vowel, a rapid effect.</p>
<p>Taking this timing as a baseline for what constitutes a rapid effect, consider the timecourse predictions for vowel-initial glottalization from both the Prosodic Analysis model and MAPP model.</p>
<list list-type="bullet">
<list-item><p><italic>Prosodic Analysis model</italic>:</p>
<p><list list-type="simple">
<list-item><p>1.&#160;&#160;<italic>Prediction 1: Timing of effects</italic>. If a glottal stop is processed as contributing only to a prosodic parse of the signal which is integrated later in word recognition following Cho et al. (<xref ref-type="bibr" rid="B14">2007</xref>), it should show a later-stage effect in line with Kim, Mitterer, and Cho (<xref ref-type="bibr" rid="B44">2018</xref>) and Mitterer et al. (<xref ref-type="bibr" rid="B59">2019</xref>). Given the expectation that formant information is processed rapidly, this predicts an asynchrony between the influence of these two effects, with formant cues showing an earlier influence than glottalization.</p></list-item>
<list-item><p>2.&#160;&#160;<italic>Prediction 2: Interaction between effects</italic>. Relatedly, if formant cues are used only to activate lexical hypotheses (independent of prosodic information), there should be no interaction between formant cues and glottalization, most crucially early in processing. This predicts that early processing of formant information will not vary across glottalization conditions.</p></list-item>
</list></p>
</list-item>
<list-item><p><italic>MAPP model</italic>:</p>
<p><list list-type="simple">
<list-item><p>1.&#160;&#160;<italic>Prediction 1: Timing of effects</italic>. Following the MAPP model, if glottalization is a prominence effect that modulates (early) sublexical processing, we can predict that its influence will be simultaneous with the influence of vowel formants.</p></list-item>
<list-item><p>2.&#160;&#160;<italic>Prediction 2: Interaction between effects</italic>. Another prediction from the MAPP model is that processing of formant information will interact with glottalization such that formant cues will be processed differently depending on glottalization. This predicts that (early) processing of formant information will vary across glottalization conditions.</p></list-item>
</list></p></list-item>
</list>
<p>Importantly, as described in Section 2.2 the glottalization manipulation only <italic>preceded the target vowel in time</italic> in the stimuli, and the target itself is acoustically identical across glottalization conditions.</p>
<p>We now turn to the results, first examining the categorization responses and a preliminary look at the eye movement data. Following this, more details on the eyetracking analysis and the eyetracking results are presented in Section 4.5.</p>
</sec>
<sec>
<title>4.4. Results</title>
<p>As shown in <xref ref-type="fig" rid="F3">Figure 3</xref> panel A, categorization results from Experiment 3 essentially replicated Experiment 1. Formant cues from the continuum exerted a reliable influence in categorization (&#946; = &#8211;2.64, 95%CrI = [&#8211;3.01, &#8211;2.28]; pd = 100), and we can see the categorization function is overall fairly well-anchored. The glottal stop effect from Experiment 1 was also replicated, with the presence of a preceding glottal stop increasing listeners&#8217; /&#603;/ responses (&#946; = 2.51, 95%CrI = [2.00, 3.04]; pd = 100). An overall bias towards /&#603;/ is also evident in the tendency of listeners to categorize the target as /&#603;/, especially when it is preceded by a glottal stop. As with the previous Experiments, the control variable for stimulus repetition did not show a credible effect (pd = 88). There was additionally a credible interaction between continuum and glottal stop condition, mirroring what was seen in Experiment 1 (&#946; = &#8211;0.43, 95%CrI = [&#8211;0.78, &#8211;0.11]; pd = 99). Comparison of the marginal effect of formant continuum across glottalization conditions also aligns with what was seen in Experiment 1 in showing a larger effect in the glottal stop condition (&#946; = &#8211;2.85, 95%CrI = [&#8211;3.29, &#8211;2.46]) as compared to the no glottal stop condition (&#946; = &#8211;2.42, 95%CrI = [&#8211;2.81, &#8211;2.05]).</p>
<fig id="F3">
<label>Figure 3</label>
<caption>
<p>Categorization results in Experiment 3 (panel A), and eye movement data in Experiment 3 (panel B; see text). Error bars and ribbons show one SE, computed from the data. The vertical dotted line at 200 ms indicates the earliest time at which information in the target vowel is expected to impact fixations. Step numbering refers to the values from the original 10 step continuum used in Experiment 1.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-14-8753-g3.png"/>
</fig>
<p><xref ref-type="fig" rid="F3">Figure 3</xref> panel B shows the eye movement data from the experiment, plotting eye movement trajectories as a function of continuum step and glottalization condition. The measure plotted on the y axis is listeners&#8217; preference to fixate on /&#603;/, computed as the proportion of looks to /&#603;/ minus looks to /&#230;/ in each 20 ms time bin. Here a value of zero indicates no preference, a positive value indicates a preference to fixate on /&#603;/, and a negative value indicates a preference to fixate on /&#230;/. Note that the time which is marked as zero on the x axis is the precise point in the stimulus (in either glottalization condition) where there begins to be any difference based on vowel continuum acoustics, corresponding to the positioning of the dashed line in <xref ref-type="fig" rid="F1">Figure 1</xref>. In other words, the stimuli up until this time will be different based on the glottalization manipulation preceding the target vowel, but there are not yet any formant cues to vowel identity at this point. We can see the effect of continuum step in the separation of lines based on coloration, with more /&#603;/-like continuum acoustics leading to a preference to fixate on /&#603;/. This separation, or fanning out, of trajectories appears to occur at roughly 200 ms from the onset of the vowel. The effect of vowel-initial glottalization is also evident in the separation we see based on line type: In line with the categorization data, a preceding glottal stop (dashed lines) facilitates looks to /&#603;/, an online effect corresponding to the categorization results we have seen thus far. We can also note that there is an /&#603;/-bias in eye movements, as also suggested by the categorization data, with steps 1&#8211;4 showing a strong /&#603;/ preference. 
Qualitatively, it thus appears that vowel-internal acoustic cues and preceding glottalization are both shaping listeners&#8217; perception of the target word.</p>
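<p>The preference measure plotted in panel B can be sketched as follows. This is an illustrative Python reconstruction, not the analysis code: the sample labels (&#8220;E&#8221; for a look to the /&#603;/ word, &#8220;A&#8221; for a look to the /&#230;/ word, and None for looks elsewhere) and the function name are assumptions, and aggregation across trials and participants is omitted. At the 500 Hz sampling rate reported in Section 4.2, a 20 ms bin contains 10 samples.</p>
<code language="python">
```python
def fixation_preference(samples, samples_per_bin=10):
    """Proportion of looks to the E word minus looks to the A word, per bin.

    samples is a sequence of labels, one per eyetracker sample:
    "E" (look to the E word), "A" (look to the A word), or None (elsewhere).
    With 500 Hz sampling, 10 samples span one 20 ms bin.
    """
    prefs = []
    for start in range(0, len(samples), samples_per_bin):
        chunk = samples[start:start + samples_per_bin]
        p_e = sum(s == "E" for s in chunk) / len(chunk)
        p_a = sum(s == "A" for s in chunk) / len(chunk)
        # 0 = no preference, positive = prefers E, negative = prefers A
        prefs.append(p_e - p_a)
    return prefs


# Hypothetical 60 ms of samples: no looks, then mixed looks, then looks to E only.
toy = [None] * 10 + ["E"] * 6 + ["A"] * 4 + ["E"] * 10
print(fixation_preference(toy))
```
</code>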
</sec>
<sec>
<title>4.5. Eyetracking analyses and results</title>
<p>Two complementary analyses of the eyetracking data are presented here. The dependent measure in each analysis was a &#8220;preference measure,&#8221; which offers a normalized measure of listeners&#8217; propensity to fixate on a target (cf. <xref ref-type="bibr" rid="B74">Reinisch &amp; Sjerps, 2013</xref>). This measure is computed as log-transformed looks to &#8220;ebb&#8221; minus log-transformed looks to &#8220;ab,&#8221; using the empirical logit (Elog) transformation given in Barr (<xref ref-type="bibr" rid="B2">2008</xref>).<xref ref-type="fn" rid="n10">10</xref> This measure was computed within a given time bin in a trial, the size of which was different in the two different analyses, described below. The analysis window of 0&#8211;1200 ms from the onset of the target vowel in the stimulus is used.</p>
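<p>For concreteness, the empirical logit in Barr (2008) is commonly stated as ln((y + 0.5) / (N &#8722; y + 0.5)), where y is the number of samples on a target within a bin and N is the number of samples in the bin. The sketch below assumes that form and computes the preference as the difference between the two transformed values. The computation actually used is the one given in footnote 10; the function names and the example counts here are illustrative only.</p>
<code language="python">
```python
import math


def elog(y, n):
    """Empirical logit of y target samples out of n samples in a bin."""
    return math.log((y + 0.5) / (n - y + 0.5))


def elog_preference(looks_ebb, looks_ab, n):
    """Elog-transformed looks to "ebb" minus Elog-transformed looks to "ab"."""
    return elog(looks_ebb, n) - elog(looks_ab, n)


# Hypothetical 20 ms bin at 500 Hz (10 samples): 7 on "ebb", 2 on "ab", 1 elsewhere.
print(round(elog_preference(7, 2, 10), 3))
```
</code>
<p>Under this assumed form, equal looks to both words give a preference of zero, and the added constant keeps the measure finite when a bin contains no looks to one of the words.</p>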
<p>In the first eyetracking analysis, eye movement data from Experiment 3 was analyzed by a Generalized Additive Mixed Model (GAMM) using the R packages <italic>mgcv</italic> (<xref ref-type="bibr" rid="B94">Wood, 2006</xref>) and <italic>itsadug</italic> (<xref ref-type="bibr" rid="B91">van Rij, Wieling, Baayen, &amp; van Rijn, 2016</xref>). GAMMs have recently been suggested to offer an appealing alternative to moving window analyses in that they allow for an encoding of the temporal contingency across time bins, and further allow for modeling non-linearity in the data (see <xref ref-type="bibr" rid="B95">Zahner, Kutscheid, &amp; Braun, 2019</xref> for a discussion of the advantages of GAMMs for eyetracking data). The data was sampled at 20 ms time bins for the GAMM analysis (as in <xref ref-type="bibr" rid="B83">Steffman, 2021a</xref>; <xref ref-type="bibr" rid="B95">Zahner et al., 2019</xref>). The GAMM was an AR1 error model, fit using the technique described in e.g., <xref ref-type="bibr" rid="B79">S&#243;skuthy 2017</xref>, to reduce residual autocorrelation. The rho parameter was specified in the model based on a previous run of the same model without the AR1 component (see the open access repository for code implementing this). The number of knots in some terms was increased (to k = 20) following inspection with the <italic>gam.check</italic> function, after which the number of knots was adequate as determined by that function.</p>
<p>The model was fit with parametric terms for continuum step (scaled and centered), glottalization condition, and the interaction between these fixed effects. The control variable of stimulus repetition was additionally included. Parametric terms in the GAMM model are analogous to fixed effects in mixed effects models and capture if listeners&#8217; fixation preference in the analysis window as a whole varies as a function of the predictors. Smooth terms in GAMMs are additionally fit to model changes over time, and (potentially) non-linear patterns in the data. The model was fit to capture the interaction between continuum acoustics and time using a non-linear tensor-product interaction term, which allows us to examine how, over time, vowel acoustics mediate listeners&#8217; preference to fixate on a given target. Crucially, this term was interacted with glottalization condition as a &#8220;by&#8221; term in the tensor-product term, modeling the potential interaction between glottalization, and the influence of continuum acoustics over time. As a control variable, an additional tensor-product term was fit for (scaled) stimulus repetition over time, modeling how the dependent variable changed over time as a function of repetition. This term showed no systematic effect of repetition on looking behavior (in line with previous categorization analyses), so it will not be discussed further, though a plot of the predictions from the model for the influence of stimulus repetition is included on the open access repository. Random effects in the model were specified using the reference-difference smooth method described in Soskuthy (<xref ref-type="bibr" rid="B80">2021</xref>), with factor smooths for participant, and for participant by glottal stop condition (coded as an ordered factor). 
In both factor smooth terms, the <italic>m</italic> parameter was set to 1, following Baayen, van Rij, de Cat, and Wood (<xref ref-type="bibr" rid="B1">2018</xref>) and Soskuthy (<xref ref-type="bibr" rid="B80">2021</xref>). The numerical GAMM model output is included in the appendix, though the terms in the model as it was coded are generally not useful for interpreting timecourse questions of interest here (<xref ref-type="bibr" rid="B66">Nixon, van Rij, Mok, Baayen, &amp; Chen, 2016</xref>; <xref ref-type="bibr" rid="B95">Zahner et al., 2019</xref>).</p>
<p>The model described above will be compared to one which did not include a non-linear interaction term for glottalization condition with continuum step and time. In this model, glottalization condition was not included as a &#8220;by&#8221; term in the tensor-product interaction, but instead in a separate smooth modeling the effects of glottalization over time. This latter model thus captures an independent effect of glottalization, but crucially, not an interaction with continuum step. The fit of the two models will be compared in light of the predictions described in Section 4.3, testing prediction 2 from each model. The code for both models and model comparison is contained in full on the open-access repository.</p>
<p>The second analysis presented here is a traditional moving window analysis, which assesses how vowel-internal formant cues influence eye movements in relation to the glottal stop manipulation. The moving window analysis serves the purpose of comparing across Experiment 3 and Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) with a focus on the relative timing of the influence of continuum step (formants) and the prominence manipulation. Notably, a GAMM model can be used for this purpose too; however, in the GAMM, numerical time estimates for the influence of continuum step need to be computed with difference smooths on a pairwise basis, i.e., the timing of an effect between step 1 and step 2, step 1 and step 3, and so on. The moving window analysis thus offers a more global picture of the timing of continuum step. This additional analysis is accordingly carried out to provide converging evidence for the effects in Experiment 3, and to further offer a compact comparison with data from Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>). Models were fit for each experiment separately, due to the fact that the continuum acoustics, and the nature of the prominence effects were different across them. Despite these substantial differences, the <italic>relative</italic> timing of the continuum effect and prominence effect can be considered comparable, given the predictions in Section 4.3. In other words, because the prosodic analysis model predicts a two-stage influence of formants and then prominence we can test this prediction in both Experiment 3 and the data from Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>), and evaluate the relative timing of the effects across experiments (prediction 1 from each model in Section 4.3), even though the formant acoustics and prominence cues are not directly comparable to one another. 
More details from Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) are given in Section 4.5.2, following the presentation of the GAMM results of Experiment 3.</p>
<p>Time bins of 100 ms were used in the moving window analysis, with the preference measure computed at 100 ms intervals across a trial. A 100 ms window was selected because it provides a fairly fine-grained temporal assessment, while also granting a reasonable amount of independence from bin to bin (<xref ref-type="bibr" rid="B2">Barr, 2008</xref>; <xref ref-type="bibr" rid="B62">Mitterer &amp; Reinisch, 2013</xref>), a known issue in moving window analyses. The dependent measure was predicted as a function of (scaled) formant continuum step, and glottalization context (coded as in the categorization models), and the interaction of these two fixed effects in each time bin. Stimulus repetition was again included as a fixed effect. Random effects were random intercepts for participant and random slopes that were the same as the fixed effects and interaction term. These models were run in <italic>brms</italic> as with models of the categorization data. The assessment of the models will be in terms of when, over binned time, each has a robust effect on listeners&#8217; fixations, with a focus on the relative timing of continuum step and prominence.</p>
<sec>
<title>4.5.1. GAMM results</title>
<p>The GAMM modeling analysis focused on the relationship between glottalization and formants in jointly shaping listeners&#8217; processing of the target word, testing the predictions in Section 4.3. To test if including an interaction between continuum step and glottalization (in the tensor product term of the model) improved model fit, the GAMM with this interaction was compared to one in which glottalization condition was in a separate smooth term over time (described above), using the <italic>compareML</italic> function in <italic>itsadug</italic> (<xref ref-type="bibr" rid="B91">van Rij et al., 2016</xref>). A Chi-Square test on the ML scores indicated that the model containing the interaction between glottalization and continuum step is a significantly better model than the one lacking the interaction (&#967;<sup>2</sup>(4) = 57.14, p &lt; 0.001). This suggests that the way formant cues are processed interacts with glottalization condition. The nature of this interaction is explored below.</p>
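<p>As a rough arithmetic check on the reported test, the upper-tail probability of a chi-square distribution has a closed form when the degrees of freedom are even, so no statistics library is needed. The Python sketch below evaluates it at the reported value with 4 degrees of freedom. It is not a reimplementation of the <italic>itsadug</italic> comparison, and it assumes that the reported value is itself the test statistic; if the package instead evaluates twice the score difference, the tail probability would be smaller still. The function name is illustrative.</p>
<code language="python">
```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2m, P(X > x) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!.
    """
    if df % 2 != 0 or 0 >= df:
        raise ValueError("closed form used here requires positive, even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


# Reported statistic and degrees of freedom from the model comparison.
p = chi2_upper_tail_even_df(57.14, 4)
print(p, 0.001 > p)
```
</code>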
<p>First, we can note that the parametric terms in the best fitting GAMM model confirm an influence of vowel formants and glottalization in the analysis window as a whole (p &lt; 0.001 for both), as would be expected given the observations made of <xref ref-type="fig" rid="F3">Figure 3</xref>. Further, aligning with all categorization analyses, the repetition control variable did not have a significant effect on eye movements (p = 0.72).</p>
<p>To assess the relationship between continuum step, glottal stop condition, and time, three-dimensional topographic surface plots are presented in <xref ref-type="fig" rid="F4">Figure 4</xref>. These plots show the model fit, representing the effect of continuum step (as a continuous variable on the y axis) over time (on the x axis). The dependent variable (listeners&#8217; Elog-transformed preference to fixate on the /&#603;/ target) is represented on a gradient color scale. The two panels represent model fits for each glottalization condition, panel A being when the target is preceded by a glottal stop. A value of zero (in the middle of the color scale) indicates no preference, while a positive value (closer to yellow on the color scale) indicates a preference for the /&#603;/ target. A negative value (closer to purple on the color scale) indicates a preference for /&#230;/. Shading on the surface shows locations where listeners&#8217; preference is not significantly different than zero, i.e., when 95% CI from the model estimate include the value of zero. Note that listeners do not show a preference early in the analysis window, with shading on all of the surface prior to approximately 200 ms. The fact that shading occupies the first 200 ms of the analysis window indicates that listeners are not using information that precedes the target vowel to predict target vowel identity independently. If preceding information (i.e., the presence of a glottal stop) was systematically used to predict vowel identity directly, shading on the surface would disappear prior to 200 ms from the vowel onset (if observed, this sort of predictive effect would suggest an issue with the experimental design in the sense that the task is too predictable, and unlike more naturalistic speech perception).</p>
<fig id="F4">
<label>Figure 4</label>
<caption>
<p>Surface plots showing the GAMM model fit in Experiment 3, with continuum step on the y axis, time on the x axis, and listeners&#8217; log-transformed fixation preference indexed by coloration. Gray shading indicates places on the surface where listeners have no preference for either target. The vertical dotted line at 200 ms indicates the earliest time at which information in the target vowel is expected to impact fixations. Step numbering refers to the values from the original 10 step continuum used in Experiment 1.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-14-8753-g4.png"/>
</fig>
<p>As time progresses, listeners develop graded preferences based on continuum step. At the end of the analysis window, there is a range of preferences: A stronger /&#603;/ preference at step 4 on the continuum, and a stronger /&#230;/ preference at step 9. Note too that a portion in the middle region of the continuum never attains a significant preference in either panel. That is, the model finds that the ambiguous region of the continuum remains ambiguous even at the end of the analysis window. This is shown by a narrow band of the shaded area persisting until the end of the analysis window. With this in mind, we now can assess the impact of a glottal stop on listeners&#8217; processing of the continuum over time. The effect of the glottal stop is evident in observing (1) the coloration of panels A and B, and (2) the shape and position of the shaded area showing areas on the surface for which listeners did not have a preference for either target. In terms of coloration, note that the color scale is shared across panels: The same color on each panel reflects the same degree of /&#603;/ preference. We can see that each panel overall occupies different color spaces, with the glottal stop condition showing a stronger /&#603;/ preference (more yellow on the plot), and the no glottal stop condition showing a stronger /&#230;/ preference (more purple on the plot). In other words, acoustically identical continuum steps are perceived as more like one target or the other, as a function of glottalization. Importantly, these differences are evident as early as listeners show <italic>any</italic> preference, that is, as soon as shading on the surfaces disappears.<xref ref-type="fn" rid="n11">11</xref></p>
<p>Additionally, the surface plots show that glottal stop condition also influences which stimuli are perceived as ambiguous by listeners. This is apparent in the vertical positioning of the shaded region, particularly the narrow band of that region that persists throughout the analysis window. The regions along the continuum which show no preference in looks vary based on glottal stop condition, starting early (roughly 200 ms from target onset) and persisting throughout the analysis window. This pattern is not only reflected in the narrow band of the shaded region, but also in the surrounding shading which extends around that region. This shading shows a relative delay in processing formant cues in the region of steps 7&#8211;9 in the glottal stop condition, and steps 4&#8211;6 in the no glottal stop condition, whereby regions more in the proximity of ambiguous steps show slower recognition of a vowel (i.e., a significant fixation preference). Critically, where these regions are is impacted by the glottalization manipulation. This pattern can also be framed in terms of expectations: Pre-target glottalization cues favor the recognition of a particular vowel, slowing down recognition of the alternative (though notably, this pattern does not constitute a predictive effect in the sense that only at 200 ms from target onset do listeners begin to show a preference). Inspection of the surface plots therefore supports a difference in early formant processing across conditions, with differences across conditions evident at the earliest moments, early modulation of which vowel acoustics are ambiguous to listeners, and the speed at which a particular vowel is recognized.</p>
<p>To complement the visualization of the surface plots with another assessment of the glottalization effect, the difference smooth between glottalization conditions was computed, which offers a time estimate for the overall effect of glottalization (with scaled continuum step and repetition variables set to their median by default). A difference smooth models the difference between two conditions over time. When the difference becomes reliably different from the value of zero (with 95% CI for the smooth excluding zero) we can take this to indicate when (in time) an effect is reliable (see the open access repository for the difference smooth code and visualization). The difference smooth shows that the effect of glottalization condition becomes significant 242 ms from the onset of the target vowel until the end of the analysis window, a further indication that its influence is early in time.</p>
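<p>The logic of reading an onset time from a difference smooth can be sketched as below: find the earliest time from which the 95% interval excludes zero through to the end of the window. The values are hypothetical and the function name is illustrative; the difference smooth itself was computed in R (see the open access repository).</p>
<code language="python">
```python
def effect_onset(times, estimates, ci_halfwidths):
    """Earliest time from which the CI excludes zero until the window ends.

    times, estimates, and ci_halfwidths are parallel sequences; the interval
    at each time point is estimate +/- halfwidth. Returns None if the
    difference is not reliable at the end of the window.
    """
    onset = None
    for t, est, hw in zip(times, estimates, ci_halfwidths):
        excludes_zero = abs(est) > hw
        if excludes_zero and onset is None:
            onset = t      # start of a candidate reliable stretch
        elif not excludes_zero:
            onset = None   # stretch was interrupted, so reset
    return onset


# Hypothetical values on a 20 ms grid (illustration only, not the reported smooth).
times = [200, 220, 240, 260, 280]
estimates = [0.05, 0.20, 0.45, 0.70, 0.90]
halfwidths = [0.30, 0.30, 0.30, 0.30, 0.30]
print(effect_onset(times, estimates, halfwidths))
```
</code>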
<p>In summary, the GAMM analysis supports predictions 1 and 2 of the MAPP model: Glottalization interacts with the processing of formant information early in time as shown by the surface plots, and shows an early overall influence as indicated by the difference smooth (242 ms from vowel onset).</p>
</sec>
<sec>
<title>4.5.2. Comparison to Steffman 2021</title>
<p>Given these results we now consider how the glottalization effects described above compare to data from Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>), which asked a similar question about vowel perception under variations in prominence as described in Section 1.3. Here it is thus relevant to consider the design of the stimuli and experiment in that paper. Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) adopted a highly similar eyetracking design to Experiment 3, with the intent that they may be compared. Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) tested perception of the same contrast as the present study, and also made use of a 6-step continuum ranging between /&#603;/ and /&#230;/, using the same target words (though the continuum was not acoustically identical to the one used here). The experiments can also be considered fairly comparable in that the visual eyetracking display was identical in each of them, and the instructions and procedure were the same. Where the two experiments differ crucially is the way in which prominence was manipulated.</p>
<p>As described in Section 1.3, in Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) the target word was placed in two carrier phrases, which manipulated the relative prominence of the target word: &#8220;I&#8217;ll say [TARGET] now&#8221; versus &#8220;I&#8217;ll SAY [target] now.&#8221; In creating the stimuli for these conditions, the goal was to manipulate only the context surrounding the target (with the target identical across conditions) in such a way that listeners&#8217; perception of target prominence varied in the way described in Section 1.3. As with the present experiments, these stimuli present a fairly conservative manipulation in changing only context, to ensure that properties of the target sound itself do not influence responses. Two productions served as the basis for the stimuli. In one, the target was relatively prominent, produced with a nuclear H* accent, appropriate for a broad focus context, in the sentence &#8220;I&#8217;ll say [TARGET] now.&#8221; The prosodically prominent condition was created simply by using a version of this frame. In the prosodically non-prominent condition, the vowel in the word &#8220;say&#8221; from a production in which focus was on &#8220;say&#8221; (&#8220;I&#8217;ll SAY [target now]&#8221;) replaced the original vowel in that frame. This cross-spliced vowel in &#8220;say&#8221; therefore has increased amplitude and duration relative to &#8220;say&#8221; in the other condition, and a prominent L+H* pitch accent. Following this, the pitch on the preceding word &#8220;I&#8217;ll&#8221; was re-synthesized to match the pitch values of this word in &#8220;I&#8217;ll SAY [target now],&#8221; with lower F0 for the production of L in L+H*. Pitch on &#8220;I&#8217;ll&#8221; in the prominent condition was also resynthesized, overlaid with values from another broad focus production to ensure that both conditions underwent an equal amount of resynthesis. 
The post-target word &#8220;now&#8221; was identical across conditions, realized as unaccented and phrase-final with a low (L-L%) boundary tone. These manipulations thus created differences in the pre-target pitch contour, as well as the duration, overall amplitude, and amplitude envelope of the pre-target vowel /eI/. The F0 and intensity of the target were averaged between the values from the productions of &#8220;I&#8217;ll say [TARGET] now&#8221; and &#8220;I&#8217;ll SAY [target] now,&#8221; rendering it acoustically intermediate and ambiguous, which was judged to sound appropriate for both frames. A formant continuum was additionally created using the method described in Section 2.2. This prominence manipulation, though it controls the acoustic properties of the target, is nevertheless more global than the present experiments, and varies multiple acoustic dimensions in all of the pre-target material, conveying different prominence structures for the target and the material before it (see <xref ref-type="bibr" rid="B83">Steffman 2021a</xref> for more details).</p>
<p>Though the stimuli in Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) thus differ substantially from those in Experiment 3 in how prominence is cued, the two sets of stimuli have similarities. In both, cues manipulating prominence <italic>only precede the target word in time</italic>. Thus, any differences across prominence conditions come from pre-target material, with the target and post-target material being identical across prominence conditions. The analyses of both experiments also crucially take the onset of the target vowel as the beginning of the analysis window. Registering the onset of the window to this point for both Experiment 3 and Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) facilitates comparison of the timing of these effects, in the sense that in both we examine how listeners&#8217; preference to fixate on a target word develops from the start of that word (with preceding prominence cues varying). These similarities can be kept in mind as the data from these two experiments are compared with the moving window analysis, though it should also be kept in mind that this is a between-subjects comparison.</p>
<p>A visualization of the eyetracking data from Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) is given in <xref ref-type="fig" rid="F5">Figure 5</xref>, with a layout mirroring <xref ref-type="fig" rid="F3">Figure 3</xref>. As in <xref ref-type="fig" rid="F3">Figure 3</xref>, we can note that trajectories fan out and separate as a function of changing acoustics along the continuum (more /&#603;/-like acoustics along the continuum favor fixations on /&#603;/). We can also note a prominence effect comparable to that seen in Experiment 3: The prosodically prominent condition, in which the target is not preceded by focus on &#8220;say,&#8221; shows increased fixations to /&#603;/, analogous to the effect of a preceding glottal stop in Experiment 3. Based on this visual assessment we can thus conclude that there is a similar impact of these two (very different) prominence cues across experiments. The following section compares these experiments in terms of the timecourse of formant cues and prominence in a moving window analysis.</p>
<fig id="F5">
<label>Figure 5</label>
<caption>
<p>Eyetracking results from Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>), displaying eye movements as a function of prominence and continuum step, laid out as in <xref ref-type="fig" rid="F3">Figure 3</xref>. Steps are numbered 1&#8211;6 ranging from most to least like /&#603;/.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-14-8753-g5.png"/>
</fig>
</sec>
<sec>
<title>4.5.3. Moving window analysis results</title>
<p>The effects of prominence and continuum step in the moving window models are summarized visually in <xref ref-type="fig" rid="F6">Figure 6</xref>. The estimate for each effect is plotted together with its 95% CrI over time, in 100 ms time bins. The full model summaries that produced the estimates plotted here are available in the open access repository.</p>
<fig id="F6">
<label>Figure 6</label>
<caption>
<p>Model estimates for the effect of continuum step and prominence (glottalization) in the moving window analysis for Experiment 3, with estimates from the same analysis for data from Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) for comparison. Each point is located at the end of a time bin, e.g., 200 indicates 100&#8211;200 ms. Point shape indicates whether or not an effect is credible in a given time bin.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-14-8753-g6.png"/>
</fig>
<p>First consider just the data from Experiment 3. An effect can be taken to be reliable if, in a given time bin, the 95% CrI for the effect <italic>excludes</italic> the value of 0, as indicated by a circular point for that time bin and that effect in the figure. A reliable effect of continuum step in Experiment 3 is evident in the 200&#8211;300 ms time bin (note that the negative sign of the estimates is an arbitrary consequence of the way in which the variables were coded, i.e., decreases in /&#603;/-preference as a function of increasing values of continuum step). This effect is early and is consistent with previous work showing a rapid use of formants in processing vowel information (<xref ref-type="bibr" rid="B74">Reinisch &amp; Sjerps, 2013</xref>). Next, consider the timing of this step effect in relation to the glottal stop effect (labeled as the prominence effect for Experiment 3). The glottal stop effect also becomes credibly different from zero at the same time as the effect of continuum step (200&#8211;300 ms from target onset), agreeing with the estimate obtained from the difference smooth in the GAMM (242 ms). The influences of continuum step and the glottal stop thus emerge in the same time bin, aligning with the GAMM analysis and supporting prediction 1 from the MAPP model.</p>
<p>This relative timing pattern can be compared to the timing of the effects of vowel acoustics and prominence from Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>), also plotted in <xref ref-type="fig" rid="F6">Figure 6</xref>. The effect of continuum step is reliable 300&#8211;400 ms from the onset of the target vowel, one time bin later than the effect of continuum step in Experiment 3. The effect of the phrasal prominence manipulation in Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) is smaller in size compared to Experiment 3, and does not show a consistent divergence from 0 until the 700&#8211;800 ms time bin (though there is a transitory and smaller credible effect between 400&#8211;600 ms). This lines up with the GAMM analysis presented in Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>), which showed subtle effects of phrasal prominence early in time, with larger and more robust effects only apparent later in the analysis window. Importantly, the robust effect is clearly asynchronous with the effect of vowel acoustics in that experiment, differentiating it from the synchronous influence of a glottal stop and vowel formants seen in Experiment 3.</p>
<p>In summary, the timecourse data in Experiment 3 show a rapid influence of vowel-initial glottalization in vowel perception, in line with sonority expansion effects on vowel formants. This influence was rapid in the sense that it impacted fixations as soon as listeners showed a preference for any target, and interacted with the processing of formant cues, as determined by the GAMM analysis. The effect was further rapid in the sense that it occurred only 242 ms after the onset of the target vowel (according to the GAMM difference smooth), and in the same 200&#8211;300 ms time bin as the influence of formant cues (according to the moving window analysis). These results support both predictions from the MAPP model, given in Section 4.3. The relative timing of the effects of prominence and formant cues also differed from that obtained for the data in Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) in the moving window analysis.</p>
</sec>
</sec>
</sec>
<sec>
<title>5. General Discussion</title>
<p>The present study set out to examine if listeners are impacted by the presence of vowel-initial glottalization in their perception of vowel contrasts. Experiment 1 showed that the production of a sustained glottal stop preceding a vowel led to listeners re-calibrating vowel perception in a way that reflected sonority expansion: Acoustically lower and backer F1/F2 in the vowel space (higher F1, lower F2) were perceived as /&#603;/ more often with preceding glottalization in line with the acoustically lower/backer realization of /&#603;/ under prominence. Experiment 2 showed that these effects are also evident when glottalization was cued by dipping pitch and intensity along a continuum, and without a full glottal stop. Intermediate steps on the glottalization continuum led to intermediate shifts in categorization, suggesting that stronger vowel-initial glottalization cued a stronger percept of prominence. Experiment 3 replicated the effects of a full glottal stop seen in Experiment 1 in a visual-world eyetracking paradigm which compared the timecourse of the influence of a preceding glottal stop to that of vowel-internal formant cues. Both of these influences were simultaneous, with a vowel-initial glottal stop immediately impacting perception and modulating how formant cues are used at the earliest moments in processing.</p>
<sec>
<title>5.1. Glottalization and prominence</title>
<p>Let us first consider these results as they relate to the hypothesized prominence-marking function of word-initial glottalization in American English in the speech production literature. The presence of glottalization preceding a vowel led to listeners&#8217; expectation of a more prominent (in this case, sonorous) variant of that vowel being produced. Such an expectation led listeners to map acoustically identical formant values to /&#603;/ more often under prominence (versus /&#230;/). These data thus support the proposal that glottalization cues prominence to listeners, in line with its implementation as a prominence marker in production. This interpretation more generally accords with Mitterer et al. (<xref ref-type="bibr" rid="B60">2021a</xref>; <xref ref-type="bibr" rid="B61">2021b</xref>) in that glottalization is an important prosody-related cue which is recruited in perception.</p>
<p>It is worth noting here that across all conditions in the present experiments the target word was pitch accented, such that the prominence effects seen here suggest different levels of perceptual prominence within pitch accented words, and fine-grained variation in prominence perception as shown in Experiment 2. If we consider &#8220;pitch-accented&#8221; to be a phonological specification of prominence category, these results speak to the importance of considering within-category variation in perceived prominence as meaningfully impacting the perception of segmental material, in line too with Dilley et al. (<xref ref-type="bibr" rid="B23">1996</xref>) showing that pitch accented vowel-initial words are often glottalized, but not always (i.e., there is a probabilistic relationship between pitch accentuation and vowel-initial glottalization). This further raises the question of listeners&#8217; behavior when prominence cues conflict, for example when glottalization precedes an unaccented phrase-medial vowel (possible, but less common as shown in <xref ref-type="bibr" rid="B23">Dilley et al. 1996</xref>). This study and Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) showed an effect for two different prominence cues, and one prediction is that these cues are additive when combined, allowing for the possibility of a sort of &#8220;perceptual garden path&#8221; effect when they conflict. We could thus predict an overall delay in recognition and (potentially) revised fixation behavior in eyetracking as cues unfold, for example if glottalization information precedes the relevant pitch accent information in time. On the other hand, if listeners instead wait until both cues have been received it could be taken to suggest that they are integrating them into a more holistic and abstract prominence percept. 
Pitting cues (e.g., glottalization and pitch accentuation) against one another in this sense will also allow for testing precedence and possible interactions (e.g., perhaps glottalization is an important cue only when words are pitch accented). Tests of this sort will help us to better understand the ways multiple cues are used in combination by listeners, and hopefully, what sort of representation of prominence is implicated.</p>
<p>More broadly, the results suggest that future research will benefit from considering other patterns of prominence strengthening as relevant in segmental perception. For example, consider the lengthening of VOT in voiceless stops which is observed in prominent syllables (<xref ref-type="bibr" rid="B17">Cole et al., 2007</xref>; <xref ref-type="bibr" rid="B43">Kim, Kim, &amp; Cho, 2018</xref>). Given the present results we can predict that prominence-signaling lengthening of VOT may impact perception of the following vowel. If found, this would further indicate the importance of fine-grained prominence-strengthening cues in segmental perception. A key takeaway from these results is accordingly the view that prosody should be considered not only in terms of suprasegmental parameters, nor strictly abstract structural terms (phrase boundaries, pitch accents) but should be viewed holistically and as encoded in fine-grained detail and the modulation of cues such as VOT and formant structure.</p>
</sec>
<sec>
<title>5.2. Implications for models of speech processing</title>
<p>The eyetracking data further enrich our understanding of the interplay between prosodic and segmental/lexical processing. As noted in Section 1.3, previous examinations of prosodic influences in segmental processing support a delayed influence of prosodic structure, overall consistent with a post-lexical model of prosodic effects (as in the Prosodic Analysis model). Such an account of the present data predicts an asynchronous influence of segment-internal cues to a contrast and prosodic context, with segmental cues preceding prosodic context in the timecourse of their influence. The data in Experiment 3 are not consistent with this account, with <italic>simultaneous</italic> effects of formants and a preceding glottal stop in online processing.</p>
<p>These data thus present an extension from the Prosodic Analysis model in showing a richer set of prominence effects in segmental/lexical processing than a strictly post-lexical influence. In that sense, they are consistent with the predictions from the MAPP model. The comparison to Steffman (<xref ref-type="bibr" rid="B83">2021a</xref>) shows that different prominence cues have different relative timing in processing when compared to vowel-internal spectral information. This is taken as evidence that prominence processing may vary depending on the prominence cue.</p>
<p>Nevertheless, existing data on phrasal prosodic boundaries in processing show clear support for only a later influence of prosodic boundary information in the perception of segmental material (<xref ref-type="bibr" rid="B44">Kim, Mitterer, &amp; Cho, 2018</xref>; <xref ref-type="bibr" rid="B59">Mitterer et al., 2019</xref>). In this sense, the present data suggest the field will benefit from considering that prominence information and prosodic boundary information may enter differently into processing. One possible view of the asymmetrical role of these prosodic dimensions is that prosodic boundary information is necessarily structural: The listener must determine the presence of a boundary based on phonetic cues, broader phonological context, word boundary information, and syntactic information. Inferences about these levels of representation can be presumed to take place in parallel, and with the consideration of multiple hypotheses, framed recently through the lens of Bayesian inference by McQueen and Dilley (<xref ref-type="bibr" rid="B55">2020</xref>).</p>
<p>Phrasal prominence, as defined in Section 1.1, could also be described as structural in the sense that in American English (among other languages) it is determined based on metrical structure and phrasing (e.g., the most prominent pitch accent, the nuclear accent, is the last one in an intonational phrase). However, prominence should also clearly be viewed at a more fine-grained level: The present study shows the importance of considering phonetic prominence, signaled by language-specific cues such as vowel-initial glottalization. In this sense, a given unit&#8217;s prominence needn&#8217;t be determined only by a global or phrasal prosodic parse, but instead may be computed by the listener on a syllable-by-syllable basis. Phonetic prominence is thus useful for the listener to determine if a segment has undergone prominence strengthening effects, reconciling the extent to which a segment is perceptually prominent with its acoustics to determine how it should map to a phonemic category. This view implicates perceptual prominence at both sub-lexical and higher levels, in multiple stages of processing.</p>
<p>The MAPP model, as a two-stage model, predicts that structural/phonological versus phonetic prominence effects should be differentiable, and the present data confirm this prediction: Glottalization as a prominence cue is processed early, and differently from more global (and perhaps phonological) prominence distinctions, as shown in Section 4.5.3.</p>
<p>Additional tests of the model and of the nature of prominence in this domain will also benefit from considering how local or distributed cues are in time. Delayed influences in prosodic boundary processing studies (<xref ref-type="bibr" rid="B44">Kim, Mitterer, &amp; Cho, 2018</xref>; <xref ref-type="bibr" rid="B59">Mitterer et al., 2019</xref>) have been observed with only localized manipulations (e.g., lengthening of just one syllable in <xref ref-type="bibr" rid="B59">Mitterer et al., 2019</xref>), such that it is clear that locality does not translate directly into rapid cue use, at least where boundary processing is concerned. Future work addressing questions of cue locality and cue functionality (prominence versus boundary marking) might approach the issue by attempting to cross these parameters and compare local to global prominence cues, as well as comparing local to global boundary cues, within the same experiment.</p>
</sec>
<sec>
<title>5.3. Some future directions</title>
<p>Additional tests for this sort of distinction between localized/phonetic and global/structural prominence cues could take the form of examining the extent to which each can be modulated by task factors. Certain early effects in processing are assumed to be relatively immune to task effects and cognitive load as shown by, e.g., Bosker, Reinisch, and Sjerps (<xref ref-type="bibr" rid="B7">2017</xref>). More global prosodic factors have recently been shown to be influenced by task and stimulus presentation factors (<xref ref-type="bibr" rid="B81">Steffman, 2019</xref>, <xref ref-type="bibr" rid="B84">2021b</xref>). For example, Steffman (<xref ref-type="bibr" rid="B84">2021b</xref>) found that rhythmic effects in the perception of segmental cues are disrupted when stimuli vary in speech rate, while speech rate effects (typically assumed to result from low-level auditory processing) are robust to rhythmic variation and occur consistently. To the extent that the effects of vowel-initial glottalization seen here reflect early sub-lexical processing, we might expect them to be robust to these sorts of task effects, whereas global prominence effects may be more fragile.</p>
<p>In this vein, one outstanding question is the extent to which localized prominence strengthening effects are related to more general auditory processing. Though glottalization as prominence strengthening is certainly implemented in a language-specific fashion by speakers, it has the effect of making the following vowel acoustically prominent in a more general way (i.e., a vowel preceded by glottalization is rendered louder than, and perceptually more separated from, preceding material) which boosts auditory processing (<xref ref-type="bibr" rid="B21">Delgutte, 1980</xref>; <xref ref-type="bibr" rid="B22">Delgutte &amp; Kiang, 1984</xref>). Pulling apart the role of language-specific phonetic knowledge and language-general prominence perception may be difficult as phonetic strengthening patterns tend to serve the function of making the strengthened segment more prominent perceptually (though <xref ref-type="bibr" rid="B82">Steffman, 2020</xref> shows that the effects of prominence on vowel perception are specific to the vowel contrast in question). Some indirect evidence for a language-specific interpretation of glottalization cues comes from comparing the early time course of the effect seen here to the delayed influence documented in Mitterer et al. (<xref ref-type="bibr" rid="B59">2019</xref>), where a delayed effect is consistent with post-lexical prosodic analysis. This suggests that the processing of glottalization for American English listeners is different from its processing in Maltese. One account for this asymmetry has to do with the function of the glottalization cues in this study as compared to Mitterer et al. (<xref ref-type="bibr" rid="B59">2019</xref>). Importantly, in that study listeners&#8217; task was to determine if a word was phonemically /&#660;/-initial. In that sense glottalization was a contrastive cue, the perception of which was modulated by phrasing due to its additional phrase-initial boundary marking function. 
The hypothesis then is that even though vowel-initial glottalization in Maltese may make the following material more phonetically prominent, when the lexical decision depends critically on prosodic phrasing (not prominence), this leads to a relative delay in processing. Carefully controlled cross-linguistic experiments may be useful as a further test of language-general versus language-specific effects going forward, particularly across languages (and within a language) in which glottalization can have different functions.</p>
<p>In sum, relating the present results to other phonetic strengthening patterns and other languages will help build our understanding of the detailed interplay between segmental and prosodic processing in speech comprehension, and the development of models of this process.</p>
</sec>
</sec>
</body>
<back>
<app-group>
<app>
<title>Appendix</title>
<table-wrap>
<label>Table 1</label>
<caption>
<p>Model summaries for categorization results.</p>
</caption>
<table>
<thead>
<tr>
<td align="left" valign="top"><bold>Experiment 1</bold></td>
<td align="left" valign="top"><bold>Estimate</bold></td>
<td align="left" valign="top"><bold>Est. Error</bold></td>
<td align="left" valign="top"><bold>L-95% CI</bold></td>
<td align="left" valign="top"><bold>U-95% CI</bold></td>
<td align="left" valign="top"><bold>pd</bold></td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">intercept</td>
<td align="left" valign="top">1.19</td>
<td align="left" valign="top">0.16</td>
<td align="left" valign="top">0.88</td>
<td align="left" valign="top">1.50</td>
<td align="left" valign="top">100</td>
</tr>
<tr>
<td align="left" valign="top">glottal stop</td>
<td align="left" valign="top">1.74</td>
<td align="left" valign="top">0.22</td>
<td align="left" valign="top">1.30</td>
<td align="left" valign="top">2.17</td>
<td align="left" valign="top">100</td>
</tr>
<tr>
<td align="left" valign="top">continuum</td>
<td align="left" valign="top">&#8211;3.42</td>
<td align="left" valign="top">0.17</td>
<td align="left" valign="top">&#8211;3.76</td>
<td align="left" valign="top">&#8211;3.10</td>
<td align="left" valign="top">100</td>
</tr>
<tr>
<td align="left" valign="top">repetition</td>
<td align="left" valign="top">&#8211;0.07</td>
<td align="left" valign="top">0.12</td>
<td align="left" valign="top">&#8211;0.30</td>
<td align="left" valign="top">0.17</td>
<td align="left" valign="top">72</td>
</tr>
<tr>
<td align="left" valign="top">glottal stop:continuum</td>
<td align="left" valign="top">&#8211;0.77</td>
<td align="left" valign="top">0.19</td>
<td align="left" valign="top">&#8211;1.15</td>
<td align="left" valign="top">&#8211;0.41</td>
<td align="left" valign="top">100</td>
</tr>
<tr>
<td align="left" valign="top"><bold>Experiment 2</bold></td>
<td align="left" valign="top"><bold>Estimate</bold></td>
<td align="left" valign="top"><bold>Est. Error</bold></td>
<td align="left" valign="top"><bold>L-95% CI</bold></td>
<td align="left" valign="top"><bold>U-95% CI</bold></td>
<td align="left" valign="top"><bold>pd</bold></td>
</tr>
<tr>
<td align="left" valign="top">intercept</td>
<td align="left" valign="top">0.77</td>
<td align="left" valign="top">0.15</td>
<td align="left" valign="top">0.51</td>
<td align="left" valign="top">1.02</td>
<td align="left" valign="top">100</td>
</tr>
<tr>
<td align="left" valign="top">glottalization (scaled)</td>
<td align="left" valign="top">0.40</td>
<td align="left" valign="top">0.05</td>
<td align="left" valign="top">0.30</td>
<td align="left" valign="top">0.50</td>
<td align="left" valign="top">100</td>
</tr>
<tr>
<td align="left" valign="top">continuum</td>
<td align="left" valign="top">&#8211;3.06</td>
<td align="left" valign="top">0.17</td>
<td align="left" valign="top">&#8211;3.41</td>
<td align="left" valign="top">&#8211;2.73</td>
<td align="left" valign="top">100</td>
</tr>
<tr>
<td align="left" valign="top">repetition</td>
<td align="left" valign="top">0.10</td>
<td align="left" valign="top">0.10</td>
<td align="left" valign="top">&#8211;0.10</td>
<td align="left" valign="top">0.31</td>
<td align="left" valign="top">83</td>
</tr>
<tr>
<td align="left" valign="top">glottalization:continuum</td>
<td align="left" valign="top">&#8211;0.26</td>
<td align="left" valign="top">0.08</td>
<td align="left" valign="top">&#8211;0.42</td>
<td align="left" valign="top">&#8211;0.12</td>
<td align="left" valign="top">100</td>
</tr>
<tr>
<td align="left" valign="top"><bold>Experiment 3</bold></td>
<td align="left" valign="top"><bold>Estimate</bold></td>
<td align="left" valign="top"><bold>Est. Error</bold></td>
<td align="left" valign="top"><bold>L-95% CI</bold></td>
<td align="left" valign="top"><bold>U-95% CI</bold></td>
<td align="left" valign="top"><bold>pd</bold></td>
</tr>
<tr>
<td align="left" valign="top">intercept</td>
<td align="left" valign="top">0.97</td>
<td align="left" valign="top">0.15</td>
<td align="left" valign="top">0.67</td>
<td align="left" valign="top">1.27</td>
<td align="left" valign="top">100</td>
</tr>
<tr>
<td align="left" valign="top">glottal stop</td>
<td align="left" valign="top">2.51</td>
<td align="left" valign="top">0.26</td>
<td align="left" valign="top">2.00</td>
<td align="left" valign="top">3.04</td>
<td align="left" valign="top">100</td>
</tr>
<tr>
<td align="left" valign="top">continuum</td>
<td align="left" valign="top">&#8211;2.64</td>
<td align="left" valign="top">0.19</td>
<td align="left" valign="top">&#8211;3.01</td>
<td align="left" valign="top">&#8211;2.28</td>
<td align="left" valign="top">100</td>
</tr>
<tr>
<td align="left" valign="top">repetition</td>
<td align="left" valign="top">0.08</td>
<td align="left" valign="top">0.07</td>
<td align="left" valign="top">&#8211;0.06</td>
<td align="left" valign="top">0.21</td>
<td align="left" valign="top">88</td>
</tr>
<tr>
<td align="left" valign="top">glottal stop:continuum</td>
<td align="left" valign="top">&#8211;0.43</td>
<td align="left" valign="top">0.17</td>
<td align="left" valign="top">&#8211;0.78</td>
<td align="left" valign="top">&#8211;0.11</td>
<td align="left" valign="top">99</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap>
<label>Table 2</label>
<caption>
<p>Model summary for the GAMM used in Experiment 3, with parametric terms shown above and smooth terms shown below.</p>
</caption>
<table>
<thead>
<tr>
<td align="left" valign="top"><bold>Parametric terms</bold></td>
<td align="left" valign="top"><bold>Estimate</bold></td>
<td align="left" valign="top"><bold>Est. Error</bold></td>
<td align="left" valign="top"><bold>t-value</bold></td>
<td align="left" valign="top"><bold>p-value</bold></td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">intercept</td>
<td align="left" valign="top">0.88</td>
<td align="left" valign="top">0.08</td>
<td align="left" valign="top">11.28</td>
<td align="left" valign="top">&lt;0.001</td>
</tr>
<tr>
<td align="left" valign="top">continuum</td>
<td align="left" valign="top">&#8211;1.71</td>
<td align="left" valign="top">0.17</td>
<td align="left" valign="top">&#8211;9.57</td>
<td align="left" valign="top">&lt;0.001</td>
</tr>
<tr>
<td align="left" valign="top">glottal stop</td>
<td align="left" valign="top">&#8211;1.26</td>
<td align="left" valign="top">0.12</td>
<td align="left" valign="top">&#8211;10.46</td>
<td align="left" valign="top">&lt;0.001</td>
</tr>
<tr>
<td align="left" valign="top">repetition</td>
<td align="left" valign="top">&#8211;0.02</td>
<td align="left" valign="top">0.05</td>
<td align="left" valign="top">&#8211;0.36</td>
<td align="left" valign="top">0.72</td>
</tr>
<tr>
<td align="left" valign="top">glottal stop:continuum</td>
<td align="left" valign="top">&#8211;0.44</td>
<td align="left" valign="top">0.20</td>
<td align="left" valign="top">&#8211;2.16</td>
<td align="left" valign="top">0.03</td>
</tr>
<tr>
<td align="left" valign="top"><bold>Smooth terms</bold></td>
<td align="left" valign="top"><bold>edf</bold></td>
<td align="left" valign="top"><bold>ref df</bold></td>
<td align="left" valign="top"><bold>F-value</bold></td>
<td align="left" valign="top"><bold>p-value</bold></td>
</tr>
<tr>
<td align="left" valign="top">te(time, continuum; condition = glottal stop)</td>
<td align="left" valign="top">20.82</td>
<td align="left" valign="top">22.33</td>
<td align="left" valign="top">69.62</td>
<td align="left" valign="top">&lt;0.001</td>
</tr>
<tr>
<td align="left" valign="top">te(time, continuum; condition = no glottal stop)</td>
<td align="left" valign="top">17.65</td>
<td align="left" valign="top">19.89</td>
<td align="left" valign="top">67.87</td>
<td align="left" valign="top">&lt;0.001</td>
</tr>
<tr>
<td align="left" valign="top">te(time, repetition)</td>
<td align="left" valign="top">5.86</td>
<td align="left" valign="top">7.72</td>
<td align="left" valign="top">1.47</td>
<td align="left" valign="top">0.13</td>
</tr>
<tr>
<td align="left" valign="top">s(time, participant)</td>
<td align="left" valign="top">251.12</td>
<td align="left" valign="top">359.00</td>
<td align="left" valign="top">2.83</td>
<td align="left" valign="top">&lt;0.001</td>
</tr>
<tr>
<td align="left" valign="top">s(time, participant; condition)</td>
<td align="left" valign="top">214.22</td>
<td align="left" valign="top">359.00</td>
<td align="left" valign="top">1.72</td>
<td align="left" valign="top">&lt;0.001</td>
</tr>
</tbody>
</table>
</table-wrap>
</app>
</app-group>
<fn-group>
<fn id="n1"><p>As Ladd and Arvaniti (<xref ref-type="bibr" rid="B47">2023</xref>) discuss, a purely general definition of prominence can be disadvantageous in that it does not facilitate discussion of variation across languages in how prominence is produced and perceived (e.g., <xref ref-type="bibr" rid="B75">Riesberg, Kalbertodt, Baumann, &amp; Himmelmann, 2020</xref>).</p></fn>
<fn id="n2"><p>To keep the stimulus design simpler, only one file was used as the base file (the &#8220;ebb&#8221; model). Though this may have engendered a slight bias towards /&#603;/ responses (seen in Experiment 1), it should be noted that this caveat does not impact the interpretation of the glottalization effect, which is totally contextual in the sense that the glottalization manipulation did not alter the acoustics of the F1/F2 continuum.</p></fn>
<fn id="n3"><p>Spectral contrast refers to the perception of frequency regions in the spectrum (here, formants) relative to contextual spectral information (<xref ref-type="bibr" rid="B34">Holt, Lotto, &amp; Kluender, 2000</xref>; <xref ref-type="bibr" rid="B86">Stilp, 2020</xref>). The impact of a preceding vowel&#8217;s formants on the perception of a following vowel should be considered in this light (here, the formants in the vowel in the word &#8220;the&#8221; impacting perception of the continuum). Contrast effects diminish in strength as the temporal distance between context and target increases (<xref ref-type="bibr" rid="B33">Holt, 2005</xref>; <xref ref-type="bibr" rid="B85">Stilp, 2018</xref>). Contrast effects here will thus be strongest in the no glottal stop condition, where no glottal stop temporally separates the preceding vowel and the target continuum. In the present stimuli, the precursor vowel generally has higher F1 and lower F2 than the formant values on the continuum. Thus, F1 in the continuum will be perceived as relatively low and F2 in the continuum will be perceived as relatively high (more like /&#603;/) as a function of spectral contrast with the precursor. This predicts that the target is more likely to be perceived as /&#603;/ in the no glottal stop condition, where contrast effects should be strongest. This is the opposite of the prediction based on glottalization as a prominence cue, where the target is more likely to be perceived as /&#603;/ in the glottal stop condition, described in Section 2.1. In this sense, contrast effects are not a confound: they predict the opposite of the prominence prediction.</p></fn>
<fn id="n4"><p>No previous work that describes the relationship between glottal stop duration and following vowel duration in American English is known to the author.</p></fn>
<fn id="n5"><p>The 0 mean of the prior for the intercept encodes an expectation of equal odds of &#8220;ebb&#8221; versus &#8220;ab&#8221; responses at the center of the continuum, as the continuum variable is centered and scaled. The 0 mean of the prior for the fixed effects encodes a prior expectation of a change of 0 in log odds as a function of each fixed effect (i.e., no prior expectation of an effect). The standard deviation of 1.5 (in log-odds) encodes a wide window of uncertainty around these values, which is essentially flat in log-odds space (<xref ref-type="bibr" rid="B53">McElreath, 2020</xref>). This represents high uncertainty about what the effects will be in both magnitude and directionality. Such priors thus provide some information to the model but are only very weakly informative, allowing for the data to &#8220;speak for itself.&#8221; This is appropriate for hypothesis testing of the sort carried out here, where there is not any prior expectation about the data; see, e.g., <xref ref-type="bibr" rid="B53">McElreath, 2020</xref>, for discussion of priors in logistic regression.</p></fn>
<fn id="n6"><p>The model was fit to draw 4,000 samples from the posterior in each of four Markov chains. To ensure sufficient independence from the starting value in each chain, each was run with a burn-in period of 1,000 iterations, discarding the first 1,000 samples and retaining the latter 75% of the samples for inference. <italic>&#82;&#770;</italic>, a metric which compares between-chain to within-chain estimates (which should agree with one another), was inspected for each estimate to confirm adequate mixing of the chains. Bulk and Tail ESS (effective sample size), which indicate the efficiency of sampling in the bulk and tails of the posterior, were additionally inspected to confirm adequate sampling.</p></fn>
<fn id="n7"><p>Two alternative parameterizations of the Experiment 2 model are included in the open-access repository for the paper but not reported here. In one, the glottal stop continuum was treated as an ordinal predictor (monotonic effect), which showed the same credible impact on categorization responses. In the other, the glottal stop continuum was treated as a categorical variable with four unordered levels. In this second model, pairwise comparisons between all levels, computed with <italic>emmeans</italic> (<xref ref-type="bibr" rid="B48">Lenth, Singmann, Love, Buerkner, &amp; Herve, 2018</xref>), were reliably different (all having pd &gt; 98). The alternative modeling approaches thus all lead to the same conclusion: the effect is robust.</p></fn>
<fn id="n8"><p>Another consequence of the interaction is that there is a larger effect of glottalization at the numerically lower end of the continuum in both experiments, as the (most) glottalized condition essentially remains more anchored at this lower end, allowing for a larger difference from the no glottal stop condition.</p></fn>
<fn id="n9"><p>Binocular recording is not available for this arm-mounted setup.</p></fn>
<fn id="n10"><p>The transformation is the following, where <italic>n</italic> is the total number of samples in a given time bin and <italic>y</italic> is the number of samples for a given interest area:</p>
<p><disp-formula>
<alternatives>
<mml:math id="eq001-mml">
<mml:mrow><mml:mi>Empirical</mml:mi><mml:mo>&#x00A0;</mml:mo><mml:mi>logit</mml:mi><mml:mo>=</mml:mo><mml:mtext mathvariant="italic">log</mml:mtext><mml:mfenced><mml:mrow><mml:mstyle scriptlevel='+1'><mml:mfrac><mml:mrow><mml:mi>y</mml:mi><mml:mo>+</mml:mo><mml:mn>0.5</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>&#x2212;</mml:mo><mml:mi>y</mml:mi><mml:mo>+</mml:mo><mml:mn>0.5</mml:mn></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:mfenced></mml:mrow></mml:math>
<tex-math id="M1">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
{\rm{Empirical\ logit}} = log \left( {{\textstyle{{y\ +\ 0.5} \over {n\ -\ y\ +\ 0.5}}}} \right)
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-14-8753-e1.gif"/>
</alternatives>
</disp-formula></p></fn>
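<fn><p>As a minimal sketch (not the analysis code used for the paper), the empirical logit transformation in the preceding note can be written in Python as below. The function name <monospace>empirical_logit</monospace> and the example bin size of 20 samples are illustrative assumptions; <italic>y</italic> and <italic>n</italic> are as defined in that note.</p>
<p><preformat>
```python
import math


def empirical_logit(y, n):
    """Return log((y + 0.5) / (n - y + 0.5)).

    y: number of samples in a given interest area within a time bin
    n: total number of samples in that time bin
    (function and argument names are illustrative, not from the paper's code)
    """
    return math.log((y + 0.5) / (n - y + 0.5))


# Hypothetical time bin of 20 samples: no looks, half of the looks, all looks.
for y in (0, 10, 20):
    print(y, empirical_logit(y, 20))
```
</preformat></p>
<p>The 0.5 added to the numerator and denominator keeps the value finite when a bin contains no samples, or only samples, for an interest area, where a plain logit would be undefined.</p></fn>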
<fn id="n11"><p>We can also note that slightly more of the surface overall is shaded when there is no glottal stop (33%), as compared to when there is a glottal stop (27%), with a lack of preference for either target persisting slightly longer in the no glottal stop condition (particularly at more /&#603;/-like steps). This is consistent with the idea that a preceding glottal stop facilitates recognition of the target vowel, allowing listeners to develop a fixation preference sooner overall, as compared to when no glottal stop precedes the target.</p></fn>
</fn-group>
<ack>
<title>Acknowledgements</title>
<p>Many thanks are due to Adam Royer for recording stimuli for the experiments, to Danielle Frederickson, Qingxia Guo and Bryan Gonzalez for help with data collection, and to Sun-Ah Jun, Pat Keating, Megha Sundara and Taehong Cho for valuable feedback and discussion.</p>
</ack>
<sec>
<title>Competing Interests</title>
<p>The author has no competing interests to declare.</p>
</sec>
<ref-list>
<ref id="B1"><label>1</label><mixed-citation publication-type="book"><string-name><surname>Baayen</surname>, <given-names>R. H.</given-names></string-name>, <string-name><surname>van Rij</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>de Cat</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name><surname>Wood</surname>, <given-names>S.</given-names></string-name> (<year>2018</year>). <chapter-title>Autocorrelated errors in experimental data in the language sciences: Some solutions offered by generalized additive mixed models</chapter-title>. In <source>Mixed-Effects Regression Models in Linguistics</source> (pp. <fpage>49</fpage>&#8211;<lpage>69</lpage>). <publisher-name>Springer</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1007/978-3-319-69830-4_4</pub-id></mixed-citation></ref>
<ref id="B2"><label>2</label><mixed-citation publication-type="journal"><string-name><surname>Barr</surname>, <given-names>D. J.</given-names></string-name> (<year>2008</year>). <article-title>Analyzing &#8216;visual world&#8217; eyetracking data using multilevel logistic regression</article-title>. <source>Journal of Memory and Language</source>, <volume>59</volume>(<issue>4</issue>), <fpage>457</fpage>&#8211;<lpage>474</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.jml.2007.09.002</pub-id></mixed-citation></ref>
<ref id="B3"><label>3</label><mixed-citation publication-type="journal"><string-name><surname>Baumann</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Cangemi</surname>, <given-names>F.</given-names></string-name> (<year>2020</year>). <article-title>Integrating phonetics and phonology in the study of linguistic prominence</article-title>. <source>Journal of Phonetics</source>, <volume>81</volume>, <elocation-id>100993</elocation-id>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2020.100993</pub-id></mixed-citation></ref>
<ref id="B4"><label>4</label><mixed-citation publication-type="journal"><string-name><surname>Baumann</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Winter</surname>, <given-names>B.</given-names></string-name> (<year>2018</year>). <article-title>What makes a word prominent? Predicting untrained German listeners&#8217; perceptual judgments</article-title>. <source>Journal of Phonetics</source>, <volume>70</volume>, <fpage>20</fpage>&#8211;<lpage>38</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2018.05.004</pub-id></mixed-citation></ref>
<ref id="B5"><label>5</label><mixed-citation publication-type="book"><string-name><surname>Beckman</surname>, <given-names>M. E.</given-names></string-name>, <string-name><surname>Edwards</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Fletcher</surname>, <given-names>J.</given-names></string-name> (<year>1992</year>). <chapter-title>Prosodic structure and tempo in a sonority model of articulatory dynamics</chapter-title>. In <string-name><given-names>G. J.</given-names> <surname>Docherty</surname></string-name> &amp; <string-name><given-names>D. R.</given-names> <surname>Ladd</surname></string-name> (Eds.), <source>Gesture, Segment, Prosody</source> (pp. <fpage>68</fpage>&#8211;<lpage>89</lpage>). <publisher-name>Cambridge University Press</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1017/CBO9780511519918.004</pub-id></mixed-citation></ref>
<ref id="B6"><label>6</label><mixed-citation publication-type="webpage"><string-name><surname>Boersma</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Weenink</surname>, <given-names>D.</given-names></string-name> (<year>2020</year>). <source>Praat: doing phonetics by computer (version 6.1.09)</source>. Retrieved from <uri>http://www.praat.org</uri></mixed-citation></ref>
<ref id="B7"><label>7</label><mixed-citation publication-type="journal"><string-name><surname>Bosker</surname>, <given-names>H. R.</given-names></string-name>, <string-name><surname>Reinisch</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Sjerps</surname>, <given-names>M. J.</given-names></string-name> (<year>2017</year>). <article-title>Cognitive load makes speech sound fast, but does not modulate acoustic context effects</article-title>. <source>Journal of Memory and Language</source>, <volume>94</volume>, <fpage>166</fpage>&#8211;<lpage>176</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.jml.2016.12.002</pub-id></mixed-citation></ref>
<ref id="B8"><label>8</label><mixed-citation publication-type="journal"><string-name><surname>Brand</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Ernestus</surname>, <given-names>M.</given-names></string-name> (<year>2018</year>). <article-title>Listeners&#8217; processing of a given reduced word pronunciation variant directly reflects their exposure to this variant: Evidence from native listeners and learners of French</article-title>. <source>Quarterly Journal of Experimental Psychology</source>, <volume>71</volume>(<issue>5</issue>), <fpage>1240</fpage>&#8211;<lpage>1259</lpage>. DOI: <pub-id pub-id-type="doi">10.1080/17470218.2017.1313282</pub-id></mixed-citation></ref>
<ref id="B9"><label>9</label><mixed-citation publication-type="journal"><string-name><surname>Brunner</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Zygis</surname>, <given-names>M.</given-names></string-name> (<year>2011</year>). <article-title>Why do glottal stops and low vowels like each other?</article-title> In <source>Proceedings of the 17th International Congress of Phonetic Sciences</source> (pp. <fpage>376</fpage>&#8211;<lpage>379</lpage>).</mixed-citation></ref>
<ref id="B10"><label>10</label><mixed-citation publication-type="journal"><string-name><surname>B&#252;rkner</surname>, <given-names>P.-C.</given-names></string-name> (<year>2017</year>). <article-title>brms: An R package for Bayesian multilevel models using Stan</article-title>. <source>Journal of Statistical Software</source>, <volume>80</volume>(<issue>1</issue>), <fpage>1</fpage>&#8211;<lpage>28</lpage>. DOI: <pub-id pub-id-type="doi">10.18637/jss.v080.i01</pub-id></mixed-citation></ref>
<ref id="B11"><label>11</label><mixed-citation publication-type="journal"><string-name><surname>Byrd</surname>, <given-names>D.</given-names></string-name> (<year>1993</year>). <article-title>54,000 American stops</article-title>. <source>UCLA Working Papers in Phonetics</source>, <volume>83</volume>, <fpage>97</fpage>&#8211;<lpage>116</lpage>.</mixed-citation></ref>
<ref id="B12"><label>12</label><mixed-citation publication-type="journal"><string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name> (<year>2005</year>). <article-title>Prosodic strengthening and featural enhancement: Evidence from acoustic and articulatory realizations of /&#593;, i/ in English</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>117</volume>(<issue>6</issue>), <fpage>3867</fpage>&#8211;<lpage>3878</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.1861893</pub-id></mixed-citation></ref>
<ref id="B13"><label>13</label><mixed-citation publication-type="journal"><string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name>, &amp; <string-name><surname>McQueen</surname>, <given-names>J. M.</given-names></string-name> (<year>2005</year>). <article-title>Prosodic influences on consonant production in Dutch: Effects of prosodic boundaries, phrasal accent and lexical stress</article-title>. <source>Journal of Phonetics</source>, <volume>33</volume>(<issue>2</issue>), <fpage>121</fpage>&#8211;<lpage>157</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2005.01.001</pub-id></mixed-citation></ref>
<ref id="B14"><label>14</label><mixed-citation publication-type="journal"><string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>McQueen</surname>, <given-names>J. M.</given-names></string-name>, &amp; <string-name><surname>Cox</surname>, <given-names>E. A.</given-names></string-name> (<year>2007</year>). <article-title>Prosodically driven phonetic detail in speech processing: The case of domain-initial strengthening in English</article-title>. <source>Journal of Phonetics</source>, <volume>35</volume>(<issue>2</issue>), <fpage>210</fpage>&#8211;<lpage>243</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2006.03.003</pub-id></mixed-citation></ref>
<ref id="B15"><label>15</label><mixed-citation publication-type="journal"><string-name><surname>Chong</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name> (<year>2018</year>). <article-title>Online perception of glottalized coda stops in American English</article-title>. <source>Laboratory Phonology: Journal of the Association for Laboratory Phonology</source>. DOI: <pub-id pub-id-type="doi">10.5334/labphon.70</pub-id></mixed-citation></ref>
<ref id="B16"><label>16</label><mixed-citation publication-type="journal"><string-name><surname>Christophe</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Peperkamp</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Pallier</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Block</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Mehler</surname>, <given-names>J.</given-names></string-name> (<year>2004</year>). <article-title>Phonological phrase boundaries constrain lexical access I. Adult data</article-title>. <source>Journal of Memory and Language</source>, <volume>51</volume>(<issue>4</issue>), <fpage>523</fpage>&#8211;<lpage>547</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.jml.2004.07.001</pub-id></mixed-citation></ref>
<ref id="B17"><label>17</label><mixed-citation publication-type="journal"><string-name><surname>Cole</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Kim</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Choi</surname>, <given-names>H.</given-names></string-name>, &amp; <string-name><surname>Hasegawa-Johnson</surname>, <given-names>M.</given-names></string-name> (<year>2007</year>). <article-title>Prosodic effects on acoustic cues to stop voicing and place of articulation: Evidence from Radio News speech</article-title>. <source>Journal of Phonetics</source>, <volume>35</volume>(<issue>2</issue>), <fpage>180</fpage>&#8211;<lpage>209</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2006.03.004</pub-id></mixed-citation></ref>
<ref id="B18"><label>18</label><mixed-citation publication-type="journal"><string-name><surname>Cole</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Mo</surname>, <given-names>Y.</given-names></string-name>, &amp; <string-name><surname>Hasegawa-Johnson</surname>, <given-names>M.</given-names></string-name> (<year>2010</year>). <article-title>Signal-based and expectation-based factors in the perception of prosodic prominence</article-title>. <source>Laboratory Phonology: Journal of the Association for Laboratory Phonology</source>, <volume>1</volume>(<issue>2</issue>), <fpage>425</fpage>&#8211;<lpage>452</lpage>. DOI: <pub-id pub-id-type="doi">10.1515/labphon.2010.022</pub-id></mixed-citation></ref>
<ref id="B19"><label>19</label><mixed-citation publication-type="journal"><string-name><surname>Dahan</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Tanenhaus</surname>, <given-names>M. K.</given-names></string-name>, &amp; <string-name><surname>Chambers</surname>, <given-names>C. G.</given-names></string-name> (<year>2002</year>). <article-title>Accent and reference resolution in spoken-language comprehension</article-title>. <source>Journal of Memory and Language</source>, <volume>47</volume>(<issue>2</issue>), <fpage>292</fpage>&#8211;<lpage>314</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/S0749-596X(02)00001-3</pub-id></mixed-citation></ref>
<ref id="B20"><label>20</label><mixed-citation publication-type="journal"><string-name><surname>de Jong</surname>, <given-names>K.</given-names></string-name> (<year>1995</year>). <article-title>The supraglottal articulation of prominence in English: Linguistic stress as localized hyperarticulation</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>97</volume>(<issue>1</issue>), <fpage>491</fpage>&#8211;<lpage>504</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.412275</pub-id></mixed-citation></ref>
<ref id="B21"><label>21</label><mixed-citation publication-type="journal"><string-name><surname>Delgutte</surname>, <given-names>B.</given-names></string-name> (<year>1980</year>). <article-title>Representation of speech-like sounds in the discharge patterns of auditory-nerve fibers</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>68</volume>(<issue>3</issue>), <fpage>843</fpage>&#8211;<lpage>857</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.384824</pub-id></mixed-citation></ref>
<ref id="B22"><label>22</label><mixed-citation publication-type="journal"><string-name><surname>Delgutte</surname>, <given-names>B.</given-names></string-name>, &amp; <string-name><surname>Kiang</surname>, <given-names>N. Y.</given-names></string-name> (<year>1984</year>). <article-title>Speech coding in the auditory nerve: I. Vowel-like sounds</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>75</volume>(<issue>3</issue>), <fpage>866</fpage>&#8211;<lpage>878</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.390596</pub-id></mixed-citation></ref>
<ref id="B23"><label>23</label><mixed-citation publication-type="journal"><string-name><surname>Dilley</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Shattuck-Hufnagel</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Ostendorf</surname>, <given-names>M.</given-names></string-name> (<year>1996</year>). <article-title>Glottalization of word-initial vowels as a function of prosodic structure</article-title>. <source>Journal of Phonetics</source>, <volume>24</volume>(<issue>4</issue>), <fpage>423</fpage>&#8211;<lpage>444</lpage>. DOI: <pub-id pub-id-type="doi">10.1006/jpho.1996.0023</pub-id></mixed-citation></ref>
<ref id="B24"><label>24</label><mixed-citation publication-type="journal"><string-name><surname>Erickson</surname>, <given-names>D.</given-names></string-name> (<year>2002</year>). <article-title>Articulation of extreme formant patterns for emphasized vowels</article-title>. <source>Phonetica</source>, <volume>59</volume>(<issue>2&#8211;3</issue>), <fpage>134</fpage>&#8211;<lpage>149</lpage>. DOI: <pub-id pub-id-type="doi">10.1159/000066067</pub-id></mixed-citation></ref>
<ref id="B25"><label>25</label><mixed-citation publication-type="book"><string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name> (<year>2013</year>). <source>Production and perception of glottal stops</source> (Unpublished doctoral dissertation). <publisher-name>University of California</publisher-name>, <publisher-loc>Los Angeles</publisher-loc>.</mixed-citation></ref>
<ref id="B26"><label>26</label><mixed-citation publication-type="journal"><string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name> (<year>2014</year>). <article-title>Voice quality strengthening and glottalization</article-title>. <source>Journal of Phonetics</source>, <volume>45</volume>, <fpage>106</fpage>&#8211;<lpage>113</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2014.04.001</pub-id></mixed-citation></ref>
<ref id="B27"><label>27</label><mixed-citation publication-type="journal"><string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Keating</surname>, <given-names>P.</given-names></string-name> (<year>2011</year>). <article-title>The acoustic consequences of phonation and tone interactions in Jalapa Mazatec</article-title>. <source>Journal of the International Phonetic Association</source>, <fpage>185</fpage>&#8211;<lpage>205</lpage>. DOI: <pub-id pub-id-type="doi">10.1017/S0025100311000193</pub-id></mixed-citation></ref>
<ref id="B28"><label>28</label><mixed-citation publication-type="journal"><string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>White</surname>, <given-names>J.</given-names></string-name> (<year>2015</year>). <article-title>Phonetics of Tongan stress</article-title>. <source>Journal of the International Phonetic Association</source>, <volume>45</volume>(<issue>1</issue>), <fpage>13</fpage>&#8211;<lpage>34</lpage>. DOI: <pub-id pub-id-type="doi">10.1017/S0025100314000206</pub-id></mixed-citation></ref>
<ref id="B29"><label>29</label><mixed-citation publication-type="journal"><string-name><surname>Gerfen</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name><surname>Baker</surname>, <given-names>K.</given-names></string-name> (<year>2005</year>). <article-title>The production and perception of laryngealized vowels in Coatzospan Mixtec</article-title>. <source>Journal of Phonetics</source>, <volume>33</volume>(<issue>3</issue>), <fpage>311</fpage>&#8211;<lpage>334</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2004.11.002</pub-id></mixed-citation></ref>
<ref id="B30"><label>30</label><mixed-citation publication-type="journal"><string-name><surname>Gordon</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Ladefoged</surname>, <given-names>P.</given-names></string-name> (<year>2001</year>). <article-title>Phonation types: a cross-linguistic overview</article-title>. <source>Journal of Phonetics</source>, <volume>29</volume>(<issue>4</issue>), <fpage>383</fpage>&#8211;<lpage>406</lpage>. DOI: <pub-id pub-id-type="doi">10.1006/jpho.2001.0147</pub-id></mixed-citation></ref>
<ref id="B31"><label>31</label><mixed-citation publication-type="book"><string-name><surname>Groves</surname>, <given-names>T. R.</given-names></string-name>, <string-name><surname>Groves</surname>, <given-names>G. W.</given-names></string-name>, <string-name><surname>Jacobs</surname>, <given-names>R.</given-names></string-name>, et al. (<year>1985</year>). <source>Kiribatese: an outline description</source>. <publisher-name>Dept. of Linguistics, Research School of Pacific Studies, The Australian National University</publisher-name>.</mixed-citation></ref>
<ref id="B32"><label>32</label><mixed-citation publication-type="journal"><string-name><surname>Henton</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Ladefoged</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Maddieson</surname>, <given-names>I.</given-names></string-name> (<year>1992</year>). <article-title>Stops in the world&#8217;s languages</article-title>. <source>Phonetica</source>, <volume>49</volume>(<issue>2</issue>), <fpage>65</fpage>&#8211;<lpage>101</lpage>. DOI: <pub-id pub-id-type="doi">10.1159/000261905</pub-id></mixed-citation></ref>
<ref id="B33"><label>33</label><mixed-citation publication-type="journal"><string-name><surname>Holt</surname>, <given-names>L. L.</given-names></string-name> (<year>2005</year>). <article-title>Temporally nonadjacent nonlinguistic sounds affect speech categorization</article-title>. <source>Psychological Science</source>, <volume>16</volume>(<issue>4</issue>), <fpage>305</fpage>&#8211;<lpage>312</lpage>. DOI: <pub-id pub-id-type="doi">10.1111/j.0956-7976.2005.01532.x</pub-id></mixed-citation></ref>
<ref id="B34"><label>34</label><mixed-citation publication-type="journal"><string-name><surname>Holt</surname>, <given-names>L. L.</given-names></string-name>, <string-name><surname>Lotto</surname>, <given-names>A. J.</given-names></string-name>, &amp; <string-name><surname>Kluender</surname>, <given-names>K. R.</given-names></string-name> (<year>2000</year>). <article-title>Neighboring spectral content influences vowel identification</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>108</volume>(<issue>2</issue>), <fpage>710</fpage>&#8211;<lpage>722</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.429604</pub-id></mixed-citation></ref>
<ref id="B35"><label>35</label><mixed-citation publication-type="journal"><string-name><surname>Huffman</surname>, <given-names>M. K.</given-names></string-name> (<year>2005</year>). <article-title>Segmental and prosodic effects on coda glottalization</article-title>. <source>Journal of Phonetics</source>, <volume>33</volume>(<issue>3</issue>), <fpage>335</fpage>&#8211;<lpage>362</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2005.02.004</pub-id></mixed-citation></ref>
<ref id="B36"><label>36</label><mixed-citation publication-type="journal"><string-name><surname>Ito</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name><surname>Speer</surname>, <given-names>S. R.</given-names></string-name> (<year>2008</year>). <article-title>Anticipatory effects of intonation: Eye movements during instructed visual search</article-title>. <source>Journal of Memory and Language</source>, <volume>58</volume>(<issue>2</issue>), <fpage>541</fpage>&#8211;<lpage>573</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.jml.2007.06.013</pub-id></mixed-citation></ref>
<ref id="B37"><label>37</label><mixed-citation publication-type="journal"><string-name><surname>Jongenburger</surname>, <given-names>W.</given-names></string-name>, &amp; <string-name><surname>van Heuven</surname>, <given-names>V. J.</given-names></string-name> (<year>1991</year>). <article-title>The distribution of (word initial) glottal stop in Dutch</article-title>. <source>Linguistics in the Netherlands</source>, <volume>8</volume>(<issue>1</issue>), <fpage>101</fpage>&#8211;<lpage>110</lpage>. DOI: <pub-id pub-id-type="doi">10.1075/avt.8.13jon</pub-id></mixed-citation></ref>
<ref id="B38"><label>38</label><mixed-citation publication-type="book"><string-name><surname>Jun</surname>, <given-names>S.-A.</given-names></string-name> (<year>2005</year>). <source>Prosodic Typology: The Phonology of Intonation and Phrasing</source> (Vol. <volume>1</volume>). <publisher-name>Oxford University Press</publisher-name>.</mixed-citation></ref>
<ref id="B39"><label>39</label><mixed-citation publication-type="book"><string-name><surname>Jun</surname>, <given-names>S.-A.</given-names></string-name> (<year>2014</year>). <source>Prosodic Typology II: The Phonology of Intonation and Phrasing</source> (Vol. <volume>2</volume>). <publisher-name>Oxford University Press</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1093/acprof:oso/9780199567300.001.0001</pub-id></mixed-citation></ref>
<ref id="B40"><label>40</label><mixed-citation publication-type="book"><string-name><surname>Keating</surname>, <given-names>P.</given-names></string-name> (<year>2006</year>). <chapter-title>Phonetic encoding of prosodic structure</chapter-title>. In <source>Speech Production: Models, Phonetic Processes, and Techniques</source> (pp. <fpage>167</fpage>&#8211;<lpage>186</lpage>). <publisher-name>Psychology Press</publisher-name>.</mixed-citation></ref>
<ref id="B41"><label>41</label><mixed-citation publication-type="book"><string-name><surname>Keating</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Fougeron</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name><surname>Hsu</surname>, <given-names>C.-S.</given-names></string-name> (<year>2004</year>). <chapter-title>Domain-initial articulatory strengthening in four languages</chapter-title>. In <source>Phonetic interpretation: Papers in Laboratory Phonology VI</source> (pp. <fpage>143</fpage>&#8211;<lpage>161</lpage>). <publisher-name>Cambridge University Press</publisher-name>.</mixed-citation></ref>
<ref id="B42"><label>42</label><mixed-citation publication-type="journal"><string-name><surname>Kim</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name> (<year>2013</year>). <article-title>Prosodic boundary information modulates phonetic categorization</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>134</volume>(<issue>1</issue>), <fpage>EL19</fpage>&#8211;<lpage>EL25</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.4807431</pub-id></mixed-citation></ref>
<ref id="B43"><label>43</label><mixed-citation publication-type="journal"><string-name><surname>Kim</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Kim</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name> (<year>2018</year>). <article-title>Prosodic-structural modulation of stop voicing contrast along the VOT continuum in trochaic and iambic words in American English</article-title>. <source>Journal of Phonetics</source>, <volume>71</volume>, <fpage>65</fpage>&#8211;<lpage>80</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2018.07.004</pub-id></mixed-citation></ref>
<ref id="B44"><label>44</label><mixed-citation publication-type="journal"><string-name><surname>Kim</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Mitterer</surname>, <given-names>H.</given-names></string-name>, &amp; <string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name> (<year>2018</year>). <article-title>A time course of prosodic modulation in phonological inferencing: The case of Korean post-obstruent tensing</article-title>. <source>PLoS ONE</source>, <volume>13</volume>(<issue>8</issue>). DOI: <pub-id pub-id-type="doi">10.1371/journal.pone.0202912</pub-id></mixed-citation></ref>
<ref id="B45"><label>45</label><mixed-citation publication-type="journal"><string-name><surname>Kingston</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Levy</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Rysling</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Staub</surname>, <given-names>A.</given-names></string-name> (<year>2016</year>). <article-title>Eye movement evidence for an immediate Ganong effect</article-title>. <source>Journal of Experimental Psychology: Human Perception and Performance</source>, <volume>42</volume>(<issue>12</issue>), <fpage>1969</fpage>. DOI: <pub-id pub-id-type="doi">10.1037/xhp0000269</pub-id></mixed-citation></ref>
<ref id="B46"><label>46</label><mixed-citation publication-type="book"><string-name><surname>Kreiman</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Sidtis</surname>, <given-names>D.</given-names></string-name> (<year>2011</year>). <source>Foundations of voice studies: An interdisciplinary approach to voice production and perception</source>. <publisher-name>John Wiley &amp; Sons</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1002/9781444395068</pub-id></mixed-citation></ref>
<ref id="B47"><label>47</label><mixed-citation publication-type="journal"><string-name><surname>Ladd</surname>, <given-names>D. R.</given-names></string-name>, &amp; <string-name><surname>Arvaniti</surname>, <given-names>A.</given-names></string-name> (<year>2023</year>). <article-title>Prosodic prominence across languages</article-title>. <source>Annual Review of Linguistics</source>, <volume>9</volume>(<issue>1</issue>). DOI: <pub-id pub-id-type="doi">10.1146/annurev-linguistics-031120-101954</pub-id></mixed-citation></ref>
<ref id="B48"><label>48</label><mixed-citation publication-type="webpage"><string-name><surname>Lenth</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Singmann</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Love</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Buerkner</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Herve</surname>, <given-names>M.</given-names></string-name> (<year>2018</year>). <source>emmeans: Estimated Marginal Means, aka Least-Squares Means</source>. Retrieved from <uri>https://CRAN.R-project.org/package=emmeans</uri></mixed-citation></ref>
<ref id="B49"><label>49</label><mixed-citation publication-type="journal"><string-name><surname>Maddieson</surname>, <given-names>I.</given-names></string-name>, &amp; <string-name><surname>Precoda</surname>, <given-names>K.</given-names></string-name> (<year>1989</year>). <article-title>Updating UPSID</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>86</volume>(<issue>S1</issue>), <fpage>S19</fpage>&#8211;<lpage>S19</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.2027403</pub-id></mixed-citation></ref>
<ref id="B50"><label>50</label><mixed-citation publication-type="webpage"><string-name><surname>Makowski</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Ben-Shachar</surname>, <given-names>M. S.</given-names></string-name>, &amp; <string-name><surname>L&#252;decke</surname>, <given-names>D.</given-names></string-name> (<year>2019</year>). <article-title>bayestestR: Describing effects and their uncertainty, existence and significance within the Bayesian framework</article-title>. <source>Journal of Open Source Software</source>, <volume>4</volume>(<issue>40</issue>), <fpage>1541</fpage>. Retrieved from <uri>https://joss.theoj.org/papers/10.21105/joss.01541</uri>. DOI: <pub-id pub-id-type="doi">10.21105/joss.01541</pub-id></mixed-citation></ref>
<ref id="B51"><label>51</label><mixed-citation publication-type="webpage"><string-name><surname>Makowski</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Ben-Shachar</surname>, <given-names>M. S.</given-names></string-name>, <string-name><surname>Patil</surname>, <given-names>I.</given-names></string-name>, &amp; <string-name><surname>L&#252;decke</surname>, <given-names>D.</given-names></string-name> (<year>2020</year>). <article-title>Estimation of model-based predictions, contrasts and means</article-title>. <source>CRAN</source>. Retrieved from <uri>https://github.com/easystats/modelbased</uri></mixed-citation></ref>
<ref id="B52"><label>52</label><mixed-citation publication-type="journal"><string-name><surname>Matin</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Shao</surname>, <given-names>K. C.</given-names></string-name>, &amp; <string-name><surname>Boff</surname>, <given-names>K. R.</given-names></string-name> (<year>1993</year>). <article-title>Saccadic overhead: Information-processing time with and without saccades</article-title>. <source>Perception &amp; Psychophysics</source>, <volume>53</volume>(<issue>4</issue>), <fpage>372</fpage>&#8211;<lpage>380</lpage>. DOI: <pub-id pub-id-type="doi">10.3758/BF03206780</pub-id></mixed-citation></ref>
<ref id="B53"><label>53</label><mixed-citation publication-type="book"><string-name><surname>McElreath</surname>, <given-names>R.</given-names></string-name> (<year>2020</year>). <source>Statistical rethinking: A Bayesian course with examples in R and Stan</source>. <publisher-name>Chapman and Hall/CRC</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1201/9780429029608</pub-id></mixed-citation></ref>
<ref id="B54"><label>54</label><mixed-citation publication-type="journal"><string-name><surname>McKinnon</surname>, <given-names>S.</given-names></string-name> (<year>2018</year>). <article-title>A sociophonetic analysis of word-initial vowel glottalization in monolingual and bilingual Guatemalan Spanish</article-title>. In <source>Hispanic Linguistics Symposium, University of Texas</source> (pp. <fpage>25</fpage>&#8211;<lpage>27</lpage>).</mixed-citation></ref>
<ref id="B55"><label>55</label><mixed-citation publication-type="book"><string-name><surname>McQueen</surname>, <given-names>J. M.</given-names></string-name>, &amp; <string-name><surname>Dilley</surname>, <given-names>L.</given-names></string-name> (<year>2020</year>). <chapter-title>Prosody and spoken-word recognition</chapter-title>. In <source>The Oxford Handbook of Language Prosody</source> (pp. <fpage>509</fpage>&#8211;<lpage>521</lpage>). <publisher-name>Oxford University Press</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1093/oxfordhb/9780198832232.013.33</pub-id></mixed-citation></ref>
<ref id="B56"><label>56</label><mixed-citation publication-type="journal"><string-name><surname>Mendelsohn</surname>, <given-names>A. H.</given-names></string-name>, &amp; <string-name><surname>Zhang</surname>, <given-names>Z.</given-names></string-name> (<year>2011</year>). <article-title>Phonation threshold pressure and onset frequency in a two-layer physical model of the vocal folds</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>130</volume>(<issue>5</issue>), <fpage>2961</fpage>&#8211;<lpage>2968</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.3644913</pub-id></mixed-citation></ref>
<ref id="B57"><label>57</label><mixed-citation publication-type="journal"><string-name><surname>Michnowicz</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Kagan</surname>, <given-names>L.</given-names></string-name> (<year>2016</year>). <article-title>On glottal stops in Yucatan Spanish</article-title>. <source>Spanish language and sociolinguistic analysis</source>, <fpage>219</fpage>&#8211;<lpage>239</lpage>. DOI: <pub-id pub-id-type="doi">10.1075/ihll.8.09mic</pub-id></mixed-citation></ref>
<ref id="B58"><label>58</label><mixed-citation publication-type="journal"><string-name><surname>Mitterer</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name>, &amp; <string-name><surname>Kim</surname>, <given-names>S.</given-names></string-name> (<year>2016</year>). <article-title>How does prosody influence speech categorization?</article-title> <source>Journal of Phonetics</source>, <volume>54</volume>, <fpage>68</fpage>&#8211;<lpage>79</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2015.09.002</pub-id></mixed-citation></ref>
<ref id="B59"><label>59</label><mixed-citation publication-type="journal"><string-name><surname>Mitterer</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Kim</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name> (<year>2019</year>). <article-title>The glottal stop between segmental and suprasegmental processing: The case of Maltese</article-title>. <source>Journal of Memory and Language</source>, <volume>108</volume>, <elocation-id>104034</elocation-id>. DOI: <pub-id pub-id-type="doi">10.1016/j.jml.2019.104034</pub-id></mixed-citation></ref>
<ref id="B60"><label>60</label><mixed-citation publication-type="journal"><string-name><surname>Mitterer</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Kim</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name> (<year>2021a</year>). <article-title>Glottal stops do not constrain lexical access as do oral stops</article-title>. <source>PLoS ONE</source>, <volume>16</volume>(<issue>11</issue>), <elocation-id>e0259573</elocation-id>. DOI: <pub-id pub-id-type="doi">10.1371/journal.pone.0259573</pub-id></mixed-citation></ref>
<ref id="B61"><label>61</label><mixed-citation publication-type="journal"><string-name><surname>Mitterer</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Kim</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Cho</surname>, <given-names>T.</given-names></string-name> (<year>2021b</year>). <article-title>The role of segmental information in syntactic processing through the syntax&#8211;prosody interface</article-title>. <source>Language and Speech</source>, <volume>64</volume>(<issue>4</issue>), <fpage>962</fpage>&#8211;<lpage>979</lpage>. DOI: <pub-id pub-id-type="doi">10.1177/0023830920974401</pub-id></mixed-citation></ref>
<ref id="B62"><label>62</label><mixed-citation publication-type="journal"><string-name><surname>Mitterer</surname>, <given-names>H.</given-names></string-name>, &amp; <string-name><surname>Reinisch</surname>, <given-names>E.</given-names></string-name> (<year>2013</year>). <article-title>No delays in application of perceptual learning in speech recognition: Evidence from eye tracking</article-title>. <source>Journal of Memory and Language</source>, <volume>69</volume>(<issue>4</issue>), <fpage>527</fpage>&#8211;<lpage>545</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.jml.2013.07.002</pub-id></mixed-citation></ref>
<ref id="B63"><label>63</label><mixed-citation publication-type="journal"><string-name><surname>Mo</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Cole</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Hasegawa-Johnson</surname>, <given-names>M.</given-names></string-name> (<year>2009</year>). <article-title>Prosodic effects on vowel production: evidence from formant structure</article-title>. In <source>Proceedings of INTERSPEECH</source> (pp. <fpage>2535</fpage>&#8211;<lpage>2538</lpage>). DOI: <pub-id pub-id-type="doi">10.21437/Interspeech.2009-668</pub-id></mixed-citation></ref>
<ref id="B64"><label>64</label><mixed-citation publication-type="journal"><string-name><surname>Moulines</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Charpentier</surname>, <given-names>F.</given-names></string-name> (<year>1990</year>). <article-title>Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones</article-title>. <source>Speech Communication</source>, <volume>9</volume>(<issue>5&#8211;6</issue>), <fpage>453</fpage>&#8211;<lpage>467</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/0167-6393(90)90021-Z</pub-id></mixed-citation></ref>
<ref id="B65"><label>65</label><mixed-citation publication-type="journal"><string-name><surname>Nakamura</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Harris</surname>, <given-names>J. A.</given-names></string-name>, &amp; <string-name><surname>Jun</surname>, <given-names>S.-A.</given-names></string-name> (<year>2022</year>). <article-title>Integrating prosody in anticipatory language processing: how listeners adapt to unconventional prosodic cues</article-title>. <source>Language, Cognition and Neuroscience</source>, <volume>37</volume>(<issue>5</issue>), <fpage>624</fpage>&#8211;<lpage>647</lpage>. DOI: <pub-id pub-id-type="doi">10.1080/23273798.2021.2010778</pub-id></mixed-citation></ref>
<ref id="B66"><label>66</label><mixed-citation publication-type="journal"><string-name><surname>Nixon</surname>, <given-names>J. S.</given-names></string-name>, <string-name><surname>van Rij</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Mok</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Baayen</surname>, <given-names>R. H.</given-names></string-name>, &amp; <string-name><surname>Chen</surname>, <given-names>Y.</given-names></string-name> (<year>2016</year>). <article-title>The temporal dynamics of perceptual uncertainty: eye movement evidence from Cantonese segment and tone perception</article-title>. <source>Journal of Memory and Language</source>, <volume>90</volume>, <fpage>103</fpage>&#8211;<lpage>125</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.jml.2016.03.005</pub-id></mixed-citation></ref>
<ref id="B67"><label>67</label><mixed-citation publication-type="book"><string-name><surname>Pierrehumbert</surname>, <given-names>J. B.</given-names></string-name> (<year>1980</year>). <source>The phonology and phonetics of English intonation</source> (Unpublished doctoral dissertation). <publisher-name>Massachusetts Institute of Technology</publisher-name>.</mixed-citation></ref>
<ref id="B68"><label>68</label><mixed-citation publication-type="book"><string-name><surname>Pierrehumbert</surname>, <given-names>J. B.</given-names></string-name>, &amp; <string-name><surname>Frisch</surname>, <given-names>S.</given-names></string-name> (<year>1997</year>). <chapter-title>Synthesizing allophonic glottalization</chapter-title>. In <source>Progress in Speech Synthesis</source> (pp. <fpage>9</fpage>&#8211;<lpage>26</lpage>). <publisher-name>Springer</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1007/978-1-4612-1894-4_2</pub-id></mixed-citation></ref>
<ref id="B69"><label>69</label><mixed-citation publication-type="book"><string-name><surname>Pierrehumbert</surname>, <given-names>J. B.</given-names></string-name>, &amp; <string-name><surname>Talkin</surname>, <given-names>D.</given-names></string-name> (<year>1992</year>). <chapter-title>Lenition of /h/ and glottal stop</chapter-title>. In <string-name><given-names>G. J.</given-names> <surname>Docherty</surname></string-name> &amp; <string-name><given-names>D. R.</given-names> <surname>Ladd</surname></string-name> (Eds.), <source>Gesture, Segment, Prosody</source> (pp. <fpage>90</fpage>&#8211;<lpage>127</lpage>). <publisher-name>Cambridge University Press</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1017/CBO9780511519918.005</pub-id></mixed-citation></ref>
<ref id="B70"><label>70</label><mixed-citation publication-type="journal"><string-name><surname>Pitt</surname>, <given-names>M. A.</given-names></string-name> (<year>2009</year>). <article-title>How are pronunciation variants of spoken words recognized? A test of generalization to newly learned words</article-title>. <source>Journal of Memory and Language</source>, <volume>61</volume>(<issue>1</issue>), <fpage>19</fpage>&#8211;<lpage>36</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.jml.2009.02.005</pub-id></mixed-citation></ref>
<ref id="B71"><label>71</label><mixed-citation publication-type="journal"><string-name><surname>Pompino-Marschall</surname>, <given-names>B.</given-names></string-name>, &amp; <string-name><surname>&#379;ygis</surname>, <given-names>M.</given-names></string-name> (<year>2010</year>). <article-title>Glottal marking of vowel-initial words in German</article-title>. <source>ZAS Papers in Linguistics</source>, <volume>52</volume>, <fpage>1</fpage>&#8211;<lpage>17</lpage>. DOI: <pub-id pub-id-type="doi">10.21248/zaspil.52.2010.380</pub-id></mixed-citation></ref>
<ref id="B72"><label>72</label><mixed-citation publication-type="webpage"><collab>R Core Team</collab>. (<year>2021</year>). <chapter-title>R: A language and environment for statistical computing [Computer software manual]</chapter-title>. <publisher-loc>Vienna, Austria</publisher-loc>. Retrieved from <uri>https://www.R-project.org/</uri></mixed-citation></ref>
<ref id="B73"><label>73</label><mixed-citation publication-type="journal"><string-name><surname>Redi</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Shattuck-Hufnagel</surname>, <given-names>S.</given-names></string-name> (<year>2001</year>). <article-title>Variation in the realization of glottalization in normal speakers</article-title>. <source>Journal of Phonetics</source>, <volume>29</volume>(<issue>4</issue>), <fpage>407</fpage>&#8211;<lpage>429</lpage>. DOI: <pub-id pub-id-type="doi">10.1006/jpho.2001.0145</pub-id></mixed-citation></ref>
<ref id="B74"><label>74</label><mixed-citation publication-type="journal"><string-name><surname>Reinisch</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Sjerps</surname>, <given-names>M. J.</given-names></string-name> (<year>2013</year>). <article-title>The uptake of spectral and temporal cues in vowel perception is rapidly influenced by context</article-title>. <source>Journal of Phonetics</source>, <volume>41</volume>(<issue>2</issue>), <fpage>101</fpage>&#8211;<lpage>116</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2013.01.002</pub-id></mixed-citation></ref>
<ref id="B75"><label>75</label><mixed-citation publication-type="journal"><string-name><surname>Riesberg</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Kalbertodt</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Baumann</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Himmelmann</surname>, <given-names>N.</given-names></string-name> (<year>2020</year>). <article-title>Using rapid prosody transcription to probe little-known prosodic systems: The case of Papuan Malay</article-title>. <source>Laboratory Phonology: Journal of the Association for Laboratory Phonology</source>, <volume>11</volume>. DOI: <pub-id pub-id-type="doi">10.5334/labphon.192</pub-id></mixed-citation></ref>
<ref id="B76"><label>76</label><mixed-citation publication-type="webpage"><collab>RStudio Team</collab>. (<year>2021</year>). <chapter-title>RStudio: Integrated development environment for R [Computer software manual]</chapter-title>. <publisher-loc>Boston, MA</publisher-loc>. Retrieved from <uri>http://www.rstudio.com/</uri></mixed-citation></ref>
<ref id="B77"><label>77</label><mixed-citation publication-type="journal"><string-name><surname>Silverman</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name><surname>Pierrehumbert</surname>, <given-names>J.</given-names></string-name> (<year>1990</year>). <article-title>The timing of prenuclear high accents in English</article-title>. In <string-name><given-names>M. E.</given-names> <surname>Beckman</surname></string-name> &amp; <string-name><given-names>J.</given-names> <surname>Kingston</surname></string-name> (Eds.), <source>Papers in Laboratory Phonology</source> (pp. <fpage>72</fpage>&#8211;<lpage>106</lpage>). DOI: <pub-id pub-id-type="doi">10.1017/CBO9780511627736.005</pub-id></mixed-citation></ref>
<ref id="B78"><label>78</label><mixed-citation publication-type="journal"><string-name><surname>Snedeker</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Trueswell</surname>, <given-names>J.</given-names></string-name> (<year>2003</year>). <article-title>Using prosody to avoid ambiguity: Effects of speaker awareness and referential context</article-title>. <source>Journal of Memory and Language</source>, <volume>48</volume>(<issue>1</issue>), <fpage>103</fpage>&#8211;<lpage>130</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/S0749-596X(02)00519-3</pub-id></mixed-citation></ref>
<ref id="B79"><label>79</label><mixed-citation publication-type="journal"><string-name><surname>S&#243;skuthy</surname>, <given-names>M.</given-names></string-name> (<year>2017</year>). <article-title>Generalised additive mixed models for dynamic analysis in linguistics: A practical introduction</article-title>. <source>arXiv preprint arXiv:1703.05339</source>.</mixed-citation></ref>
<ref id="B80"><label>80</label><mixed-citation publication-type="journal"><string-name><surname>S&#243;skuthy</surname>, <given-names>M.</given-names></string-name> (<year>2021</year>). <article-title>Evaluating generalised additive mixed modelling strategies for dynamic speech analysis</article-title>. <source>Journal of Phonetics</source>, <volume>84</volume>, <elocation-id>101017</elocation-id>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2020.101017</pub-id></mixed-citation></ref>
<ref id="B81"><label>81</label><mixed-citation publication-type="journal"><string-name><surname>Steffman</surname>, <given-names>J.</given-names></string-name> (<year>2019</year>). <article-title>Phrase-final lengthening modulates listeners&#8217; perception of vowel duration as a cue to coda stop voicing</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>145</volume>(<issue>6</issue>), <fpage>EL560</fpage>&#8211;<lpage>EL566</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.5111772</pub-id></mixed-citation></ref>
<ref id="B82"><label>82</label><mixed-citation publication-type="book"><string-name><surname>Steffman</surname>, <given-names>J.</given-names></string-name> (<year>2020</year>). <source>Prosodic prominence in vowel perception and spoken language processing</source> (Unpublished doctoral dissertation). <publisher-name>University of California</publisher-name>, <publisher-loc>Los Angeles</publisher-loc>.</mixed-citation></ref>
<ref id="B83"><label>83</label><mixed-citation publication-type="journal"><string-name><surname>Steffman</surname>, <given-names>J.</given-names></string-name> (<year>2021a</year>). <article-title>Prosodic prominence effects in the processing of spectral cues</article-title>. <source>Language, Cognition and Neuroscience</source>, <volume>36</volume>(<issue>5</issue>), <fpage>586</fpage>&#8211;<lpage>611</lpage>. DOI: <pub-id pub-id-type="doi">10.1080/23273798.2020.1862259</pub-id></mixed-citation></ref>
<ref id="B84"><label>84</label><mixed-citation publication-type="journal"><string-name><surname>Steffman</surname>, <given-names>J.</given-names></string-name> (<year>2021b</year>). <article-title>Rhythmic and speech rate effects in the perception of durational cues</article-title>. <source>Attention, Perception, &amp; Psychophysics</source>, <volume>83</volume>(<issue>8</issue>), <fpage>3162</fpage>&#8211;<lpage>3182</lpage>. DOI: <pub-id pub-id-type="doi">10.3758/s13414-021-02334-w</pub-id></mixed-citation></ref>
<ref id="B85"><label>85</label><mixed-citation publication-type="journal"><string-name><surname>Stilp</surname>, <given-names>C.</given-names></string-name> (<year>2018</year>). <article-title>Short-term, not long-term, average spectra of preceding sentences bias consonant categorization</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>144</volume>(<issue>3</issue>), <fpage>1797</fpage>&#8211;<lpage>1797</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.5067927</pub-id></mixed-citation></ref>
<ref id="B86"><label>86</label><mixed-citation publication-type="journal"><string-name><surname>Stilp</surname>, <given-names>C.</given-names></string-name> (<year>2020</year>). <article-title>Acoustic context effects in speech perception</article-title>. <source>Wiley Interdisciplinary Reviews: Cognitive Science</source>, <volume>11</volume>(<issue>1</issue>), <elocation-id>e1517</elocation-id>. DOI: <pub-id pub-id-type="doi">10.1002/wcs.1517</pub-id></mixed-citation></ref>
<ref id="B87"><label>87</label><mixed-citation publication-type="webpage"><string-name><surname>Tehrani</surname>, <given-names>H.</given-names></string-name> (<year>2020</year>). <source>Appsobabble: Online applications platform</source>. Retrieved from <uri>https://www.appsobabble.com</uri></mixed-citation></ref>
<ref id="B88"><label>88</label><mixed-citation publication-type="journal"><string-name><surname>Thompson</surname>, <given-names>L. C.</given-names></string-name>, <string-name><surname>Thompson</surname>, <given-names>M. T.</given-names></string-name>, &amp; <string-name><surname>Efrat</surname>, <given-names>B. S.</given-names></string-name> (<year>1974</year>). <article-title>Some phonological developments in Straits Salish</article-title>. <source>International Journal of American Linguistics</source>, <volume>40</volume>(<issue>3</issue>), <fpage>182</fpage>&#8211;<lpage>196</lpage>. DOI: <pub-id pub-id-type="doi">10.1086/465311</pub-id></mixed-citation></ref>
<ref id="B89"><label>89</label><mixed-citation publication-type="journal"><string-name><surname>Traunm&#252;ller</surname>, <given-names>H.</given-names></string-name> (<year>1990</year>). <article-title>Analytical expressions for the tonotopic sensory scale</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>88</volume>(<issue>1</issue>), <fpage>97</fpage>&#8211;<lpage>100</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.399849</pub-id></mixed-citation></ref>
<ref id="B90"><label>90</label><mixed-citation publication-type="journal"><string-name><surname>Umeda</surname>, <given-names>N.</given-names></string-name> (<year>1978</year>). <article-title>Occurrence of glottal stops in fluent speech</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>64</volume>(<issue>1</issue>), <fpage>88</fpage>&#8211;<lpage>94</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.381959</pub-id></mixed-citation></ref>
<ref id="B91"><label>91</label><mixed-citation publication-type="journal"><string-name><surname>van Rij</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Wieling</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Baayen</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name><surname>van Rijn</surname>, <given-names>H.</given-names></string-name> (<year>2016</year>). <source>itsadug: Interpreting time series and autocorrelated data using GAMMs [R package]</source>.</mixed-citation></ref>
<ref id="B92"><label>92</label><mixed-citation publication-type="journal"><string-name><surname>Weber</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Grice</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Crocker</surname>, <given-names>M. W.</given-names></string-name> (<year>2006</year>). <article-title>The role of prosody in the interpretation of structural ambiguities: A study of anticipatory eye movements</article-title>. <source>Cognition</source>, <volume>99</volume>(<issue>2</issue>), <fpage>B63</fpage>&#8211;<lpage>B72</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.cognition.2005.07.001</pub-id></mixed-citation></ref>
<ref id="B93"><label>93</label><mixed-citation publication-type="webpage"><string-name><surname>Winn</surname>, <given-names>M.</given-names></string-name> (<year>2016</year>). <source>Vowel formant continua from modified natural speech (Praat script)</source>. Retrieved from <uri>http://www.mattwinn.com/praat/Make_Formant_Continuum_v38.txt</uri> (Version 38)</mixed-citation></ref>
<ref id="B94"><label>94</label><mixed-citation publication-type="book"><string-name><surname>Wood</surname>, <given-names>S. N.</given-names></string-name> (<year>2006</year>). <source>Generalized Additive Models: an Introduction with R</source>. <publisher-name>Chapman and Hall/CRC</publisher-name>. DOI: <pub-id pub-id-type="doi">10.1201/9781420010404</pub-id></mixed-citation></ref>
<ref id="B95"><label>95</label><mixed-citation publication-type="journal"><string-name><surname>Zahner</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Kutscheid</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Braun</surname>, <given-names>B.</given-names></string-name> (<year>2019</year>). <article-title>Alignment of f0 peak in different pitch accent types affects perception of metrical stress</article-title>. <source>Journal of Phonetics</source>, <volume>74</volume>, <fpage>75</fpage>&#8211;<lpage>95</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.wocn.2019.02.004</pub-id></mixed-citation></ref>
<ref id="B96"><label>96</label><mixed-citation publication-type="journal"><string-name><surname>Zhang</surname>, <given-names>Z.</given-names></string-name> (<year>2011</year>). <article-title>Restraining mechanisms in regulating glottal closure during phonation</article-title>. <source>The Journal of the Acoustical Society of America</source>, <volume>130</volume>(<issue>6</issue>), <fpage>4010</fpage>&#8211;<lpage>4019</lpage>. DOI: <pub-id pub-id-type="doi">10.1121/1.3658477</pub-id></mixed-citation></ref>
</ref-list>
</back>
</article>