<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<!--<?xml-stylesheet type="text/xsl" href="article.xsl"?>-->
<article article-type="research-article" dtd-version="1.2" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id journal-id-type="issn">1868-6354</journal-id>
<journal-title-group>
<journal-title>Laboratory Phonology: Journal of the Association for Laboratory Phonology</journal-title>
</journal-title-group>
<issn pub-type="epub">1868-6354</issn>
<publisher>
<publisher-name>Open Library of Humanities</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.16995/labphon.24285</article-id>
<article-categories>
<subj-group>
<subject>Journal article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Disentangling acoustic and social biases in creaky voice perception: The effects of f0 and face gender on creakiness ratings</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Brown</surname>
<given-names>Jeanne</given-names>
</name>
<email>jeanne.brown@mail.mcgill.ca</email>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Clayards</surname>
<given-names>Meghan</given-names>
</name>
<xref ref-type="aff" rid="aff-2">2</xref>
</contrib>
</contrib-group>
<aff id="aff-1"><label>1</label>Department of Linguistics, McGill University, Montreal, Canada</aff>
<aff id="aff-2"><label>2</label>Department of Linguistics, School of Communication Sciences and Disorders, McGill University, Montreal, Canada</aff>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2026-03-24">
<day>24</day>
<month>03</month>
<year>2026</year>
</pub-date>
<pub-date pub-type="collection">
<year>2026</year>
</pub-date>
<volume>17</volume>
<issue>1</issue>
<fpage>1</fpage>
<lpage>46</lpage>
<permissions>
<copyright-statement>Copyright: &#x00A9; 2026 The Author(s)</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.journal-labphon.org/articles/10.16995/labphon.24285/"/>
<abstract>
<p>Creaky voice has historically been associated with men&#8217;s speech, supported by acoustic studies. Since around 2010, however, sociolinguistic work and public discourse have reported greater creaky voice use by women, typically based on impressionistic coding. This study investigates whether this recent shift can be attributed to perceptual social and acoustic biases related to (perceived) speaker gender and pitch (f0), respectively. Using a matched-guise paradigm, 40 Canadian English listeners rated the perceived creakiness of the same modal and creaky voices&#8212;altered to have ambiguously gendered formants and median f0s (115, 135, 155 Hz)&#8212;paired with female and male faces. Bayesian regression analyses revealed strong effects of voice quality and moderate effects of f0: Creaky and lower-f0 stimuli were rated as creakier. No overall effect of face gender was found. However, a weak interaction between face gender and f0 suggests a possible gender prototypicality bias: At lower f0s, female faces were rated as slightly creakier than male faces, and at higher f0s, male faces were rated as creakier than female faces. These findings show that neither acoustic nor gender-based biases alone can account for widespread reports of women-led creaky voice use. Several possible explanations for this discrepancy are discussed.</p>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>1. Introduction</title>
<p>Along with the development of more sophisticated phonetic measures and statistical modelling in recent years, the analysis of voice quality has seen rapid growth (<xref ref-type="bibr" rid="B47">Garellek, 2022</xref>). Creaky voice in particular has seen more systematic investigation from both phonetic and sociolinguistic perspectives, given its diverse uses in the world&#8217;s languages for phonological contrast, prosodic and segmental structure, pragmatic and discourse-related functions, indexicality, and the construction of personae (<xref ref-type="bibr" rid="B28">Davidson, 2021</xref>). More recently, sociophoneticians have raised questions about the perceptual dimensions of creaky voice and how they interact with acoustic and articulatory features of voice (e.g., <xref ref-type="bibr" rid="B85">Kreiman et al., 2014</xref>) as well as social evaluations, stereotypes, and expectations of voice (e.g., <xref ref-type="bibr" rid="B100">Ligon et al., 2019</xref>; <xref ref-type="bibr" rid="B150">White et al., 2024</xref>). Despite this growing body of work, a persistent mismatch remains between how creaky voice is produced and how it is perceived&#8212;particularly in terms of gendered associations. While traditional qualitative research (e.g., <xref ref-type="bibr" rid="B63">Henton &amp; Bladon, 1988</xref>) and acoustic work (e.g., <xref ref-type="bibr" rid="B83">Klatt &amp; Klatt, 1990</xref>; <xref ref-type="bibr" rid="B51">Gittelson et al., 2021</xref>) both link creak primarily with male speakers, more recent impressionistic studies and public discourse have emphasized its use by women (e.g., <xref ref-type="bibr" rid="B156">Yuasa, 2010</xref>; <xref ref-type="bibr" rid="B120">Podesva, 2013</xref>; <xref ref-type="bibr" rid="B123">Quenqua, 2012</xref>). This paper addresses this gap by directly examining the effects of perceived speaker gender and variation in speaker f0 on perceptual creakiness ratings. 
Focusing on creaky voice perception, this work aims to clarify how the social climate surrounding linguistic behavior and quantifiable acoustic cues jointly shape the way listeners process and interpret speech.</p>
</sec>
<sec>
<title>2. Background</title>
<sec>
<title>2.1 Creaky voice phonetics</title>
<p>Within phonetics, <italic>voice</italic> or <italic>phonation</italic> broadly refers to the acoustic signal produced by the articulatory system (i.e., the glottis and supralaryngeal tract), grounded in acoustic physics and physiology (<xref ref-type="bibr" rid="B46">Garellek, 2019</xref>; <xref ref-type="bibr" rid="B87">Kreiman &amp; Sidtis, 2011</xref>). In contrast, <italic>voice quality</italic> or <italic>phonatory quality</italic> denotes the holistic percept of this acoustic signal (or of a person&#8217;s voice), grounded in cognition (<xref ref-type="bibr" rid="B74">Johnson &amp; Babel, 2023</xref>; <xref ref-type="bibr" rid="B87">Kreiman &amp; Sidtis, 2011</xref>). <italic>Voice qualities</italic> or <italic>phonation types</italic> represent specific patterns of vibration that can be situated along a uni-dimensional continuum of vocal fold aperture, as in more traditional accounts (<xref ref-type="bibr" rid="B91">Ladefoged, 1971</xref>), or within a multi-dimensional space collectively defined by various acoustic cues, as proposed in more contemporary work on voice (<xref ref-type="bibr" rid="B79">Keating et al., 2023a</xref>; <xref ref-type="bibr" rid="B87">Kreiman &amp; Sidtis, 2011</xref>).</p>
<p>Creaky voice is a non-modal voice quality (or phonation type) typically characterized acoustically by a low fundamental frequency (f0), irregular vocal pulses, and decreased transglottal airflow (e.g., <xref ref-type="bibr" rid="B55">Gordon &amp; Ladefoged, 2001</xref>; <xref ref-type="bibr" rid="B77">Keating et al., 2015</xref>; <xref ref-type="bibr" rid="B153">Wright et al., 2019</xref>). Articulatorily, creaky voice is produced by increasing adductive tension and decreasing longitudinal tension of the vocal folds, allowing the vocal folds to be compressed and thick (<xref ref-type="bibr" rid="B77">Keating et al., 2015</xref>; <xref ref-type="bibr" rid="B153">Wright et al., 2019</xref>). This configuration permits little airflow through the glottis, resulting in some vibration, albeit slow and aperiodic. Perceptually, creaky voice is described as having a low, rough, croaking or crackling sound to the ear, like &#8220;a rapid series of taps&#8221; (<xref ref-type="bibr" rid="B21">Catford, 1964, p. 34</xref>) or popping corn (<xref ref-type="bibr" rid="B63">Henton &amp; Bladon, 1988</xref>). Keating et al. (<xref ref-type="bibr" rid="B77">2015</xref>) describe multiple sub-types of creaky voice, each characterized by different acoustic cues. At present, there is ongoing debate about whether these types are perceptually distinct or form one broad perceptual category (see <xref ref-type="bibr" rid="B26">Davidson, 2019b</xref> vs. <xref ref-type="bibr" rid="B45">Garellek, 2015</xref>; Gerratt &amp; Kreiman, 2001) as well as how individual acoustic cues contribute to creaky voice perception (<xref ref-type="bibr" rid="B81">Khan et al., 2015</xref>). Considering the limited empirical perception data, we employ the broader term &#8220;creaky voice&#8221; to refer to the holistic percept of creakiness, rather than any specific acoustic sub-type.</p>
</sec>
<sec>
<title>2.2 The production-perception mismatch</title>
<p>Phonation types can be contrastive in some of the world&#8217;s languages (see <xref ref-type="bibr" rid="B79">Keating et al., 2023a</xref> for a review), similar to f0 contrasts in tonal languages. Creaky voice can also serve a prosodic or segmental function in a variety of languages (see <xref ref-type="bibr" rid="B28">Davidson, 2021</xref>), marking intonational phrase boundaries (e.g., <xref ref-type="bibr" rid="B23">Crowhurst, 2018</xref>; <xref ref-type="bibr" rid="B30">Dilley et al., 1996</xref>; <xref ref-type="bibr" rid="B124">Redi &amp; Shattuck-Hufnagel, 2001</xref>) or enhancing segmental contrasts (e.g., <xref ref-type="bibr" rid="B47">Garellek, 2022</xref>; <xref ref-type="bibr" rid="B116">Pierrehumbert, 1995</xref>). This paper focuses instead on non-contrastive creaky voice, especially discussing its sociolinguistic uses.</p>
<p>Voice quality is known to have many sociolinguistic functions in a variety of languages, conveying pragmatic information and constructing individual personae (<xref ref-type="bibr" rid="B18">Callier, 2013</xref>; <xref ref-type="bibr" rid="B93">Laver, 1968</xref>; <xref ref-type="bibr" rid="B117">Pillot-Loiseau et al., 2019</xref>; <xref ref-type="bibr" rid="B120">Podesva, 2013</xref>; <xref ref-type="bibr" rid="B127">Sicoli, 2010</xref>; <xref ref-type="bibr" rid="B156">Yuasa, 2010</xref>, among others). Socially-constrained variation in creaky voice use has been extensively studied in English varieties (see <xref ref-type="bibr" rid="B24">Dallaston &amp; Docherty, 2020</xref>; <xref ref-type="bibr" rid="B28">Davidson, 2021</xref>, for reviews), with work on other languages increasing only recently (e.g., <xref ref-type="bibr" rid="B15">Burin, 2022</xref>; <xref ref-type="bibr" rid="B37">Duarte-Borquez et al., 2024</xref>; <xref ref-type="bibr" rid="B74">Johnson &amp; Babel, 2023</xref>; <xref ref-type="bibr" rid="B125">Sebregts et al., 2023</xref>; <xref ref-type="bibr" rid="B143">Uusitalo et al., 2024</xref>). Previous acoustic studies of various English varieties&#8212;including both seminal and contemporary work on creaky voice&#8212;have shown that <italic>men are creakier than women</italic> across a variety of speech contexts ranging from spontaneous speech in naturalistic settings to read wordlist recordings in a laboratory (e.g., <xref ref-type="bibr" rid="B13">Brown &amp; Sonderegger, 2025</xref>; <xref ref-type="bibr" rid="B51">Gittelson et al., 2021</xref>; <xref ref-type="bibr" rid="B57">Hanson &amp; Chuang, 1999</xref>; <xref ref-type="bibr" rid="B67">Irons &amp; Alexander, 2016</xref>; <xref ref-type="bibr" rid="B68">Iseli et al., 2007</xref>; <xref ref-type="bibr" rid="B83">Klatt &amp; Klatt, 1990</xref>; <xref ref-type="bibr" rid="B105">Loakes &amp; Gregory, 2022</xref>; <xref ref-type="bibr" rid="B137">Syrdal, 1996</xref>). 
Older impressionistic analyses, i.e., those relying on audio-visual coding of creaky voice, have reached similar conclusions with respect to gender (again in varied speech contexts), suggesting that creak can be interpreted as a sign of masculinity, authority, and socio-economic status in British varieties especially (<xref ref-type="bibr" rid="B2">Abercrombie, 1967</xref>; <xref ref-type="bibr" rid="B40">Esling, 1978</xref>; <xref ref-type="bibr" rid="B63">Henton &amp; Bladon, 1988</xref>; <xref ref-type="bibr" rid="B93">Laver, 1968</xref>; <xref ref-type="bibr" rid="B136">Stuart-Smith, 1999</xref>).</p>
<p>Current impressionistic work&#8212;which still relies on audio-visual creaky voice identification methods&#8212;also draws on various speech contexts but examines American English varieties almost exclusively. Contrary to the aforementioned acoustic and older impressionistic work on creaky voice, these studies show that <italic>women are creakier than men</italic> (e.g., <xref ref-type="bibr" rid="B1">Abdelli-Beruh et al., 2014</xref>; <xref ref-type="bibr" rid="B110">Melvin &amp; Clopper, 2015</xref>; <xref ref-type="bibr" rid="B120">Podesva, 2013</xref>; <xref ref-type="bibr" rid="B151">Wolk et al., 2012</xref>; <xref ref-type="bibr" rid="B156">Yuasa, 2010</xref>), coinciding with the emergence of that same claim in mainstream American media in the late 2000s (e.g., <xref ref-type="bibr" rid="B42">Fessenden, 2011</xref>; <xref ref-type="bibr" rid="B56">Grim, 2015</xref>; <xref ref-type="bibr" rid="B70">Jaslow, 2011</xref>; <xref ref-type="bibr" rid="B132">Steinmetz, 2011</xref>; <xref ref-type="bibr" rid="B123">Quenqua, 2012</xref>). Creaky voice is often referred to as vocal fry in popular discourse and media, and a sudden surge in use of this term around 2010 is observable in Google Ngrams (<xref ref-type="bibr" rid="B54">Google Books, 2025</xref>). Yuasa (<xref ref-type="bibr" rid="B156">2010</xref>) suggests that the increased use of creaky voice by young, upwardly-mobile American women is triggering a shift in popular perception towards a more educated, urban-oriented, nonaggressive and informal interpretation. Other studies explicitly asked listeners to provide qualitative judgements of modal and creaky voices and then compared listener ratings across voice qualities. 
Most studies exposed listeners to the same speakers producing both modal and creaky voice (in lab speech in <xref ref-type="bibr" rid="B4">Anderson et al., 2014</xref> and <xref ref-type="bibr" rid="B95">Lee, 2016</xref>; in spontaneous speech from online sources in <xref ref-type="bibr" rid="B133">Stewart et al., 2024</xref>), while others only compared modal speakers and creaky speakers (recorded in-lab in <xref ref-type="bibr" rid="B44">Gallena &amp; Pinto, 2021</xref>). Conversely, Ligon et al. (<xref ref-type="bibr" rid="B100">2019</xref>) did not make use of any audio stimuli, opting to train their listeners to identify and differentiate various voice qualities (modal and creaky voice included) and then asking them to associate the voice qualities with affective/emotive traits. Contrasting with Yuasa&#8217;s (<xref ref-type="bibr" rid="B156">2010</xref>) interpretation of creaky voice use, these speaker perception studies converge on the finding that creaky voice elicits overtly negative evaluations: Women exhibiting creaky voice were often perceived as less competent, attractive, and hirable (note that <xref ref-type="bibr" rid="B95">Lee, 2016</xref>, finds comparable negative judgement of both men&#8217;s and women&#8217;s creaky voice). Social perception work on creak has placed less focus on attitudes in other English varieties; the studies that do exist find similar but less consistent negative perceptions (in New Zealand English, <xref ref-type="bibr" rid="B17">Calhoun &amp; White, 2025</xref> and <xref ref-type="bibr" rid="B118">Pittam, 1987</xref>; mixed findings in Irish English, <xref ref-type="bibr" rid="B52">Gobl &amp; N&#237; Chasaide, 2003</xref>; in British English, <xref ref-type="bibr" rid="B104">Liu &amp; Xu, 2011</xref>; and in Canadian English, <xref ref-type="bibr" rid="B53">Goodine &amp; Johns, 2014</xref>). 
When exposed to two young women&#8217;s voices with creak and two others without creak, Canadian English listeners judged women exhibiting creaky voice as more ditzy and lazier, as well as less assertive, responsible and hardworking than modally-voiced women (<xref ref-type="bibr" rid="B53">Goodine &amp; Johns, 2014</xref>).</p>
<p>The methodological differences across the studies described above appear to shape the conclusions drawn from investigations of creaky voice across gender: Acoustic analyses (almost exclusively) find more creak for men, while impressionistic analyses describe more creak for men prior to the 2000s but more creak for women thereafter. While there is a remarkable lack of empirical evidence indicating a change in voice quality over time (see <xref ref-type="bibr" rid="B13">Brown &amp; Sonderegger, 2025</xref>, for a review), there is compelling evidence for change in the social view of creaky voice. The production-perception mismatch displayed in creaky voice research demonstrates that perception does not always reflect acoustics or articulation. Whereas acoustic studies rely on objective measures of the signal, the overarching goal of this work is to model the <italic>perceptual</italic> instantiation of creaky voice. Creaky voice as a perceptual phenomenon has attracted both public and academic interest in recent years, and speakers and listeners can reliably attend to it and develop strong intuitions about it. Acoustic correlates alone therefore function only as indirect measures of creaky voice.</p>
<p>Some work has made efforts to bridge the gap between articulation, acoustics, and perception, notably Kreiman et al.&#8217;s (<xref ref-type="bibr" rid="B85">2014</xref>) psychoacoustic model of voice (and later <xref ref-type="bibr" rid="B86">Kreiman et al., 2021</xref>), which identifies a set of articulatorily-grounded acoustic measures that all contribute individually to the perception of voice quality. While this model may provide a quantitative measure of voice quality that more accurately reflects its perception compared to other acoustic studies of voice, the discord between older impressionistic studies of creaky voice and more contemporary ones remains unresolved. If a change in voice quality production (articulation and acoustics) has not occurred over time, then there is no motivation for any changes to the psychoacoustic model of voice over time. The psychoacoustic model of voice in its current state does not allow for the integration of social factors in voice perception, failing to account for any variation or change (at the individual or group level), and is not designed to assess <italic>creaky</italic> voice perception specifically. One plausible explanation for the divergence between creaky voice production and varying perceptions relies on the assumption that listener expectations can heavily influence perception. Accordingly, Section 2.3 discusses the well-attested effect of perceived speaker characteristics on speech perception, and Section 2.4 explores how perceptual judgements of voice can be shaped by speaker pitch and gender.</p>
</sec>
<sec>
<title>2.3 Sociolinguistic perception: Speech perception as a function of speaker perception</title>
<p>Sociolinguistic perception is a vast topic that integrates perspectives from diverse fields of research, including social psychology, cognitive science, sociolinguistics, and phonetics. It is now well-known that variation in the speech signal can affect the social evaluations of the speaker (as exemplified in the previous section with respect to creaky voice usage). The <italic>matched-guise</italic> technique, first introduced by Wallace Lambert and his colleagues, was highly influential in that it operationalized a method to systematically analyze listeners&#8217; social evaluations in relation to linguistic features of speech. In <xref ref-type="bibr" rid="B92">Lambert et al.&#8217;s 1960</xref> study, the same speaker&#8217;s voice was recorded in French and in English, effectively producing two different guises which were presented to listeners without revealing that they originated from the same person. Listeners were then tasked with assigning subjective ratings of 14 character traits to the guises (<xref ref-type="bibr" rid="B92">Lambert et al., 1960</xref>). Since then, the paradigm has been applied to various linguistic features and personal characteristics, including, but not limited to, gender, race/ethnicity, region of origin, and sexual orientation (see <xref ref-type="bibr" rid="B32">D&#8217;Onofrio, 2016</xref>; <xref ref-type="bibr" rid="B34">Drager, 2010</xref>; <xref ref-type="bibr" rid="B43">Foulkes &amp; Hay, 2015</xref>, for overviews). This is generally thought to occur because listeners hold stereotypes towards social groups and attribute those stereotypes to individuals whom they perceive as belonging to that group on the basis of linguistic features alone (<xref ref-type="bibr" rid="B72">Johnson, 2000</xref>; <xref ref-type="bibr" rid="B103">Lippi-Green, 1997</xref>). This vast and varied body of work collectively shows the wide-reaching effect that linguistic variation has on social evaluations of speaker attributes.</p>
<p>Furthermore, the relationship between phonetic and social information in speech perception is bidirectional. Social information about the speaker, whether actual, expected, or perceived, can bias speech perception (distinctly from social perception). As a variation of the matched-guise design and following long-standing work arguing for the perceptual integration of visual and auditory information (e.g., <xref ref-type="bibr" rid="B19">Campanella &amp; Belin, 2007</xref>; <xref ref-type="bibr" rid="B109">McGurk &amp; MacDonald, 1976</xref>), Strand and Johnson (<xref ref-type="bibr" rid="B135">1996</xref>) spearheaded a method to examine the effect of perceived speaker attributes (social perception of the speaker) on speech perception/processing. In search of evidence for a visually-driven (or socially-driven) speaker normalization effect on sibilant perception, they paired ambiguously-gendered voices with prototypical male and female faces to isolate a <italic>face gender effect</italic>. They found that when Central Ohio English listeners (15 women and 9 men) were tasked with identifying whether they heard /s/ or /&#643;/ (2AFC) from a continuum of synthesized sibilants, they identified more /s/ when audio was presented with the male face than with the female face, indicating a lower frequency threshold for /s/ perception (following general gender patterns of sibilant production). These results provide evidence for the integration of visual gender perception with the acoustics of the speech signal in gradient sibilant categorization. Thus, priming listeners with social information about the speaker, encouraging a specific speaker perception and creating a certain expectation for that speaker&#8217;s voice and language use, can affect their perception of various linguistic variables (generally believed to be linked to those relevant social groupings) in otherwise acoustically identical speech. 
In addition to perceived speaker gender (<xref ref-type="bibr" rid="B3">Alderton, 2020</xref>; <xref ref-type="bibr" rid="B11">Bouavichith et al. 2019</xref>; <xref ref-type="bibr" rid="B71">Jessee &amp; Calder, 2025</xref>; <xref ref-type="bibr" rid="B75">Johnson et al., 1999</xref>; <xref ref-type="bibr" rid="B102">Lindvall-&#214;stling et al. 2020</xref>; <xref ref-type="bibr" rid="B135">Strand &amp; Johnson, 1996</xref>; <xref ref-type="bibr" rid="B134">Strand 1999</xref>; <xref ref-type="bibr" rid="B154">Yu, 2022</xref>), perceived speaker race/ethnicity (<xref ref-type="bibr" rid="B7">Babel &amp; Russell, 2015</xref>; <xref ref-type="bibr" rid="B89">Kutlu et al., 2022</xref>; <xref ref-type="bibr" rid="B131">Staum Casasanto, 2010</xref>), age (<xref ref-type="bibr" rid="B35">Drager, 2011</xref>; <xref ref-type="bibr" rid="B60">Hay et al., 2006b</xref>, <xref ref-type="bibr" rid="B84">Koops et al., 2008</xref>; <xref ref-type="bibr" rid="B146">Walker &amp; Hay, 2011</xref>), dialect (<xref ref-type="bibr" rid="B58">Hay &amp; Drager, 2010</xref>; <xref ref-type="bibr" rid="B59">Hay et al., 2006a</xref>; <xref ref-type="bibr" rid="B114">Niedzielski, 1999</xref>), socio-economic class (<xref ref-type="bibr" rid="B60">Hay et al., 2006b</xref>), sexual orientation (<xref ref-type="bibr" rid="B107">Mack &amp; Munson, 2012</xref>), as well as micro-sociological categories like personae (<xref ref-type="bibr" rid="B31">D&#8217;Onofrio, 2015</xref>; see <xref ref-type="bibr" rid="B33">D&#8217;Onofrio, 2020</xref>, for a review), have been shown to influence speech perception in diverse ways (see <xref ref-type="bibr" rid="B32">D&#8217;Onofrio, 2016</xref>; <xref ref-type="bibr" rid="B34">Drager 2010</xref>; <xref ref-type="bibr" rid="B43">Foulkes &amp; Hay, 2015</xref>; <xref ref-type="bibr" rid="B148">Weatherholtz &amp; Jaeger, 2016</xref>, for overviews).</p>
<p>Expectations and stereotypes about speakers have even been shown to trigger shifts in production in some cases, such as <italic>expectation-driven convergence</italic> (e.g., <xref ref-type="bibr" rid="B144">Vaughn &amp; Kendall, 2019</xref>; <xref ref-type="bibr" rid="B145">Wade et al., 2023</xref>; and see <xref ref-type="bibr" rid="B6">Auer &amp; Hinskens, 2005</xref>, for a review). In Wade et al. (<xref ref-type="bibr" rid="B145">2023</xref>), non-Southern out-group speakers converged to more monophthongal /a&#618;/ (a stereotypically Southern American English pronunciation) when listening to speech from a <italic>labeled</italic> Southern American English speaker, despite the speaker acoustically producing a Midland American English accent. On the other hand, some work does not yield clear social priming effects in the expected directions (e.g., <xref ref-type="bibr" rid="B130">Squires, 2013</xref>; <xref ref-type="bibr" rid="B94">Lawrence 2015</xref>, <xref ref-type="bibr" rid="B76">Juskan, 2016</xref>; <xref ref-type="bibr" rid="B147">Walker et al., 2019</xref>). Juskan (<xref ref-type="bibr" rid="B76">2016</xref>) situates their inconsistent priming effects within the psychological literature and suggests a number of conditions that need to be met in order for priming effects to emerge as expected: 1) The stimuli must be based on a highly salient linguistic variable/feature that listeners are explicitly aware of (i.e., a stereotype); 2) the linguistic variable must vary on a continuum rather than discretely; and 3) the prime and the acoustic stimuli must not be so mismatched that listeners will not accept that the prime and stimuli are combined. While these conditions may explain some of these failures to replicate priming effects on perception, the expectation that priming effects should be invariable or even similar across studies is controversial (see <xref ref-type="bibr" rid="B76">Juskan, 2016</xref>, for a discussion).</p>
<p>Given that this paper&#8217;s main topic, the production-perception mismatch in creaky voice, relates directly to varying gender expectations, only studies examining perceived gender effects on speech perception will be discussed further. Strand and Johnson&#8217;s (<xref ref-type="bibr" rid="B135">1996</xref>) study has inspired other (quasi-)replications in recent years (e.g., <xref ref-type="bibr" rid="B11">Bouavichith et al., 2019</xref>; <xref ref-type="bibr" rid="B71">Jessee &amp; Calder, 2025</xref>; <xref ref-type="bibr" rid="B113">Munson et al., 2017</xref>), which attests to the robustness of the face gender effect in sibilant perception. Munson et al. (<xref ref-type="bibr" rid="B113">2017</xref>) used both explicit and implicit priming, resulting again in lower frequency /s/ perception for male faces and male-suggestive cues. Likewise, Jessee and Calder (<xref ref-type="bibr" rid="B71">2025</xref>) show that when listeners were told that speakers were transgender, they perceived a higher frequency /s/ for the feminine voice and a lower frequency /s/ for the masculine voice than listeners who were not given that gender information. The face gender effect is also not limited to sibilant perception. Johnson et al. (<xref ref-type="bibr" rid="B75">1999</xref>) apply their 1996 paradigm to the perception of the /&#650;/-/&#652;/ contrast and find that the expectation for men&#8217;s formants to be lower frequency than women&#8217;s formants is substantiated by a face gender effect: Participants tend to perceive the vowel category boundary at lower F1 frequencies when stimuli are paired with a male face and at higher F1 frequencies when paired with a female face. Alderton (<xref ref-type="bibr" rid="B3">2020</xref>) examines the perception of /u/-fronting (a.k.a., GOOSE-fronting) in Standard Southern British English, a sound change led by women but not yet carrying social salience. 
They find a significant interaction between listener gender and face gender, with men identifying fronter /u/ vowels when primed with a woman&#8217;s face, despite failing to find a significant effect of face gender alone (<xref ref-type="bibr" rid="B3">Alderton, 2020</xref>). In Yu (<xref ref-type="bibr" rid="B154">2022</xref>), listeners were presented with a gendered face and (ambiguously gendered) audio stimuli along a voicing continuum (crossing VOT and f0) and instructed to identify whether they heard a /b/ or /p/ (2AFC). Listeners who were exposed to a male face showed less reliance on VOT compared to those exposed to a female face or those who were given no visual information, consistent with typically-male acoustic behavior (VOT differences are less distinct for men than for women) (<xref ref-type="bibr" rid="B154">Yu, 2022</xref>). In summary, these studies illustrate the influence of socio-indexical and paralinguistic information on speech perception in that expectations about a speaker&#8217;s gender or other personality traits can prime listeners to interpret the speech signal in ways that conform to the social expectation.</p>
</sec>
<sec>
<title>2.4 Systematic analyses of creaky voice perception</title>
<p>The apparent mismatch between acoustic and impressionistic methods of identifying creaky voice has been noted in previous perceptual work (namely, <xref ref-type="bibr" rid="B25">Davidson, 2019a</xref> and <xref ref-type="bibr" rid="B150">White et al., 2024</xref>). The tendency to impressionistically identify more creakiness in women&#8217;s speech was first formally hypothesized by Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>) to be due to two possible factors: acoustic pitch of the speech and perceived speaker gender. The <italic>Acoustic hypothesis</italic> (referred to as the pitch contrast scenario in <xref ref-type="bibr" rid="B150">White et al., 2024</xref>) stipulates that creaky voice is more perceptible in higher pitched modal voices because of a larger pitch differential between modal and creaky voice, compared to lower pitched modal voices which have a smaller pitch differential (<xref ref-type="bibr" rid="B25">Davidson, 2019a</xref>). There is uncontroversial evidence for an upper bound of approximately 80 Hz on the f0 range of creaky voice (e.g., <xref ref-type="bibr" rid="B9">Blomgren et al., 1998</xref>; <xref ref-type="bibr" rid="B65">Hollien &amp; Michel, 1968</xref>; <xref ref-type="bibr" rid="B97">Leung et al., 2022</xref>). Some studies report similar f0 ranges of creaky voice for both men and women (respectively, 49 and 48 Hz on average in <xref ref-type="bibr" rid="B9">Blomgren et al., 1998</xref>; respectively reaching 46 Hz and 59 Hz minimums in <xref ref-type="bibr" rid="B80">Keating &amp; Kuo, 2012</xref>) while others report somewhat lower f0 ranges for men&#8217;s creak than for women&#8217;s creak (respectively, 55 Hz and 74 Hz on average in <xref ref-type="bibr" rid="B14">Brubaker et al., 2016</xref>; and 70 Hz and 88 Hz on average in <xref ref-type="bibr" rid="B25">Davidson, 2019a</xref>). 
Despite these differences, the hypothesis rests on the assumption that creaky voice is always produced at a low f0: When habitually high-pitched voices (generally women&#8217;s voices) lower dramatically to reach the f0 threshold for creakiness, the drop is perceptually more salient than when habitually low-pitched voices (generally men&#8217;s voices) lower only moderately to a similar threshold. The <italic>Bias hypothesis</italic> (referred to as the gender bias scenario in <xref ref-type="bibr" rid="B150">White et al., 2024</xref>) presents the alternative that listeners are simply biased to assume that creak is more common in women&#8217;s speech than in men&#8217;s, given social stereotypes (<xref ref-type="bibr" rid="B25">Davidson, 2019a</xref>). In practice, these two hypotheses are difficult to tease apart because they often co-occur: Men typically have lower modal pitch whereas women have higher modal pitch. White et al. (<xref ref-type="bibr" rid="B150">2024</xref>) describe a third hypothesis that combines the former two, suggesting that a speaker&#8217;s habitual pitch will have the strongest impact on the perception of creaky voice when voices have distinct pitches, and that only when pitch is no longer informative will speaker gender affect perception. The predictions of these hypotheses are shown in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap id="T1">
<caption>
<p><bold>Table 1:</bold> Predictions of each hypothesis for listener accuracy in the identification of creaky voice (adapted from <xref ref-type="bibr" rid="B25">Davidson, 2019a</xref>, and <xref ref-type="bibr" rid="B150">White et al., 2024</xref>).</p>
</caption>
<table>
<tbody>
<tr>
<td align="left" valign="top"></td>
<td align="left" valign="top"><bold>Creaky voice condition</bold></td>
<td align="left" valign="top"><bold>Modal voice condition</bold></td>
</tr>
<tr>
<td align="left" valign="top">Acoustic hypothesis (pitch contrast scenario)</td>
<td align="left" valign="top">hi-f &gt; mid-f = mid-m &gt; lo-m</td>
<td align="left" valign="top">hi-f &gt; mid-f = mid-m &gt; lo-m</td>
</tr>
<tr>
<td align="left" valign="top">Bias hypothesis (gender bias scenario)</td>
<td align="left" valign="top">hi-f = mid-f &gt; mid-m = lo-m</td>
<td align="left" valign="top">lo-m = mid-m &gt; mid-f = hi-f</td>
</tr>
<tr>
<td align="left" valign="top">Acoustic + Bias hypothesis (pitch contrast + gender bias scenario)</td>
<td align="left" valign="top">hi-f &gt; mid-f &gt; mid-m &gt; lo-m</td>
<td align="left" valign="top">hi-f &gt; mid-m &gt; mid-f &gt; lo-m</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Assuming such an acoustic bias, the most relevant cue to the identification of creaky voice is a larger pitch differential between modal and creaky voice. That is, voices with a larger creaky-to-modal voice pitch differential should facilitate both the identification when creak is present and the non-identification when it is absent. Therefore, in both the creaky voice condition (i.e., identifying creak when it is present) and the modal voice condition (i.e., not identifying creak when it is absent), this hypothesis predicts more accurate creak identification in the highest-pitched voice (hi-f), comparably lower accuracy in the mid-pitched voices (mid-f and mid-m), and then lowest accuracy in the lowest-pitched voice (lo-m). On the other hand, if we assume a gender bias, then the most relevant cue to creaky voice identification is gender, specifically women&#8217;s voices eliciting more creaky voice responses in all contexts. Following this idea, creaky voice is predicted to be more accurately detected in women&#8217;s voices (hi-f and mid-f, regardless of pitch) when it is present, and also more inaccurately detected (over-detected) in women&#8217;s voices when it is absent. In the case of the combined acoustic and gender bias scenario, pressures from the creaky-to-modal voice pitch differential and gender interact: A larger pitch differential and women&#8217;s voices are predicted to lead to more creaky voice identification. White et al. (<xref ref-type="bibr" rid="B150">2024</xref>) posit that pitch will play a larger role than gender in the extreme pitch conditions, with a high-pitched voice (hi-f) facilitating accurate creak identification in the creaky condition and non-identification in the modal condition, whereas responses to a low-pitched voice (lo-m) will exhibit less accuracy in both the identification and non-identification of creaky voice. 
They predict that gender will only play a role in creak decisions in conditions where the voices are similar in pitch. In the creaky condition, they expect the women&#8217;s voice (mid-f) to elicit more accurate identification of creak than the men&#8217;s voice (mid-m), while in the modal condition, they expect the women&#8217;s voice to elicit more false identification of creak than the men&#8217;s.</p>
<p>To test these hypotheses, both Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>) and White et al. (<xref ref-type="bibr" rid="B150">2024</xref>) conducted experiments in an attempt to disentangle the effects of pitch and gender on creaky voice perception. Crucially, stimuli for these experiments fit four speaker profiles: a high-pitched female, a mid-pitched female, a (quasi-)equally mid-pitched male, and a low-pitched male. Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>) made use of natural speech stimuli from high-quality podcast recordings, extracting modal and creaky speech stimuli from eight voices (with average f0s of 226 Hz and 194 Hz for the hi-f voice, 164 Hz and 152 Hz for the mid-f voice, 132 Hz and 145 Hz for the mid-m voice, and 111 Hz and 90 Hz for the lo-m voice). Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>) also included other conditions such as location/extent of creak (none, partial, or whole) and type of utterance (fragment or full sentence), but these will not be discussed extensively here. Conversely, White et al. synthesized their stimuli from male (28 y.o.) and female (22 y.o.) modal voices recorded in a lab, manipulating mean f0 (190 Hz for the hi-f voice, 135 Hz for the mid-f and mid-m voices, and 97 Hz for the lo-m voice) and voice quality (inserting cycle-to-cycle f0 irregularity) while leaving gendered formant ratios untouched. White et al. limited their stimuli to bigrams (adjective-noun pairs), synthesizing creak into the rhyme of the second word in the creaky voice condition. Both studies ran a creak identification task, requiring binary decisions from participants. Despite recruiting listeners who spoke different varieties of English (54 Americans for each of two versions of the experiment in Davidson, <xref ref-type="bibr" rid="B25">2019a</xref>, and 258 Australians in White et al.), the results of both studies shared some similarities.</p>
<p>In Davidson&#8217;s (<xref ref-type="bibr" rid="B25">2019a</xref>) first version of the experiment, participants showed a weak tendency to accurately identify more creak in the female voices (both hi-f and mid-f) than in the male voices. However, in the second version of the experiment (when the mid-pitched male and female voices were more closely matched), the gender difference in both creaky conditions disappeared. In the modal condition, participants were more accurate at detecting the absence of creak in the hi-f voice, and less accurate for the lo-m voice, falsely detecting creak when there was none. Due to inconsistencies between the results of the two versions of the experiment, Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>) concludes that there is no robust evidence for either an acoustic bias or a gender bias. Aside from gender and pitch, across both of Davidson&#8217;s (<xref ref-type="bibr" rid="B25">2019a</xref>) experiments, participants were worse at identifying creak in the partially creaky condition than in the wholly creaky condition, suggesting that creak is less salient in a prosodic position where it is expected (utterance-finally).</p>
<p>White et al. (<xref ref-type="bibr" rid="B150">2024</xref>) find similarly mixed evidence for a gender or acoustic bias. In the creaky voice condition, creak identification is slightly less accurate for the lo-m voice (while the mid-m voice patterns with the female voices), either partially corroborating the female-skewed gender bias effect in Davidson&#8217;s first study (<xref ref-type="bibr" rid="B25">2019a</xref>) or supporting an acoustic bias towards more creak identification in women&#8217;s higher-pitched voices. Moreover, the frequent false alarms of creak for the lo-m voice in the modal condition from Davidson&#8217;s second study (<xref ref-type="bibr" rid="B26">2019b</xref>) are also replicated, indicating an acoustic bias in the direction opposite to that predicted (i.e., men&#8217;s voices inducing more creaky voice percepts). These two findings show that for men&#8217;s low-pitched voices, listeners are less likely to identify creak when it is present (creaky voice condition), while also being more likely to identify creak when it is absent (modal voice condition). Lower overall creak identification accuracy for the lo-m voice suggests that listeners struggle more to distinguish creaky voice from low modal pitch in men&#8217;s voices than in the other voices (hi-f, mid-f, and mid-m). Interestingly, White et al. find additional evidence for the predicted gender effect in the modal voice condition, with more creak falsely identified (false-alarmed creak) in the mid-f voice than in the mid-m voice. In view of the two false-alarm findings, White et al. conclude that the combined pitch contrast and gender bias scenarios have explanatory power, but suggest an alternative underlying mechanism: a <italic>pitch given gender bias</italic>. When listeners hear modal voices that are low given their expectations for the speaker&#8217;s gender (i.e., lo-m and mid-f), they are more likely to identify creak when there is none.</p>
<p>Li et al. (<xref ref-type="bibr" rid="B99">2023</xref>) conduct a comparable study in Mandarin, a language which, importantly, has not been reported to carry any social associations between gender and creaky voice. Stimuli originated from declarative sentences produced with modal and creaky voice by a high-pitched female speaker recorded in a lab. F0 and formant manipulations were performed to create the low-pitched male stimuli. Conditions included pitch range/gender (hi-f vs. lo-m), creak extent (monosyllabic creak vs. multisyllabic creak), and prosodic position (final vs. non-final). Forty native Mandarin listeners were tasked with identifying characters pronounced with creaky voice. Li et al. find that Mandarin listeners consistently identify more creak in the low-pitched male voice: Low pitch facilitates creak identification but also increases false-alarmed creak identification in modal speech, as in Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>) and White et al. (<xref ref-type="bibr" rid="B150">2024</xref>). Furthermore, they find that sentence-final position inhibited the identification of creak, confirming the same effect of prosodic position in Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>).</p>
<p>Altogether, these results do lend some support to the combined pitch contrast and gender bias scenario (summarized in <xref ref-type="table" rid="T2">Table 2</xref>). Confirming at least a small effect of the predicted gender bias, Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>) finds slightly more creak identified in women&#8217;s voices than in men&#8217;s voices, though only in one of two versions of her experiment, and White et al. (<xref ref-type="bibr" rid="B150">2024</xref>) find slightly more creak identified in women&#8217;s modal voices than in men&#8217;s when pitch is matched, as well as when comparing women&#8217;s creaky voices (high- and mid-pitched) to low-pitched men&#8217;s creaky voices. Confirming a stronger acoustic effect related to pitch, but in the unexpected direction, Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>), White et al. (<xref ref-type="bibr" rid="B150">2024</xref>), and Li et al. (<xref ref-type="bibr" rid="B99">2023</xref>) all find increased creak identified in low-pitched men&#8217;s (modal) voices. Considering the broader motivations for these perception studies, these results alone do not fully explain the overwhelming tendency to identify more creak in women&#8217;s voices in impressionistic studies across the sociolinguistic literature. If there is a dominant, acoustically grounded inclination to find more creak in low-pitched men&#8217;s voices, one strong enough to largely eclipse the effects of the social stereotypes that lead women&#8217;s voices to be perceived as creakier, then how do so many listeners (trained phoneticians and speech-language pathologists alike) consistently continue to identify more creak in women&#8217;s voices?</p>
<table-wrap id="T2">
<caption>
<p><bold>Table 2:</bold> Actual results on listener accuracy in the identification of creaky voice from Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>), White et al. (<xref ref-type="bibr" rid="B150">2024</xref>), and Li et al. (<xref ref-type="bibr" rid="B99">2023</xref>) alongside the hypotheses (see <xref ref-type="table" rid="T1">Table 1</xref>) supported.</p>
</caption>
<table>
<tbody>
<tr>
<td align="left" valign="top"></td>
<td align="left" valign="top"><bold>Creaky voice condition</bold></td>
<td align="left" valign="top"><bold>Bias</bold></td>
<td align="left" valign="top"><bold>Modal voice condition</bold></td>
<td align="left" valign="top"><bold>Bias</bold></td>
</tr>
<tr>
<td align="left" valign="top">Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>): exp. ver. 1</td>
<td align="left" valign="top">hi-f = mid-f &gt; mid-m = lo-m</td>
<td align="left" valign="top">Gender</td>
<td align="left" valign="top">hi-f = mid-f = mid-m = lo-m</td>
<td align="left" valign="top">None</td>
</tr>
<tr>
<td align="left" valign="top">Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>): exp. ver. 2</td>
<td align="left" valign="top">hi-f = mid-f = mid-m = lo-m</td>
<td align="left" valign="top">None</td>
<td align="left" valign="top">hi-f &gt; mid-f = mid-m &gt; lo-m</td>
<td align="left" valign="top">Acoustic</td>
</tr>
<tr>
<td align="left" valign="top">White et al. (<xref ref-type="bibr" rid="B150">2024</xref>)</td>
<td align="left" valign="top">hi-f = mid-f = mid-m &gt; lo-m</td>
<td align="left" valign="top">None</td>
<td align="left" valign="top">hi-f &gt; mid-m &gt; mid-f &gt; lo-m</td>
<td align="left" valign="top">Acoustic + Gender</td>
</tr>
<tr>
<td align="left" valign="top">Li et al. (<xref ref-type="bibr" rid="B99">2023</xref>)</td>
<td align="left" valign="top">lo-m &gt; hi-f</td>
<td align="left" valign="top">None</td>
<td align="left" valign="top">hi-f &gt; lo-m</td>
<td align="left" valign="top">Acoustic or Acoustic + Gender</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>2.5 Research questions</title>
<p>Public discourse from roughly the last decade often reports on extreme creaky voice usage by women, a pattern attested in recent sociolinguistic and sociophonetic research (e.g., <xref ref-type="bibr" rid="B120">Podesva, 2013</xref>; <xref ref-type="bibr" rid="B156">Yuasa, 2010</xref>), despite acoustic evidence consistently showing that men tend to produce more creaky voice (<xref ref-type="bibr" rid="B51">Gittelson et al., 2021</xref>; <xref ref-type="bibr" rid="B83">Klatt &amp; Klatt, 1990</xref>; among many others). The goal of this project is to reconcile differences between the production and perception of creaky voice by uncovering the perceptual pathway by which this discrepancy arises. We test two hypotheses (originally from <xref ref-type="bibr" rid="B25">Davidson, 2019a</xref>): an acoustic pitch contrast bias in which creak is more perceptible in higher-pitched voices due to greater contrast with modal pitch, and a gender bias in which listeners expect women to be creakier. The primary research question addressed here investigates whether voice f0 and visual face gender both independently show quantifiable effects on creaky voice perception in Canadian English. Using a matched-guise paradigm, we isolate the social effect from the acoustic effect to determine whether either or both contribute to a perceptual asymmetry.</p>
<p>The precise experimental design implemented to test these questions draws on the methods of Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>), White et al. (<xref ref-type="bibr" rid="B150">2024</xref>), and Strand &amp; Johnson (<xref ref-type="bibr" rid="B135">1996</xref>). To disambiguate effects of an acoustic bias from a gender bias on the perception of creaky voice, f0 and gender need to be treated independently. F0 values are therefore restricted to an f0 range (and formant structure) ambiguous for gender, and clearly gendered faces are then paired with the ambiguously gendered voices in a matched-guise paradigm. If a gender bias exists in creaky voice perception, then a priming effect of face gender should be observable even if voice quality and pitch are held constant. Specifically, if the listeners&#8217; gender bias leads them to expect more creak in women&#8217;s speech, then a woman&#8217;s face should prime increased creakiness percepts, but if their gender bias presumes more creak in men&#8217;s speech, then a man&#8217;s face should prime increased creakiness percepts. Alternatively, if an acoustic bias exists, then f0 values, independent of gender perception, should influence creaky voice perception. Predictions of an acoustic bias can also differ depending on the direction of the f0 effect: If a larger pitch differential between modal and creaky voice is crucial in the perception of creak, then a higher f0 value should result in more creaky percepts; otherwise, if low pitch is the most relevant cue to creaky voice, then a lower f0 value should induce more creaky percepts. A combined acoustic and gender bias (like the pitch given gender bias in <xref ref-type="bibr" rid="B150">White et al., 2024</xref>) is also possible and would be substantiated by a face gender effect that differs as a function of f0.</p>
<p>This line of inquiry opens up new avenues for understanding how voice perception is not merely a function of physical (acoustic) input, but also of socially structured expectation. In the broader landscape of sociophonetic research, this study also contributes to filling gaps in the existing literature. While there has been increasing interest in creaky voice, few studies have directly tested how social perception shapes creaky voice perception, especially in non-American varieties of English. Although prior work has shown that speech perception is influenced by social expectations, especially around gender, this research has largely focused on segmental features like sibilants and vowels. Thus far, comparatively less attention has been given to more complex and multi-dimensional features of speech, notably non-modal voice qualities like creak.</p>
</sec>
</sec>
<sec>
<title>3. Method</title>
<p>First, a norming study (Section 3.1) was conducted to determine what voice settings would be appropriate for our ambiguously gendered stimuli. Clearly gendered faces were then paired randomly with the gender-ambiguous audio stimuli in a matched-guise paradigm (Section 3.2) to assess the effect of face gender priming on creak perception. Listener ratings of creaky and modal voices were elicited along a continuous scale from <italic>not creaky at all</italic> to <italic>extremely creaky</italic>. All stimuli from the norming study and creaky voice perception experiment (training, practice, and trials), original audio recordings, scripts, datasets, and saved models are available on the paper&#8217;s OSF page (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://osf.io/f45yh/">https://osf.io/f45yh/</ext-link>).</p>
<sec>
<title>3.1 Norming perceived gender</title>
<p>To create a gender ambiguous voice, the Praat (<xref ref-type="bibr" rid="B10">Boersma &amp; Weenink, 2025</xref>) &#8220;Change gender&#8221; function was used, which varies both formant ratio and f0. A norming study was conducted to choose stimuli that were rated as ambiguously gendered.</p>
<sec>
<title>3.1.1 Participants</title>
<p>Eighteen English-dominant speakers, born and raised in (and currently living in) Canada, without any hearing difficulties or cochlear implants (9 women, 9 men, 18 to 71 years of age, median age: 38), were recruited online through Prolific (<xref ref-type="bibr" rid="B115">Palan &amp; Schitter, 2018</xref>).</p>
</sec>
<sec>
<title>3.1.2 Stimuli</title>
<p>A 27-year-old female native Canadian English speaker&#8217;s voice was recorded using a MixPre-3 audio recorder and a Shure SM10A headset microphone. Sampling rate was 44.1 kHz and intensity was set to 70 dB for all recordings. Ten phonetically balanced utterances, Harvard Sentences (<xref ref-type="bibr" rid="B141">IEEE, 1969</xref>; list 1), were pronounced in a neutral modal voice and intonation was kept consistent across sentences, with a falling contour. The f0 was manipulated to create three new median f0 values: 115 Hz, 135 Hz, and 155 Hz, roughly matching the ambiguously gendered pitch values used in Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>) and White et al. (<xref ref-type="bibr" rid="B150">2024</xref>). The formant shift ratio was manipulated to create five values ranging from most prototypically male-like (longer vocal tract) formants to most prototypically female-like (shorter vocal tract) formants: 0.8, 0.85, 0.9, 0.95, 1. Both variables were fully crossed for all 10 modal utterances, creating 150 trial stimuli. All experimental trials were randomized, and each participant was shown only 75 of the 150 total stimuli in order to keep the task short. Creaky voice utterances were not used in the norming study because the goal was to elicit gender judgements based solely on f0 and formant values, and modal voice is best for maintaining a clear f0 throughout the utterances.</p>
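The fully crossed norming design described above (3 median f0 values by 5 formant shift ratios by 10 utterances, yielding 150 stimuli, of which each participant heard 75) can be sketched as follows. This is an illustrative sketch only; the utterance IDs and the random-half sampling scheme are our assumptions, not the authors' actual scripts.

```python
from itertools import product
import random

# Illustrative sketch of the norming-stimulus crossing:
# 3 median f0 values x 5 formant shift ratios x 10 Harvard Sentences = 150.
f0_values = [115, 135, 155]                   # Hz
formant_ratios = [0.8, 0.85, 0.9, 0.95, 1.0]  # formant shift ratio
utterances = [f"harvard_list1_{i:02d}" for i in range(1, 11)]  # hypothetical IDs

stimuli = [
    {"utterance": u, "f0": f0, "ratio": r}
    for u, f0, r in product(utterances, f0_values, formant_ratios)
]
assert len(stimuli) == 150

# Each participant is shown only half of the full set to keep the task short.
rng = random.Random(0)  # seeded so the sketch is reproducible
shown_to_participant = rng.sample(stimuli, k=75)
assert len(shown_to_participant) == 75
```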
</sec>
<sec>
<title>3.1.3 Experimental design</title>
<p>The experiment was conducted online using Gorilla Experiment Builder (<xref ref-type="bibr" rid="B5">Anwyl-Irvine et al., 2020</xref>). The task was estimated to take roughly 15 minutes and participants were paid at a rate of &#163;9.00/hour, amounting to &#163;2.25 per participant. Participants began with a headphone screener (<xref ref-type="bibr" rid="B152">Woods et al., 2017</xref>) and were given a maximum of two attempts to pass it.</p>
<p>Instructions for the gender rating task were presented on the first screen, followed by three practice trials. The practice trials included audio of one utterance at the mid-point and both ends of the manipulated continua (lowest f0 and most male-like formants, highest f0 and most female-like formants, and mid f0 and ambiguous formants). In both the practice and experimental trials, participants were presented with a fixation cross alongside audio from the continua of manipulated utterances. As soon as the audio finished playing, the next screen prompted participants to rate the voice along a 5-point scale for gender prototypicality, with 5 indicating a very feminine voice and 1 indicating a very masculine voice. Participants then clicked to continue to the next trial.</p>
</sec>
<sec>
<title>3.1.4 Results</title>
<p>The results in <xref ref-type="fig" rid="F1">Figure 1</xref> (right) show a clear trend towards more masculine-sounding ratings as median f0 and formant shift ratio decrease, and more feminine-sounding ratings as median f0 and formant shift ratio increase. It was determined (through visual inspection of <xref ref-type="fig" rid="F1">Figure 1</xref>, left) that a formant shift ratio of 0.9 provided the most ambiguously gendered responses, and all three f0 values were retained to examine the effect of pitch (within a generally ambiguous range) on creakiness ratings.</p>
<fig id="F1">
<caption>
<p><bold>Figure 1:</bold> Gender prototypicality rating screen that was presented to participants (left) and aggregated gender prototypicality ratings (1 = most male-sounding, 5 = most female-sounding) plotted by formant shift ratio and median f0 (right).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24285-g1.png"/>
</fig>
<p>While there was minor variation by utterance, no particular utterances elicited qualitatively different responses than the others, observable in Figure A1 (Appendix A) and from a Bayesian ordinal cumulative regression model (utterance: <inline-formula>
<alternatives>
<mml:math id="Eq001-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M1">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = &#8211;0.36, <inline-formula>
<alternatives>
<mml:math id="Eq002-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03C3;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M2">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\sigma
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e2.gif"/>
</alternatives>
</inline-formula> = 0.21, CI = [&#8211;0.79, 0.07]; Table A2 in Appendix A). Individual variation in gender prototypicality ratings can be observed in Figure A3 (Appendix A). While some participants skewed towards more feminine-sounding ratings across most of the manipulated continuum (see participants 44382292 and 44382167), no participants skewed towards more masculine-sounding ratings. This can be explained by the source of the original recordings being a female speaker, which may have led certain listeners to perceive a more feminine voice regardless of the manipulated f0 and formants.</p>
</sec>
</sec>
<sec>
<title>3.2 Creaky voice perception experiment</title>
<p>The main perception experiment addressed the study&#8217;s primary research question targeting how creaky voice perception might be influenced by an acoustic bias or a gender bias, i.e., whether manipulated f0 and presented face gender independently affect listeners&#8217; creakiness ratings.</p>
<sec>
<title>3.2.1 Participants</title>
<p>Recruitment for the creaky voice perception experiment followed the same procedure and criteria as the norming study. We had a target sample size of 40, and we excluded and replaced any participants who failed to complete the study (<italic>n</italic> = 6). We had also planned to exclude participants showing evidence of poor understanding of or attention to the task, as indicated by a pattern of random responses; however, no participants met this exclusion criterion. Forty new English-dominant speakers (14 women and 26 men, aged 18 to 71 years, median age: 32) were included. None had participated in the norming study.</p>
</sec>
<sec>
<title>3.2.2 Stimuli</title>
<p>The same 27-year-old female native Canadian English speaker&#8217;s voice was recorded using a MixPre-3 audio recorder and a Shure SM10A headset microphone. Sixty new phonetically balanced sentences (i.e., not those used in the norming study), lists 3&#8211;8 of the Harvard Sentences (<xref ref-type="bibr" rid="B141">IEEE, 1969</xref>), were pronounced by the speaker. We chose to elicit somewhat natural productions of modal and creaky voice (like <xref ref-type="bibr" rid="B25">Davidson, 2019a</xref>) to avoid complications in simulating creaky voice acoustically. Thirty sentences were produced in a neutral modal voice, and 30 sentences were produced with roughly the first half of the sentence in modal voice and the second half in creaky voice. F0 tracks are often unreliable in creaky voice because of its characteristic f0 irregularity; as a result, f0 manipulations are often ineffective on creaky voice. As such, partially creaky sentences were preferred over fully creaky sentences so that the f0 manipulations would be perceptible across all stimuli, being most obvious in the modal voice portions. Because the creaky utterances were produced naturally, the exact timing of creaky voice was variable. To characterize the original creaky utterances more precisely, we examined the range of variation in the proportion of creaky voice. Durations of creak were estimated using an audio-visual coding method to identify the onset of creak within the original creaky utterances. Approximate proportions of creak were then calculated (by dividing the duration of creak by the entire utterance duration) and are plotted in Figure A4 (Appendix A). The range of proportions of creak was 0.40 to 0.77, with a mean of 0.58 across all 30 creaky utterances. The original modal utterances did not contain any creak. 
Speech rate varied to some extent, ranging from 2.16 to 3.94 vowels per second with a mean of 3.15 vowels per second (see Figure A5 in Appendix A for speech rate plots by voice quality and median f0). Intonation was again kept consistent across sentences by the speaker, with a falling contour. The mean f0 across vowels within all the utterances in the original recordings was 187 Hz. Instead of using multiple speakers with varying mean modal pitches (as did <xref ref-type="bibr" rid="B25">Davidson, 2019a</xref>), we followed White et al. (<xref ref-type="bibr" rid="B150">2024</xref>) in manipulating f0 values to achieve more control. As described above, we used three levels of f0: 115 Hz, 135 Hz, and 155 Hz. The 30 unique modal and 30 unique creaky utterances were each split into three groups of 10, and each group was f0-shifted to one of the three values, creating 60 audio stimuli. Following the results of the norming study, the formant shift ratio was set to 0.9 for all stimuli.</p>
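The two descriptive measures reported above (proportion of creak and speech rate) are simple ratios of durations; a minimal sketch follows, using hypothetical example durations rather than the actual measurements from the stimuli.

```python
# Minimal sketch of the descriptive measures reported above; the example
# durations below are hypothetical, not actual measurements from the stimuli.

def creak_proportion(creak_duration_s: float, utterance_duration_s: float) -> float:
    """Proportion of the utterance produced with creaky voice."""
    return creak_duration_s / utterance_duration_s

def speech_rate(n_vowels: int, utterance_duration_s: float) -> float:
    """Speech rate in vowels per second."""
    return n_vowels / utterance_duration_s

# A hypothetical 2.5 s utterance containing 1.4 s of creak and 8 vowels:
assert round(creak_proportion(1.4, 2.5), 2) == 0.56
assert round(speech_rate(8, 2.5), 2) == 3.2
```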
<p>Impressionistically, the &#8220;change gender&#8221; function introduced some distortions that may lead to a percept of creakiness in some of the modal stimuli. To check that the creaky stimuli were still objectively creakier and, importantly, that the different levels of f0 were equally creaky, we conducted an acoustic analysis of all formant- and f0-shifted stimuli. The 60 audio stimuli were analyzed using PraatSauce (<xref ref-type="bibr" rid="B82">Kirby, 2018</xref>), a Praat script for spectral measures, and another Praat script for f0-related measures (<xref ref-type="bibr" rid="B13">Brown &amp; Sonderegger, 2025</xref>). A total of 533 vowels were analyzed, with measures taken at three equidistant points (25, 50, and 75% of the vowel duration), from which vowel means were calculated. The acoustic correlates of creak examined in this analysis include the proportion of unreliable f0 tracks (i.e., the proportion of vowels for which Praat could not track the f0 consistently), in addition to more common measures within the creaky voice literature, specifically H1*&#8211;H2*, CPP, and HNR &lt; 500 Hz (see <xref ref-type="bibr" rid="B13">Brown &amp; Sonderegger, 2025</xref>, for a more in-depth discussion of these measures and how they relate to creaky voice). H1*&#8211;H2* values that depended on unreliable (0 or NA) f0 values were removed (<italic>n</italic> points = 495, 30.96% of points over all H1*&#8211;H2* tracks; <italic>n</italic> vowels = 128), and extreme HNR &lt; 500 Hz values were excluded (<italic>n</italic> vowels = 14). <xref ref-type="fig" rid="F2">Figure 2</xref> illustrates empirical differences between vowels in modal utterances and those in creaky utterances across median f0 values. Observational trends show higher proportions of unreliable f0 tracks and lower H1*&#8211;H2*, CPP, and HNR &lt; 500 Hz values for creaky utterances compared to modal utterances, providing acoustic evidence for increased creakiness in the creaky stimuli. This voice quality difference in acoustic measures is mostly consistent across f0 values (with the exception of H1*&#8211;H2*), suggesting that the f0 manipulation did not directly affect creakiness in the stimuli.</p>
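The point-to-vowel aggregation and the unreliable-f0 flag described above can be sketched as follows (a minimal illustration with a hypothetical data layout, not PraatSauce's actual output format):

```python
# Minimal sketch of the vowel-level aggregation (hypothetical data layout,
# not PraatSauce's actual output): each vowel has measures at 25%, 50%,
# and 75% of its duration; an f0 value of 0 or None marks a point where
# Praat could not track the f0.

def summarize_vowel(f0_points, h1h2_points):
    """Return vowel means over reliable points and an unreliable-f0 flag."""
    reliable = [(f, h) for f, h in zip(f0_points, h1h2_points)
                if f not in (0, None)]
    if not reliable:  # f0 untrackable throughout the vowel
        return {"f0_unreliable": True, "mean_f0": None, "mean_h1h2": None}
    f0s, h1h2s = zip(*reliable)
    return {"f0_unreliable": False,
            "mean_f0": sum(f0s) / len(f0s),
            "mean_h1h2": sum(h1h2s) / len(h1h2s)}
```

The proportion of vowels flagged as unreliable then corresponds to the proportion-of-unreliable-f0-tracks measure plotted in Figure 2.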
<fig id="F2">
<caption>
<p><bold>Figure 2:</bold> Selected acoustic correlates of creak (mean proportion of reliable/unreliable f0 tracks, mean H1*&#8211;H2* in dB, mean CPP in dB, and mean HNR &lt; 500 Hz in dB) plotted by voice quality and median f0.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24285-g2.png"/>
</fig>
<p>The 60 audio stimuli were paired with 60 faces from the London Set of Faces from the Faces Research Lab (<xref ref-type="bibr" rid="B29">Debruine &amp; Jones, 2017</xref>); 30 were presented with a unique male face and 30 with a unique female face. The faces selected from this database came from people under 30 years old, as these faces would more closely match the age of the original speaker. The sampled faces also matched the general distribution of ethnicities across the larger database, comprising 20 white, 4 black, 3 East Asian, and 3 West Asian faces (per gender). All participants were exposed to all f0 levels, but in order to randomize the unique faces within gender and across utterances, participants were alternately assigned to one of two groups. Odd-numbered participants (in terms of order of recruitment) were shown utterances ending in 1, 2, 3, 4, or 5 with female faces and utterances ending in 6, 7, 8, 9, or 0 with male faces, while even-numbered participants were shown the same utterances with flipped face gender assignments. As such, face gender variation within utterances is between-subjects. <xref ref-type="table" rid="T3">Table 3</xref> shows the distribution of stimuli across conditions. All stimuli (audio utterances paired with randomized faces within gender) were then randomized in the experimental trials.</p>
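The odd/even counterbalancing described above can be sketched as follows (illustrative only, with our own function name; utterance numbering as in Table 3):

```python
# Sketch of the odd/even face-gender counterbalancing (illustrative,
# not the actual experiment code). Odd-numbered participants see
# utterances ending in 1-5 with female faces and 6-0 with male faces;
# even-numbered participants see the reverse assignment.

def face_gender(participant_number, utterance_number):
    ends_low = (utterance_number % 10) in {1, 2, 3, 4, 5}
    if participant_number % 2 == 1:  # odd-numbered participant
        return "F" if ends_low else "M"
    return "M" if ends_low else "F"  # even-numbered participant
```

For example, utterance 36 is shown with a male face to odd-numbered participants and with a female face to even-numbered participants, as in Table 3.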
<table-wrap id="T3">
<caption>
<p><bold>Table 3:</bold> Structure of stimuli across all conditions: median f0 (115 Hz, 135 Hz, 155 Hz) and formant shift ratio (FSR, 0.9), voice quality (modal, creaky), face gender (F, M), and branching of participants. Utterance numbering has been modified from that in the larger list of Harvard Sentences to better convey that utterances are unique (correspondences to the original list and utterance numbering from the Harvard Sentences are provided in Table A6 of Appendix A).</p>
</caption>
<table>
<tbody>
<tr>
<td align="left" valign="top" rowspan="2"><bold>Median f0 / FSR</bold></td>
<td align="left" valign="top" rowspan="2"><bold>Face gender</bold></td>
<td align="left" valign="top" colspan="2"><bold>Odd participants (n = 20)</bold></td>
<td align="left" valign="top" colspan="2"><bold>Even participants (n = 20)</bold></td>
</tr>
<tr>
<td align="left" valign="top"><bold>Modal</bold></td>
<td align="left" valign="top"><bold>Creaky</bold></td>
<td align="left" valign="top"><bold>Modal</bold></td>
<td align="left" valign="top"><bold>Creaky</bold></td>
</tr>
<tr>
<td align="left" valign="top" rowspan="2">115 Hz / 0.9 FSR</td>
<td align="left" valign="top">F</td>
<td align="left" valign="top">Utt. 1&#8211;5</td>
<td align="left" valign="top">Utt. 31&#8211;35</td>
<td align="left" valign="top">Utt. 6&#8211;10</td>
<td align="left" valign="top">Utt. 36&#8211;40</td>
</tr>
<tr>
<td align="left" valign="top">M</td>
<td align="left" valign="top">Utt. 6&#8211;10</td>
<td align="left" valign="top">Utt. 36&#8211;40</td>
<td align="left" valign="top">Utt. 1&#8211;5</td>
<td align="left" valign="top">Utt. 31&#8211;35</td>
</tr>
<tr>
<td align="left" valign="top" rowspan="2">135 Hz / 0.9 FSR</td>
<td align="left" valign="top">F</td>
<td align="left" valign="top">Utt. 11&#8211;15</td>
<td align="left" valign="top">Utt. 41&#8211;45</td>
<td align="left" valign="top">Utt. 16&#8211;20</td>
<td align="left" valign="top">Utt. 46&#8211;50</td>
</tr>
<tr>
<td align="left" valign="top">M</td>
<td align="left" valign="top">Utt. 16&#8211;20</td>
<td align="left" valign="top">Utt. 46&#8211;50</td>
<td align="left" valign="top">Utt. 11&#8211;15</td>
<td align="left" valign="top">Utt. 41&#8211;45</td>
</tr>
<tr>
<td align="left" valign="top" rowspan="2">155 Hz / 0.9 FSR</td>
<td align="left" valign="top">F</td>
<td align="left" valign="top">Utt. 21&#8211;25</td>
<td align="left" valign="top">Utt. 51&#8211;55</td>
<td align="left" valign="top">Utt. 26&#8211;30</td>
<td align="left" valign="top">Utt. 56&#8211;60</td>
</tr>
<tr>
<td align="left" valign="top">M</td>
<td align="left" valign="top">Utt. 26&#8211;30</td>
<td align="left" valign="top">Utt. 56&#8211;60</td>
<td align="left" valign="top">Utt. 21&#8211;25</td>
<td align="left" valign="top">Utt. 51&#8211;55</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>3.2.3 Experimental design</title>
<p>The experiment followed the same protocol as the norming study, including the headphone screener, but instead of the gender prototypicality task, participants performed the creakiness rating task. Instructions were presented on the first screen, followed by a training session, four practice trials, and 60 experimental trials.</p>
<p>The training phase aimed to orient participants to the acoustic and social cues to creaky voice, with the goal of reducing variability in participants&#8217; ratings that might arise from unfamiliarity. The training first described creaky voice to the participants in an accessible and simplified way, noting its rough, croaking/crackly sound to the ear, acknowledging its common name in public discourse, <italic>vocal fry</italic>, and contrasting it with modal voice. We avoided descriptions that made any mention of low pitch so as not to prime participants into associating creaky voice with low pitch a priori. Participants were presented with four examples of creaky voice. Next, participants were shown two faces that matched broad popular perceptions of speakers associated with creaky voice use: a young female American celebrity (Kim Kardashian) and an older British actor (George Sanders, the voice of Shere Khan in <italic>The Jungle Book</italic>). Text explained that these speakers are known for their very creaky voices. For each speaker, an image of their face, an audio excerpt of a fully creaky utterance, and the corresponding transcription were included. On the following screen, participants were exposed to two more local speakers (both public figures in Canada but less well-known than the previous American and British celebrities), a young man (Lenni-Kim Lalande) and a young woman (Francesca Farago), roughly the same age as the speaker who recorded the stimuli. On-screen text explained that speakers can manipulate their voice quality and make use of both modal and creaky voice within an utterance. An image of their faces, an audio excerpt of a partially creaky utterance, and its transcription (with the creak in bold font) were presented. Unlike other studies that only expose participants to training stimuli similar to the experimental materials, our training included (American and British) stereotypical examples of creaky voice. This approach provides participants with familiar anchors, allowing them to recognize and contextualize creak as a socially and acoustically meaningful feature.</p>
<p>The practice trials included stimuli from both ends of the manipulated continua (lowest f0 and ambiguous formants with a male face, highest f0 and ambiguous formants with a female face) in modal and creaky voice. In both the practice and experimental trials, participants were presented with the image of a person&#8217;s face (either male or female) alongside audio from the continua of manipulated utterances (see <xref ref-type="fig" rid="F3">Figure 3</xref>, left). As soon as the audio finished playing, the next screen prompted participants to rate the voice along a visual analog scale (VAS) for creakiness (see <xref ref-type="fig" rid="F3">Figure 3</xref>, right), with 100 indicating an extremely creaky voice and 0 indicating modal voice (no creak at all). We opted for a gradient measure of creaky voice perception (an approach motivated in other acoustic perception work; <xref ref-type="bibr" rid="B112">Munson et al., 2010</xref>; <xref ref-type="bibr" rid="B142">Urberg-Carlson et al., 2008</xref>), requiring listeners to rate the level of creakiness rather than provide binary judgements (as in <xref ref-type="bibr" rid="B25">Davidson, 2019a</xref> and <xref ref-type="bibr" rid="B150">White et al., 2024</xref>). Participants then clicked to proceed to the next trial. Reaction time was measured from the moment the screen with the creakiness rating scale appeared to the moment the &#8220;Next&#8221; button was clicked. Participants were allowed to change their selection freely prior to clicking &#8220;Next.&#8221;</p>
<fig id="F3">
<caption>
<p><bold>Figure 3:</bold> Screen with face and audio stimuli (left) and creakiness rating screen (right) presented to participants.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24285-g3.png"/>
</fig>
<p>Following the experimental trials, a short debrief questionnaire was administered. Participants were asked three open-ended questions: whether they found the task difficult, what cues (both auditory and visual) they thought influenced their ratings, and if there was any additional information they would like to share.</p>
</sec>
<sec>
<title>3.2.4 Statistical analysis</title>
<p>Using the brms package (<xref ref-type="bibr" rid="B16">B&#252;rkner, 2017</xref>) in R, a Bayesian zero-one-inflated beta regression model was fitted to the response data (creakiness ratings). A beta regression was chosen because the dependent variable is bounded (by the VAS) (<xref ref-type="bibr" rid="B129">Sonderegger &amp; S&#243;skuthy, 2025</xref>), and the zero-one-inflated variant was implemented because the endpoints (0 and 100 on the original VAS, or 0 and 1 in proportions) are included in the possible responses (<xref ref-type="bibr" rid="B62">Heiss, 2021</xref>; see also <xref ref-type="bibr" rid="B157">Zellou et al., 2024</xref>). The exact model formula can be found in Table A7 of Appendix A. Model structure was chosen based on the theoretical importance of the predictors to the research questions. Fixed effects included face gender, voice quality, and median f0, henceforth f0. The effect of face gender on creakiness ratings is of primary interest for the research questions in this paper, as it speaks directly to how (perceived) gender might affect the perception of creaky voice. The individual effect of voice quality on creakiness ratings is included to confirm that participants are capable of distinguishing between modal and creaky voice. The effect of f0 is included to test two hypotheses: i) that a higher modal-to-creaky f0 difference (higher f0 values) leads to increased creakiness percepts (acoustic bias); and ii) that low pitch is an important cue to creaky voice, leading to higher creakiness ratings at lower f0 values. All two-way and three-way interactions between these predictors were also included, both to assess whether a face gender effect on creakiness ratings differs by voice quality or by f0 value and to confirm that voice quality contrasts (measured perceptually by creakiness ratings) are maintained across different ambiguous pitch ranges. 
Both two-level predictors, face gender and voice quality, were standardized (centered and divided by 2 standard deviations), and the only multi-level predictor, f0, was Helmert contrast coded (centered and orthogonal) so that each contrast corresponded to the difference between that level and the mean of the previous levels (<xref ref-type="bibr" rid="B128">Sonderegger, 2023</xref>). A maximal varying-effect<xref ref-type="fn" rid="n1">1</xref> structure was implemented in the model. By-participant, by-utterance, and by-face varying intercepts were included, allowing participants, utterances, and faces to vary in their baseline creakiness ratings. Correlated varying slopes were also included: i) by-participant slopes for face gender, voice quality, f0, and their two-way and three-way interactions; ii) by-utterance slopes for face gender; and iii) by-face slopes for voice quality, f0, and their interaction. These allow the effects of the predictors on creakiness ratings to vary in direction and size by participant, utterance, and face. In addition, to account for participant variability in the use of the 0&#8211;100 slider scale, we included by-participant varying intercepts on the &#981;, &#945; and &#947; parameters. This allowed participants to differ (within the model) in the precision of responses (i.e., variance/dispersion; &#981;) and in endpoint usage (&#945; and &#947;). The model was fitted with flat priors on all parameters except the correlation terms, which received a weakly informative <italic>LKJ</italic>(1.5) prior.</p>
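The zero-one-inflated beta likelihood underlying the model can be sketched as follows (illustrative only, using brms's parameterization with mean, precision, and the zero-one-inflation parameters named above; the fitted model places regression structure and varying intercepts on these parameters via link functions):

```python
import math

# Sketch of the zero-one-inflated beta log-density (illustrative, not the
# authors' code). Parameters follow brms's parameterization: mean mu and
# precision phi for the beta part; alpha (zoi) is the probability of an
# endpoint response; gamma (coi) is the conditional probability that an
# endpoint response is a 1.

def zoib_logpdf(y, mu, phi, alpha, gamma):
    """Log-density of a rating y in [0, 1] (VAS rescaled from 0-100)."""
    if y == 0.0:
        return math.log(alpha) + math.log(1.0 - gamma)  # point mass at 0
    if y == 1.0:
        return math.log(alpha) + math.log(gamma)        # point mass at 1
    a, b = mu * phi, (1.0 - mu) * phi                   # beta shape parameters
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (math.log(1.0 - alpha) - log_beta
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))
```

The mixture structure makes explicit why endpoint usage (&#945; and &#947;) can vary by participant independently of the precision (&#981;) of non-endpoint responses.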
<p>A shifted log-normal regression model was also fitted to the reaction time data, using the same model structure as for the creakiness rating data (see Table A12 in Appendix A). A log-normal regression was chosen because the dependent variable is bounded below and can only be positive (<xref ref-type="bibr" rid="B129">Sonderegger &amp; S&#243;skuthy, 2025</xref>), and the shifted variant was chosen because it has been shown to be particularly well-suited to reaction time data, whose lower bound is shifted from 0 up to at least 200 ms (<xref ref-type="bibr" rid="B101">Lindel&#248;v, 2019</xref>). Because the reaction time data are not directly related to the main research questions, they will only be discussed briefly in the results that follow.</p>
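The shifted log-normal likelihood can be sketched as follows (generic parameter names, not brms's internal ones): the reaction time minus a shift parameter is log-normally distributed, which moves the lower bound of the distribution from 0 up to the shift.

```python
import math

# Sketch of the shifted log-normal log-density for reaction times
# (illustrative; parameter names are generic, not brms's internals).
# rt - shift is log-normally distributed, so RTs at or below the shift
# (e.g., faster than any plausible response) have zero density.

def shifted_lognormal_logpdf(rt, mu, sigma, shift):
    if rt <= shift:
        return float("-inf")  # impossible under the shifted model
    z = math.log(rt - shift)
    return (-z  # Jacobian of the log transform
            - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
            - 0.5 * ((z - mu) / sigma) ** 2)
```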
</sec>
</sec>
</sec>
<sec>
<title>4. Results</title>
<sec>
<title>4.1 Creakiness ratings</title>
<p>A total of 2,400 responses (40 participants &#215; 60 trials) were collected. The distribution of creakiness ratings is plotted below in <xref ref-type="fig" rid="F4">Figure 4</xref>. The distribution appears to be bimodal: many tokens are rated as very creaky (50&#8211;100), fewer are rated less creaky (0&#8211;50), and a limited number of tokens are rated as modal (~0&#8211;10). When split by voice quality condition, few creaky utterances are rated toward the modal end of the scale, whereas comparably more modal utterances are rated as somewhat creaky.</p>
<fig id="F4">
<caption>
<p><bold>Figure 4:</bold> Distribution of creakiness ratings, colored by voice quality condition.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24285-g4.png"/>
</fig>
<p>Group-level results are plotted in <xref ref-type="fig" rid="F5">Figure 5</xref>, and the zero-one-inflated Bayesian regression model table for fixed effects is shown in <xref ref-type="table" rid="T4">Table 4</xref> (the full model summary is available as Table A7 of Appendix A). As expected, the main effect of voice quality on creakiness ratings is credible and clearly observed at all f0 levels and face gender levels (compare x-axes in <xref ref-type="fig" rid="F5">Figure 5</xref>). In the creaky voice condition (when the second half of the utterance contains creak), the voice is rated as creakier than in the modal voice condition (when the entire utterance is modal) (creak&#8211;modal: <inline-formula>
<alternatives>
<mml:math id="Eq003-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M3">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = 1.04, <inline-formula>
<alternatives>
<mml:math id="Eq004-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03C3;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M4">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\sigma
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e2.gif"/>
</alternatives>
</inline-formula> = 0.12, CI = [0.80, 1.30]; <xref ref-type="table" rid="T4">Table 4</xref>). Corroborating the distribution by voice quality in <xref ref-type="fig" rid="F5">Figure 5</xref>, creakiness ratings for modal stimuli center around approximately 0.5, meaning that participants judged most modal stimuli as still containing some level of creaky voice. On the other hand, few of the creaky stimuli were rated as modal (visible from the empirical creakiness ratings in <xref ref-type="fig" rid="F5">Figure 5</xref>). While more subtle, the main effect of manipulated median f0 is also credible. As f0 values increase, creakiness ratings decrease (135Hz&#8211;115Hz: <inline-formula>
<alternatives>
<mml:math id="Eq005-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M5">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = &#8211;0.12, <inline-formula>
<alternatives>
<mml:math id="Eq006-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03C3;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M6">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\sigma
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e2.gif"/>
</alternatives>
</inline-formula> = 0.05, CI = [&#8211;0.22, &#8211;0.02]; 155Hz&#8211;(135Hz+115Hz): <inline-formula>
<alternatives>
<mml:math id="Eq007-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M7">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = &#8211;0.10, <inline-formula>
<alternatives>
<mml:math id="Eq008-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03C3;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M8">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\sigma
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e2.gif"/>
</alternatives>
</inline-formula> = 0.03, CI = [&#8211;0.16, &#8211;0.04]; <xref ref-type="table" rid="T4">Table 4</xref>). This is consistent with more creakiness perceived at lower f0 values, which is borne out in <xref ref-type="fig" rid="F5">Figure 5</xref> (compare facets). The effect of face gender alone is not credible (F&#8211;M: <inline-formula>
<alternatives>
<mml:math id="Eq009-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M9">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = 0.03, <inline-formula>
<alternatives>
<mml:math id="Eq010-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03C3;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M10">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\sigma
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e2.gif"/>
</alternatives>
</inline-formula> = 0.03, CI = [&#8211;0.04, 0.10]; <xref ref-type="table" rid="T4">Table 4</xref>), and is only credible in interaction with f0. When f0 values increase to 155 Hz, the face gender effect reverses relative to the two lower f0 values, 115 Hz and 135 Hz (facegender*(155Hz&#8211;(135Hz+115Hz)): <inline-formula>
<alternatives>
<mml:math id="Eq011-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M11">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = &#8211;0.05, <inline-formula>
<alternatives>
<mml:math id="Eq012-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03C3;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M12">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\sigma
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e2.gif"/>
</alternatives>
</inline-formula> = 0.02, CI = [&#8211;0.10, &#8211;0.01]; <xref ref-type="table" rid="T4">Table 4</xref>). However, when comparing estimated marginal means using the emmeans package (<xref ref-type="bibr" rid="B96">Lenth, 2023</xref>) in R, that is, calculating the predicted effect of face gender at each f0 level, there is insufficient evidence for an effect at any individual level. Averaged over voice quality levels, when the median f0 is 115 Hz, 135 Hz, or 155 Hz, the face gender effect is not credible, but trends are compatible with the aforementioned face gender effect reversal (respectively, F&#8211;M: <inline-formula>
<alternatives>
<mml:math id="Eq013-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M13">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = 0.08, CI = [&#8211;0.04, 0.20]; <inline-formula>
<alternatives>
<mml:math id="Eq014-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M14">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = 0.08, CI = [&#8211;0.04, 0.20]; <inline-formula>
<alternatives>
<mml:math id="Eq015-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M15">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = &#8211;0.08, CI = [&#8211;0.19, 0.03]). From visual observation of <xref ref-type="fig" rid="F5">Figure 5</xref>, we can see that at both 115 Hz and 135 Hz, female faces prime creakier ratings (the blue triangular points are slightly higher than the orange circular points) but that this trend reverses at 155 Hz, with male faces priming creakier ratings instead (orange circular points are higher than blue triangular points), though these observations remain speculative in the absence of statistical confirmation. In any case, the data from this experiment show that any face gender effect is limited at best.</p>
<fig id="F5">
<caption>
<p><bold>Figure 5:</bold> Predicted creakiness ratings (foreground points) with 95% credibility intervals and empirical creakiness ratings (background points) by face gender (color and shape), voice quality (x-axis) and f0 (facets).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24285-g5.png"/>
</fig>
<table-wrap id="T4">
<caption>
<p><bold>Table 4:</bold> Zero-one-inflated Bayesian regression model table for fixed effects (FG = face gender, VQ = voice quality, f0 = new median f0) on creakiness ratings. Probabilities of direction (<italic>p<sub>d</sub></italic>) of credible effects are bolded.</p>
</caption>
<table>
<tbody>
<tr>
<td align="left" valign="top"><bold>Fixed effects</bold></td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
<td align="left" valign="top"></td>
</tr>
<tr>
<td align="left" valign="top"><bold>Coefficient</bold></td>
<td align="left" valign="top"><inline-formula>
<alternatives>
<mml:math id="Eq016-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M16">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula></td>
<td align="left" valign="top"><bold><italic>SE</italic></bold> (<inline-formula>
<alternatives>
<mml:math id="Eq017-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M17">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula>)</td>
<td align="left" valign="top"><bold>95% <italic>CI</italic></bold></td>
<td align="left" valign="top"><bold><italic>p<sub>d</sub></italic></bold></td>
</tr>
<tr>
<td align="left" valign="top">Intercept</td>
<td align="left" valign="top">0.48</td>
<td align="left" valign="top">0.11</td>
<td align="left" valign="top">[0.27, 0.69]</td>
<td align="left" valign="top"><bold>100</bold></td>
</tr>
<tr>
<td align="left" valign="top">&#981; (phi) Intercept</td>
<td align="left" valign="top">2.10</td>
<td align="left" valign="top">0.10</td>
<td align="left" valign="top">[1.91, 2.30]</td>
<td align="left" valign="top"><bold>100</bold></td>
</tr>
<tr>
<td align="left" valign="top">&#945; (zoi) Intercept</td>
<td align="left" valign="top">&#8211;4.06</td>
<td align="left" valign="top">0.36</td>
<td align="left" valign="top">[&#8211;4.82, &#8211;3.42]</td>
<td align="left" valign="top"><bold>100</bold></td>
</tr>
<tr>
<td align="left" valign="top">&#947; (coi) Intercept</td>
<td align="left" valign="top">0.75</td>
<td align="left" valign="top">0.53</td>
<td align="left" valign="top">[&#8211;0.19, 1.88]</td>
<td align="left" valign="top">93.97</td>
</tr>
<tr>
<td align="left" valign="top">FG(F&#8211;M)</td>
<td align="left" valign="top">0.03</td>
<td align="left" valign="top">0.03</td>
<td align="left" valign="top">[&#8211;0.04, 0.10]</td>
<td align="left" valign="top">79.40</td>
</tr>
<tr>
<td align="left" valign="top">VQ(cr&#8211;mo)</td>
<td align="left" valign="top">1.04</td>
<td align="left" valign="top">0.12</td>
<td align="left" valign="top">[0.80, 1.30]</td>
<td align="left" valign="top"><bold>100</bold></td>
</tr>
<tr>
<td align="left" valign="top">f0_1(135&#8211;115)</td>
<td align="left" valign="top">&#8211;0.12</td>
<td align="left" valign="top">0.05</td>
<td align="left" valign="top">[&#8211;0.22, &#8211;0.02]</td>
<td align="left" valign="top"><bold>98.92</bold></td>
</tr>
<tr>
<td align="left" valign="top">f0_2(155&#8211;(135+115))</td>
<td align="left" valign="top">&#8211;0.10</td>
<td align="left" valign="top">0.03</td>
<td align="left" valign="top">[&#8211;0.16, &#8211;0.04]</td>
<td align="left" valign="top"><bold>99.98</bold></td>
</tr>
<tr>
<td align="left" valign="top">FG(F&#8211;M):VQ(cr&#8211;mo)</td>
<td align="left" valign="top">&#8211;0.04</td>
<td align="left" valign="top">0.06</td>
<td align="left" valign="top">[&#8211;0.16, 0.08]</td>
<td align="left" valign="top">73.00</td>
</tr>
<tr>
<td align="left" valign="top">FG(F&#8211;M):f0_1(135&#8211;115)</td>
<td align="left" valign="top">0.00</td>
<td align="left" valign="top">0.05</td>
<td align="left" valign="top">[&#8211;0.09, 0.09]</td>
<td align="left" valign="top">52.18</td>
</tr>
<tr>
<td align="left" valign="top">FG(F&#8211;M):f0_2(155&#8211;(135+115))</td>
<td align="left" valign="top">&#8211;0.05</td>
<td align="left" valign="top">0.02</td>
<td align="left" valign="top">[&#8211;0.10, &#8211;0.01]</td>
<td align="left" valign="top"><bold>98.98</bold></td>
</tr>
<tr>
<td align="left" valign="top">VQ(cr&#8211;mo):f0_1(135&#8211;115)</td>
<td align="left" valign="top">0.10</td>
<td align="left" valign="top">0.10</td>
<td align="left" valign="top">[&#8211;0.09, 0.29]</td>
<td align="left" valign="top">84.00</td>
</tr>
<tr>
<td align="left" valign="top">VQ(cr&#8211;mo):f0_2(155&#8211;(135+115))</td>
<td align="left" valign="top">&#8211;0.01</td>
<td align="left" valign="top">0.06</td>
<td align="left" valign="top">[&#8211;0.12, 0.10]</td>
<td align="left" valign="top">59.95</td>
</tr>
<tr>
<td align="left" valign="top">FG(F&#8211;M):VQ(cr&#8211;mo):f0_1(135&#8211;115)</td>
<td align="left" valign="top">&#8211;0.10</td>
<td align="left" valign="top">0.08</td>
<td align="left" valign="top">[&#8211;0.26, 0.06]</td>
<td align="left" valign="top">88.60</td>
</tr>
<tr>
<td align="left" valign="top">FG(F&#8211;M):VQ(cr&#8211;mo):f0_2(155&#8211;(135+115))</td>
<td align="left" valign="top">0.03</td>
<td align="left" valign="top">0.04</td>
<td align="left" valign="top">[&#8211;0.06, 0.11]</td>
<td align="left" valign="top">74.40</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Empirical individual variation plots for creakiness ratings are available in Appendix A (Figures A8, A9, and A10) but will not be extensively discussed here. Figure A8 shows that participants do not differ in the direction of the effect of voice quality: utterances containing creaky voice are consistently rated as creakier than the modal utterances across all participants. Compared to the effect of voice quality, there is much more individual variation in the effect of f0 (see Figure A9), suggesting that this effect is less robust. Likewise, there is individual variation in the direction of the effect of face gender, at least for the participants who show observable differences between gendered faces (see Figure A10). Overall, participants vary in the range of ratings they attribute to the predictors (voice quality, f0, and face gender): some participants make use of the full VAS scale while others display more moderate responses, and some exhibit large rating differences between experimental conditions whereas others show only slight differences. This between-participant variation is reflected in the by-participant varying intercept estimates in the model (sd &#981; intercept: <inline-formula>
<alternatives>
<mml:math id="Eq018-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M18">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = 0.61; sd &#945; intercept: <inline-formula>
<alternatives>
<mml:math id="Eq019-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M19">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = 1.73; sd &#947; intercept: <inline-formula>
<alternatives>
<mml:math id="Eq020-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M20">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = 1.81; see Table A7 in Appendix A).</p>
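<p>The parameter labels in the model summary table above (&#981;, zoi, coi) are consistent with a zero-one-inflated beta regression, a standard choice for bounded VAS ratings rescaled to [0, 1]. Assuming that parameterization (as implemented in, e.g., the brms R package; this mapping is our reading of the parameter names, not stated in the table itself), the likelihood can be sketched as:</p>

```latex
% Zero-one-inflated beta likelihood (sketch; alpha = zoi, gamma = coi):
%   alpha   : probability of an extreme rating (exactly 0 or 1)
%   gamma   : probability that an extreme rating is 1 rather than 0
%   mu, phi : mean and precision of the beta component for interior ratings
f(y \mid \mu, \phi, \alpha, \gamma) =
  \begin{cases}
    \alpha\,(1 - \gamma) & \text{if } y = 0,\\
    \alpha\,\gamma & \text{if } y = 1,\\
    (1 - \alpha)\,\operatorname{Beta}\!\big(y;\ \mu\phi,\ (1 - \mu)\phi\big) & \text{if } 0 < y < 1,
  \end{cases}
```

<p>with the intercepts in the table presumably reported on the respective link scales (logit for &#945; and &#947;, log for &#981;).</p>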
</sec>
<sec>
<title>4.2 Reaction time</title>
<p>For the reaction time data, long reaction times were excluded based on visual inspection of the distribution plot (see Figure A11 left panel in Appendix A), eliminating responses with reaction times over 15 seconds from the data (n = 153). The full summary of the shifted log-normal Bayesian regression model results is available in Table A12 of Appendix A. The main effect of voice quality on reaction time (in milliseconds) is significant. Overall, the modal voice condition leads to longer reaction times than the creaky condition (creak&#8211;modal: <inline-formula>
<alternatives>
<mml:math id="Eq021-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M21">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = &#8211;0.06, <inline-formula>
<alternatives>
<mml:math id="Eq022-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03C3;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M22">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\sigma
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e2.gif"/>
</alternatives>
</inline-formula> = 0.03, CI = [&#8211;0.11, &#8211;0.01]; Table A12), shown in <xref ref-type="fig" rid="F6">Figure 6</xref> (negative slopes of lines). The main effects of f0 and of face gender were not individually significant (see Table A12), but their interaction did reach the significance threshold, suggesting an opposite face gender effect at 135 Hz compared to 115 Hz (facegender*(135Hz&#8211;115Hz): <inline-formula>
<alternatives>
<mml:math id="Eq023-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03B2;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M23">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\beta
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e1.gif"/>
</alternatives>
</inline-formula> = &#8211;0.06, <inline-formula>
<alternatives>
<mml:math id="Eq024-mml">
<mml:mrow><mml:mover accent='true'><mml:mi>&#x03C3;</mml:mi><mml:mo stretchy='true'>&#x005E;</mml:mo></mml:mover></mml:mrow>
</mml:math>
<tex-math id="M24">
\documentclass[10pt]{article}
\usepackage{wasysym}
\usepackage[substack]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage[mathscr]{eucal}
\usepackage{mathrsfs}
\usepackage{pmc}
\usepackage[Euler]{upgreek}
\pagestyle{empty}
\oddsidemargin -1.0in
\begin{document}
\[
\widehat\sigma
\]
\end{document}
</tex-math>
<graphic xlink:href="labphon-17-24285-e2.gif"/>
</alternatives>
</inline-formula> = 0.03, CI = [&#8211;0.12, &#8211;0.00]; Table A12). While a comparison between estimated marginal means does not reach significance for any of the f0 levels, it appears from <xref ref-type="fig" rid="F6">Figure 6</xref> that male faces lead to longer reaction times than female faces at 135 Hz (in both modal and creaky voices) and at 155 Hz (for creaky voices), but at 115 Hz, female faces lead to longer reaction times (specifically for modal voices). Because these reaction times are relatively long and may reflect more than just immediate processing, these results should be treated with caution. However, they do provide some evidence that incongruence between the face gender and the expected f0 for that gender leads to slower responses, suggesting that participants integrated the faces and voices.</p>
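<p>The reaction time preprocessing and model family described above can be illustrated with a minimal sketch (the data values and function below are hypothetical illustrations, not the actual analysis pipeline, which fit a Bayesian shifted log-normal regression):</p>

```python
import math

# Hypothetical reaction times in seconds; the 15-second cutoff follows the
# exclusion criterion described in the text.
rts = [0.8, 1.2, 3.5, 16.2, 2.4, 20.0]
kept = [rt for rt in rts if rt <= 15.0]

# Shifted log-normal density: log(rt - shift) ~ Normal(mu, sigma), where
# shift > 0 captures a minimum (non-decision) time before a response can occur.
def shifted_lognormal_pdf(rt, mu, sigma, shift):
    if rt <= shift:
        return 0.0  # no density at or below the shift
    x = rt - shift
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))
```

<p>The shift parameter is what distinguishes this family from a plain log-normal: it accommodates the right-skewed distribution typical of reaction times while respecting a lower bound above zero.</p>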
<fig id="F6">
<caption>
<p><bold>Figure 6:</bold> Predicted reaction times (in ms) with 95% credible intervals by voice quality (x-axis), f0 (facets), and face gender (color and shape).</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="labphon-17-24285-g6.png"/>
</fig>
<p>Empirical individual variation plots for reaction time are available in Appendix A (Figure A13, A14 and A15) but will not be discussed here.</p>
</sec>
</sec>
<sec>
<title>5. Discussion</title>
<sec>
<title>5.1 General discussion of results</title>
<p>The acoustic analysis of the stimuli showed more acoustic creakiness (higher proportions of unreliable f0 tracks, lower H1*&#8211;H2*, lower CPP and lower HNR &lt; 500 Hz) for the (partially) creaky utterances compared to the modal utterances. In the creakiness perception experiment, participants rated utterances containing creaky voice as creakier than those containing little to no creak (in modal voice). These results confirm that listeners can consistently recognize and distinguish creaky voice from modal voice in our experiment, validating that our f0 and formant-altered modal and creaky voice stimuli induced perceptually distinct creakiness judgments.</p>
<p>Despite some individual variation across listeners, voices with lower median f0 were rated as creakier than voices with higher median f0 overall. This provides evidence that low pitch is closely related to perceptual creakiness, even within a relatively restricted range, aligning with previous perceptual studies (<xref ref-type="bibr" rid="B25">Davidson, 2019a</xref>; <xref ref-type="bibr" rid="B99">Li et al., 2023</xref>; <xref ref-type="bibr" rid="B150">White et al., 2024</xref>). Notably, our acoustic analyses of the stimuli did not clearly show increased creakiness for lower f0 utterances. The proportions of unreliable f0 tracks, CPP values and HNR &lt; 500 Hz values were roughly similar for all f0 levels, while only H1*&#8211;H2* values decreased (indicative of more acoustic creak) alongside f0. This suggests that low f0 is a perceptually useful cue to creaky voice, but that low f0 does not necessarily lead to stronger acoustic cues to creaky voice. However, acoustic work has found covariation between creaky voice and low f0, suggesting that the relationship may be bidirectional (e.g., in American English in <xref ref-type="bibr" rid="B27">Davidson, 2020</xref>; in White Hmong in <xref ref-type="bibr" rid="B49">Garellek et al., 2013</xref>; in Mandarin in <xref ref-type="bibr" rid="B88">Kuang, 2017</xref>; and in Cantonese in <xref ref-type="bibr" rid="B159">Zhang &amp; Kirby, 2020</xref>). Additionally, a recent study applying mediation analysis to creaky voice claims that a number of its acoustic correlates are at least partially mediated by f0 (<xref ref-type="bibr" rid="B12">Brown, 2025</xref>). While the results of this study do show trends towards raised creakiness ratings for lower-pitched modal voices specifically (around 115 Hz), they are not statistically supported. 
Nevertheless, this trend is consistent with previous work showing increased false-alarm rates for lower-pitched (roughly 100 Hz) modal voices (<xref ref-type="bibr" rid="B25">Davidson, 2019a</xref>; <xref ref-type="bibr" rid="B99">Li et al., 2023</xref>; <xref ref-type="bibr" rid="B150">White et al., 2024</xref>).</p>
<p>Face gender priming weakly influenced creakiness ratings in relation to f0 values, despite not affecting creakiness ratings in isolation. Observational trends suggest that for ambiguously-gendered voices, a female face might prime increased creakiness perceptions for lower-pitched voices, whereas a male face might prime increased creakiness perceptions for higher-pitched voices. While not identical, these results bear distinct similarities to those of White et al. (<xref ref-type="bibr" rid="B150">2024</xref>), recalling their speculation about a pitch-given-gender bias scenario. At matched mean f0 values of 135 Hz, they found more creak identified in the woman&#8217;s voice than in the man&#8217;s voice (and also more creak for a low-pitched man&#8217;s modal voice), leading them to propose that the underlying cause of this gender effect is low pitch, given the expected pitch range for the gender. This hypothesis can account for our study&#8217;s increased creakiness ratings for low-pitched voices (115 Hz and 135 Hz) primed by a female face, but fails to explain why a male face might prime listeners to perceive more creaky voice at a higher pitch (155 Hz). It is important to note, however, that this pattern of reversal is only observed in the empirical and model prediction plots and is not clearly statistically significant. If (and only if) we assume that this trend is reliable, we could argue that the pitch-given-gender bias may be less restricted than White et al. initially proposed, affecting not only <italic>low</italic> pitch given gender but also <italic>high</italic> pitch given gender in a more general expectation-reliant hypothesis. Some motivations and implications of such a hypothesis are explored further in the following sections (5.2 and 5.3).</p>
<p>As for the reaction time results, slower reaction times were observed for modal voice utterances than for creaky voice utterances, possibly indicating that listeners experienced more difficulty in determining the absence of creak. Visual trends (<xref ref-type="fig" rid="F6">Figure 6</xref>) point to longer reaction times for unexpected pitch-gender combinations, that is, for lower-pitched utterances (115 Hz) primed by a woman&#8217;s face and higher-pitched utterances (135 Hz and creaky 155 Hz) primed by a man&#8217;s face. This is consistent with the idea that linguistic perception is influenced by social expectations. Walker &amp; Hay (<xref ref-type="bibr" rid="B146">2011</xref>) show that when word age (determined by the typical age of speakers who often use that word) and voice age are congruent, lexical access is facilitated, resulting in faster reaction times and higher accuracy than when they conflict. They take this as evidence that incongruency between linguistic and social information increases processing costs. Following the same reasoning, when speaker f0 and face gender create conflicting gender expectations, this could lead to increased processing demands and uncertainty, even though participants were not asked to make a gender judgement. These reaction time effects therefore lend tentative support to the claim that our face gender manipulation created gendered social expectations for the stimuli. That said, surprisingly, this proposed effect reverses for modal voices at 155 Hz, with women&#8217;s faces showing slightly higher reaction times. However, given that our reaction times are not based on button-press responses and the data are therefore very noisy, conclusions drawn from reaction time data alone remain speculative.</p>
<p>Overall, this study shows that (Canadian English) listeners&#8217; perception of creaky voice is affected by speaker f0 even when controlling for gender-biasing formant ratios. However, creakiness is less convincingly impacted by independently providing a gender cue with a face. Situated with respect to the hypotheses or scenarios described by Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>) and White et al. (<xref ref-type="bibr" rid="B150">2024</xref>), these results provide additional evidence for an acoustic hypothesis (pitch contrast bias), though again in the opposite direction to that predicted: <italic>lower</italic> pitch (and/or a smaller modal-to-creaky-voice pitch differential) increases perceived creakiness. They also provide weak evidence for a bias hypothesis (gender bias), for which the exact direction of the bias is inconclusive but seems to differ according to pitch (see <xref ref-type="table" rid="T5">Table 5</xref>).</p>
<table-wrap id="T5">
<caption>
<p><bold>Table 5:</bold> Summary of the listener creakiness ratings across median f0 values and face genders (for both voice qualities), with respect to the hypotheses and predictions tested.</p>
</caption>
<table>
<tbody>
<tr>
<td align="left" valign="top"><bold>Creakiness ratings</bold></td>
<td align="left" valign="top"><bold>Hypothesis tested</bold></td>
<td align="left" valign="top"><bold>Hypothesis predictions</bold></td>
</tr>
<tr>
<td align="left" valign="top">155 Hz &lt; 135 Hz &lt; 115 Hz</td>
<td align="left" valign="top">Acoustic</td>
<td align="left" valign="top">low-f0 &lt; high-f0</td>
</tr>
<tr>
<td align="left" valign="top">M &lt; F (115 Hz); M &lt; F (135 Hz); F &lt; M (155 Hz)</td>
<td align="left" valign="top">Gender</td>
<td align="left" valign="top">M &lt; F</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As perception work on creaky voice has expanded, the results have revealed reliable trends. First, lower-pitched voices are perceived as creakier, which coincides with the acoustic literature that finds stronger creaky voice correlates for men (whose voices are typically lower-pitched) than for women (whose voices are typically higher-pitched). Second, briefly setting Li et al. (<xref ref-type="bibr" rid="B99">2023</xref>) aside because it does not vary f0 within gender, there is limited evidence in support of stronger creak judgements for women. Collectively, the tendency to find more creak in actual/perceived women&#8217;s voices compared to men&#8217;s is inconsistent: in Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>), it occurs only in the first study, not in the subsequent replication; in White et al. (<xref ref-type="bibr" rid="B150">2024</xref>), it is unlikely to arise in the real world, given that it is only observable when male and female voices are matched at ambiguous pitch values; and in the current study, if anything, it is weak and largely dependent on pitch as well. Thus, perception studies are <italic>not</italic> at odds with production studies of creaky voice. 
Rather, the perception of creaky voice in recent sociolinguistic and sociophonetic studies relying on impressionistic coding is at odds with i) (almost) all previous acoustic studies of creaky voice (e.g., <xref ref-type="bibr" rid="B13">Brown &amp; Sonderegger, 2025</xref>; <xref ref-type="bibr" rid="B51">Gittelson, 2021</xref>; <xref ref-type="bibr" rid="B83">Klatt &amp; Klatt, 1990</xref>); ii) older sociolinguistic studies of creaky voice (in the U.K.), also relying on impressionistic coding (e.g., <xref ref-type="bibr" rid="B63">Henton &amp; Bladon, 1988</xref>; <xref ref-type="bibr" rid="B136">Stuart-Smith, 1999</xref>); and iii) to some degree, current systematic perceptual sociophonetic studies of creaky voice (e.g., <xref ref-type="bibr" rid="B25">Davidson, 2019a</xref>; <xref ref-type="bibr" rid="B150">White et al., 2024</xref>). Given this, the central question now shifts from explaining a production-perception mismatch in creaky voice to explaining how so many recent sociolinguistic studies converge on increased creak in women&#8217;s voices through impressionistic coding. What makes the perception of creaky voice in recent impressionistic studies so different from both its acoustic realization and its perception in older impressionistic studies and controlled perception studies? This paper argues that neither a social gender bias nor an acoustic pitch-contrast bias fully accounts for the inflated perception of creak in women&#8217;s voices. Identifying the source of conflict between diametrically opposed findings on gendered creaky voice use has proven to be complex and multifaceted, requiring further investigation. The rest of this discussion is dedicated to providing directions for future sociophonetic research on creaky voice.</p>
</sec>
<sec>
<title>5.2 Questioning the validity of measures</title>
<p>If incongruent pitch-gender pairings subvert listener expectations, then how does this lead to increased perception of creak? As mentioned above (Section 5.1), White et al. (<xref ref-type="bibr" rid="B150">2024</xref>) suggest that this mechanism is related to low pitch with respect to the expected pitch of that gender. Assuming that low pitch is an important cue to creaky voice, this proposal follows. What is not so clear is why, in the current study&#8217;s results, a higher-pitched voice relative to expected pitch of a given face gender might lead to increased perceived creakiness. One possibility is that while our experiment (and potentially other creak perception experiments) was designed to evaluate perceived creakiness, listeners may not clearly be basing their ratings on creakiness alone. Listeners could be integrating other perceptual judgements of gender non-conformity, unnaturalness, or salience, for example, into their ratings. As a result, increased &#8220;creakiness&#8221; could in theory reflect increased &#8220;weirdness&#8221; to listeners instead.</p>
<p>This paper highlights how the perceptual identification of creaky voice is subject to social, cognitive, or acoustic biases. It is worth noting, however, that existing acoustic studies also suffer from methodological heterogeneity. Altogether, they implement a varied range of acoustic correlates to quantify/identify creaky voice: Some rely on one or two measures, usually low f0 or H1*&#8211;H2* (P&#233;piot, 2014; Loakes &amp; Gregory, 2020; <xref ref-type="bibr" rid="B137">Syrdal, 1996</xref>; <xref ref-type="bibr" rid="B138">Szakay, 2012</xref>; <xref ref-type="bibr" rid="B139">Szakay &amp; Torgersen, 2015</xref>); some argue for a minimum of two measures, H1*&#8211;H2* and CPP (<xref ref-type="bibr" rid="B48">Garellek &amp; Esposito, 2023</xref>; <xref ref-type="bibr" rid="B126">Seyfarth &amp; Garellek, 2018</xref>); others employ numerous measures (<xref ref-type="bibr" rid="B13">Brown &amp; Sonderegger, 2025</xref>; <xref ref-type="bibr" rid="B57">Hanson &amp; Chuang, 1999</xref>; <xref ref-type="bibr" rid="B68">Iseli et al., 2007</xref>; <xref ref-type="bibr" rid="B88">Kuang, 2017</xref>; <xref ref-type="bibr" rid="B106">Lortie et al., 2015</xref>), and even apply dimensionality-reduction methods (<xref ref-type="bibr" rid="B74">Johnson &amp; Babel, 2023</xref>; <xref ref-type="bibr" rid="B79">Keating et al., 2023a</xref>); and still others use automatic creak detection tools, which can be based on a single acoustic cue (<xref ref-type="bibr" rid="B125">Sebregts et al., 2023</xref>; <xref ref-type="bibr" rid="B140">Szakay &amp; Torgersen, 2019</xref>) or on multiple acoustic cues (<xref ref-type="bibr" rid="B67">Irons &amp; Alexander, 2016</xref>; <xref ref-type="bibr" rid="B88">Kuang, 2017</xref>). This variety is a testament to the lack of consensus on the precise set of acoustic correlates of creak. 
The psychoacoustic model of voice (<xref ref-type="bibr" rid="B85">Kreiman et al., 2014</xref>; <xref ref-type="bibr" rid="B86">Kreiman et al., 2021</xref>) (introduced in Section 2.2) describes various acoustic cues as perceptually validated and individually necessary to the analysis of voice, and Keating et al.&#8217;s (<xref ref-type="bibr" rid="B79">2023a</xref>) cross-linguistic multi-dimensional analysis of voice argues for implementing even more cues. However, few perceptual studies since (and even fewer perceptual studies of <italic>creaky</italic> voice) have made use of all of these reportedly key parameters, usually selecting only a subset. As such, another possible explanation for the production-perception mismatch is that previous studies do not use acoustic measures that reflect perceptual creakiness ratings. Perhaps dimensionality-reduction methods could better integrate the various acoustic measures, including those from the psychoacoustic model of voice, and better represent the multi-dimensionality of the holistic percept of creaky voice. At the same time, some acoustic cues may be less perceptually relevant to creak than presumed in previous acoustic studies (e.g., <xref ref-type="bibr" rid="B66">Huang, 2019</xref>; <xref ref-type="bibr" rid="B78">Keating et al., 2023b</xref>; <xref ref-type="bibr" rid="B81">Khan et al., 2015</xref>). Without more in-depth empirical examination of this proposal, it cannot be definitively refuted. That said, given the abundance, breadth, and varied use of cues, as well as the consistency of results in the previous acoustic literature on creak, it would be surprising if a slight modification to the method of analyzing acoustic measures suddenly resulted in opposite findings (i.e., findings that would support more creak acoustically for women).</p>
<p>Another alternative is that the use of creaky voice varies by gender in a way that is not captured by rates or averages used in previous studies and is more perceptually salient in women than in men. Investigations into creaky voice thus far have largely been restricted to descriptions of prevalence or average quantities of creak in the voice, often ignoring nuances in creaky voice usage. Burin (<xref ref-type="bibr" rid="B15">2022</xref>), for instance, notes more variability in women&#8217;s production of creaky voice than in men&#8217;s in American English, and increased pitch variability for women is reported to be widespread (e.g., <xref ref-type="bibr" rid="B50">Gisladottir et al., 2023</xref>). It is therefore conceivable that there exist gender differences in creaky voice usage as a function of creak location, intensity, type, or variance, for example. Whether these differences in use map to more perceptual salience remains empirically understudied, however.</p>
</sec>
<sec>
<title>5.3 Effects of expectation, awareness, and salience</title>
<p>In traditional impressionistic studies of creaky voice (e.g., <xref ref-type="bibr" rid="B63">Henton &amp; Bladon, 1988</xref>, <xref ref-type="bibr" rid="B136">Stuart-Smith, 1999</xref>), the social expectation was for men, older upper-class men in particular, to have creakier voices. This expectation is in alignment with phonetic work corroborating this pattern in voice acoustics (e.g., <xref ref-type="bibr" rid="B83">Klatt &amp; Klatt, 1990</xref>) and voice articulation, indicating physiological differences in vocal fold length and thickness as a function of gender (e.g., <xref ref-type="bibr" rid="B64">Hollien, 1974</xref>). Prior to the early 21st century, social expectations and acoustic expectations of voice were compatible. As of roughly 2010, it has become apparent that social expectations of voice have shifted to women exhibiting creakier voices (at least among most North American, English-speaking communities), despite a clear lack of evidence for any change in articulation (e.g., <xref ref-type="bibr" rid="B158">Zhang, 2021</xref>) or acoustics (e.g., <xref ref-type="bibr" rid="B13">Brown &amp; Sonderegger, 2025</xref>) over time. At present, it appears that social expectations of voice conflict with phonetic expectations of voice.</p>
<p>Labov (<xref ref-type="bibr" rid="B90">1972</xref>) proposes a classification of sociolinguistic variables (i.e., socially-stratified linguistic behavior) as a function of speaker/listener awareness. The concept of awareness is then often attributed to the <italic>salience</italic> of the linguistic variable (but see <xref ref-type="bibr" rid="B69">Jaeger &amp; Weatherholtz, 2016</xref>, for a more information-theoretic model of salience as a function of surprisal, frequency, and perceived social informativeness). Labov describes three levels of awareness: 1) <italic>indicators</italic> are below the threshold of conscious awareness, never noticed by speakers/listeners; 2) <italic>markers</italic> are typically perceived at least subconsciously, as speakers can manipulate their usage of the variable depending on context, suggesting implicit awareness of them; and 3) <italic>stereotypes</italic> are socially marked (often stigmatized) features characterized by explicit awareness, usually subject to meta-linguistic commentary. Considering this classification, creaky voice does not clearly fall under one single level of awareness. In American popular discourse, creaky voice (better known as vocal fry) is likely considered a stereotype associated with young women. Speakers and listeners are highly conscious of it, and speakers are often advised to avoid using it due to its controversial social evaluation. These strong opinions in the mainstream media have the potential to influence general perception of creaky voice but are less likely to be an entirely faithful representation of widespread views of creak (see Coupland, 2014; Trudgill, 2014, for discussions of media and language). More broadly, it seems that creaky voice could be considered a marker, used by different social groups but often in a subconscious way, i.e., speakers are not fully aware that they are using it. 
As such, it is possible that creaky voice variation by gender is twofold, acting as a stereotype when women use it but as a mere marker when men use it. In the current state of affairs, these two pressures are at odds, pulling creaky voice perception in two different directions and obscuring direct gender effects (independent of f0) in this study. In fact, explicit and implicit social evaluations might induce different perceptual behaviors and involve separate cognitive processing (see <xref ref-type="bibr" rid="B41">Evans, 2008</xref>, for a review). From another perspective, Juskan (<xref ref-type="bibr" rid="B76">2016</xref>) holds that the first necessary condition for effective priming is sufficient social salience of the variable, and conflicting social expectations might prevent the satisfaction of this condition.</p>
<p>Campbell-Kibler (<xref ref-type="bibr" rid="B20">2009</xref>) introduces a related concept, sociolinguistic &#8220;bullet-proofing,&#8221; showing that the social meaning of the variable (ING) (i.e., the use of -ing vs. -in) depends on listeners&#8217; perceptions of the speaker. While -ing is generally associated with higher intelligence and -in with lower intelligence, this pattern only emerged for working-class non-Southerners: Southerners were consistently rated as less intelligent regardless of their (ING) use, and middle-to-upper-class (non-Southern) speakers were effectively bullet-proof to any effect of (ING) variation on perceived intelligence. This demonstrates that the social evaluation of linguistic variation can differ greatly as a function of listeners&#8217; expectations for, or stereotypes about, distinct social groups. It is possible that men&#8217;s voices are bullet-proof to creaky voice in the sense that its use is not perceived as socially significant for them and is therefore less salient or perceptible to the listener.</p>
<p>Another possible explanation for the increase in reported creakiness in women&#8217;s speech found in recent sociolinguistic studies is that creaky voice in men may undergo perceptual normalization due to its frequency, an automatic process below the level of consciousness. Because creaky voice is acoustically common in men&#8217;s voices and listeners are frequently exposed to instances of creak in men&#8217;s voices, its presence may be filtered out or perceived as less salient (akin to exemplar-based categorization, see <xref ref-type="bibr" rid="B36">Drager &amp; Kirtley, 2016</xref>, or <xref ref-type="bibr" rid="B73">Johnson, 1997</xref>). This process would be comparable to the reduced salience of creak in prosodically expected positions, where it is consistently identified less often utterance-finally (<xref ref-type="bibr" rid="B25">Davidson, 2019a</xref>; <xref ref-type="bibr" rid="B99">Li et al., 2023</xref>). It could make creaky voice in men less noticeable and thus under-identified in impressionistic coding, artificially inflating the identification of creak in women&#8217;s speech. However, this account does not explain why controlled perceptual experiments such as Davidson (<xref ref-type="bibr" rid="B25">2019a</xref>) do not similarly show increased creak judgments for women. If creaky voice in men were genuinely less salient, we might expect listeners in these perception studies to rate male voices as less creaky overall. Given that the opposite occurs, normalization effects alone cannot account for the diverging gender asymmetries found in impressionistic sociolinguistic studies and controlled perceptual experiments.</p>
</sec>
<sec>
<title>5.4 Limitations of the current study</title>
<p>Participant demographic characteristics were not considered in this analysis, following Davidson&#8217;s (<xref ref-type="bibr" rid="B25">2019a</xref>) finding that listener gender did not affect creaky voice identification and White et al.&#8217;s (<xref ref-type="bibr" rid="B150">2024</xref>) concurring decision not to account for listener effects. There is some evidence, however, that listener characteristics can influence perception (e.g., <xref ref-type="bibr" rid="B155">Yu &amp; Zellou, 2019</xref>, among others). Preliminary empirical plots by listener gender and age are provided in Figures A16, A17, and A18 (Appendix A); they suggest a possible gender effect whereby female listeners are more likely than male listeners to rate voices as creakier, but no consistent effects of age. As such, including listener demographic information in formal models of creaky voice ratings may allow for a more thorough understanding of creaky voice perception.</p>
<p>In addition, closer examination of the stimuli and questionnaire data from the creak experiment provided some useful information about how participants perceived the stimuli and felt about the experiment. Generally, participants did not find the task very difficult. A few participants noted that the voices did not match the faces or that they ignored the faces entirely, suggesting that they may have been less influenced by the face gender primes; this could explain the weak face gender effect observed in this study. Other participants remarked that the voices/audio sounded a bit unnatural, distorted, or noisy, which could be a concern for the ecological validity of the stimuli. The Praat &#8220;Change gender&#8221; function did seem to introduce some distortion into the voice signal (both the original recordings and manipulated stimuli can be accessed on the OSF page), possibly leading some listeners to identify creakiness even in the modally-voiced utterances, as seen in <xref ref-type="fig" rid="F4">Figure 4</xref> above. The modal stimuli also appeared to be produced at a slightly faster speech rate than the creaky stimuli (see Figure A5 in Appendix A), which could facilitate creak (<xref ref-type="bibr" rid="B13">Brown &amp; Sonderegger, 2025</xref>), potentially also contributing to the somewhat creaky ratings of the modal stimuli. As a result, differences in creakiness ratings between modal and creaky stimuli may be less pronounced than intended.</p>
<p>Moreover, in the questionnaire data, a few participants shared that they were familiar with creaky voice; one participant explicitly stated their dislike for it. However, a systematic assessment of listeners&#8217; attitudes towards creak, or their stereotypes about it prior to participation in the experiment, was not included in this study. It has not yet been empirically demonstrated that Canadian English listeners hold the same social biases as American English listeners, which could explain the weaker face gender priming in the current study. There is some evidence (also from a matched-guise face priming design) that Canadian English listeners differ from American English listeners in their perception of speech and stereotypes (<xref ref-type="bibr" rid="B89">Kutlu et al., 2022</xref>). Nevertheless, it has been shown that Canadian listeners do hold negative judgements of women who use creaky voice (<xref ref-type="bibr" rid="B53">Goodine &amp; Johns, 2014</xref>) similar to those of American listeners (e.g., <xref ref-type="bibr" rid="B4">Anderson et al., 2014</xref>). Narratives critical of creaky voice are likely to reach Canadian audiences due to the pervasive influence of American media, complemented by coverage in mainstream Canadian outlets and internet blogs/op-eds (e.g., <xref ref-type="bibr" rid="B22">Chattopadhyay, 2015</xref>; <xref ref-type="bibr" rid="B149">Weber, 2017</xref>). Furthermore, the training phase itself may have affected participants&#8217; responses. The presentation of two stereotyped and comparably extreme examples of creak could have inadvertently raised the perceptual threshold for what counts as creaky, possibly decreasing sensitivity to local and/or more subtle instances of creak. 
On the other hand, although our description of creaky voice deliberately avoided reference to low f0/pitch, participants still tended to associate lower f0s with greater creakiness, which supports the reliability of this effect.</p>
<p>It is also possible that some of the ambiguous voices and their pairings with some of the faces led to transgender or non-binary gender voice percepts. Gender expectations and stereotypes related to transgender and non-binary creaky voice use remain severely understudied (<xref ref-type="bibr" rid="B39">Eckert &amp; Podesva, 2021</xref>), and the few studies that do exist present results that are difficult to interpret. As a consequence, it is unclear how and to what extent such percepts might impact listener creakiness ratings. Becker et al. (<xref ref-type="bibr" rid="B8">2022</xref>) find that creaky voice variation is predicted by gender only in interaction with speech style and hypothesize that larger style-shifts amongst non-binary AFAB individuals on testosterone and trans men (regardless of hormonal status) may be attributable to an avoidance of features ideologically associated with cis women&#8217;s speech. Their findings are indicative of the complex relationship between creaky voice use and projections of personal identity (with respect to gender in this case).</p>
<p>By the same token, it is possible that the (often negative) stereotypes surrounding young women&#8217;s creaky voice are not specific to creaky voice, but rather to a particular persona. The combination of individual meaningful elements (e.g., linguistic features) to construct a broader, more complex meaningful entity (e.g., a persona) is referred to as <italic>bricolage</italic> within the domain of third-wave sociolinguistics (<xref ref-type="bibr" rid="B38">Eckert, 2008</xref>; originating from the fields of sociology and anthropology, <xref ref-type="bibr" rid="B61">Hebdige, 1984</xref>; <xref ref-type="bibr" rid="B98">L&#233;vi-Strauss, 1962</xref>). In a study of a young gay doctor&#8217;s speech, Podesva (<xref ref-type="bibr" rid="B119">2007</xref>) determined that this speaker made use of extensive falsetto voice quality (greater duration, f0 range, and f0 maximum), exaggerated stop releases (longer and more intense bursts), and lexical choices (e.g., &#8220;dear&#8221;) to form a so-called diva persona. Likewise, in some cases, creaky voice may be used concurrently with other sociolinguistic variables to construct specific personae. Generalization to exact personae is not possible at present: Creaky voice has been proposed to index character or affective traits like aloofness, disengagement, and negativity (e.g., <xref ref-type="bibr" rid="B121">Podesva, 2018</xref>; <xref ref-type="bibr" rid="B122">Pratt, 2018</xref>), but also upward mobility, authority, and toughness (e.g., <xref ref-type="bibr" rid="B30">Dilley et al., 1996</xref>; <xref ref-type="bibr" rid="B111">Mendoza-Denton, 2011</xref>; <xref ref-type="bibr" rid="B156">Yuasa, 2010</xref>). 
For instance, creaky voice in combination with <italic>uptalk</italic> and the use of &#8220;like&#8221; as a discourse marker could convey the persona of a ditzy Millennial or Gen Z woman, whereas in combination with low modal pitch and slow speech rate it could convey the entirely different persona of a confident businesswoman. These distinct personae, rather than creaky voice use specifically, could then prompt different social and affective evaluations by listeners, influencing their perception of creaky voice in a variety of ways.</p>
</sec>
</sec>
<sec>
<title>6. Conclusion</title>
<p>This study found evidence that speaker f0 affects the perception of creaky voice, with low f0s increasing creakiness ratings even within a gender-neutral pitch range with gender-neutral formant structure. While face gender primes did not independently influence creakiness ratings, subtle interactions between perceived face gender and f0 suggest that socio-physiological expectations may shape listener judgements to some extent, especially when pitch and gender cues are incongruent. These effects were weak at best, however, failing to provide convincing evidence for a robust social gender bias in creaky voice perception. Crucially, our findings do not support the notion that increased creak perception in women&#8217;s voices can be explained by a widespread gender bias or an acoustic bias alone. Instead, they point to a more complex interplay between acoustic cues and social expectations, one that may be contingent on methodological decisions, listener awareness, and/or the salience of gendered voice norms. Alongside previous systematic production and perception studies of creaky voice, the present results, which show a marked lack of empirical evidence for greater perception of creakiness in women&#8217;s voices, cast further doubt on the pervasive narrative&#8212;perpetuated in popular discourse and recent influential sociolinguistic studies&#8212;that women are creakier. From a methodological standpoint, these findings raise important questions about impressionistic coding practices in analyses of voice, especially when divorced from acoustic or perceptual validation.</p>
<p>Broadly, this paper explores how acoustic and social cues to speaker identity can influence listener voice perception. It highlights the limitations of phonetically-grounded but socially agnostic models of voice perception, advocating for more nuanced approaches that integrate both acoustic and social information into models of speech cognition. By bringing together acoustic, perceptual, and social dimensions of speech, we can better understand how creaky voice manifests in production, molds listener judgements, and reveals broader patterns of language use and social meaning.</p>
</sec>
<sec>
<title>Data accessibility statement</title>
<p>All stimuli, code and data are provided on the paper&#8217;s OSF page at <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://osf.io/f45yh/">https://osf.io/f45yh/</ext-link>, doi: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://doi.org/10.17605/osf.io/f45yh">10.17605/osf.io/f45yh</ext-link>.</p>
</sec>
<sec>
<title>Additional file</title>
<p>The additional file for this article can be found as follows:</p>
<list list-type="bullet">
<list-item><p><bold>Appendices.</bold> DOI: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://doi.org/10.16995/labphon.24285.s1">https://doi.org/10.16995/labphon.24285.s1</ext-link></p></list-item>
</list>
</sec>
</body>
<back>
<fn-group>
<fn id="n1"><p><italic>Varying</italic> effects are the Bayesian equivalent to <italic>random</italic> effects in frequentist models (<xref ref-type="bibr" rid="B108">McElreath, 2018</xref>).</p></fn>
</fn-group>
<sec>
<title>Ethics and consent</title>
<p>This research was reviewed and approved by the McGill Research Ethics Board 2 under REB# 419&#8211;0319. All participants provided informed consent to participate in the study, in accordance with ethical research guidelines.</p>
</sec>
<sec>
<title>Acknowledgements</title>
<p>We are grateful to Morgan Sonderegger for his expert insight, discussion, and comments on various versions of this work, to Eleanor Chodroff for her guidance on methodological decisions during the project&#8217;s conception, and to Abby Walker and Charlotte Vaughn for the thoughtful conversations that helped sharpen our thinking. We also appreciate the contributions of colleagues at McGill in MCQLL and P* and audiences at [mot<sup>h</sup>]2025 for their feedback and discussions. We thank two anonymous reviewers and the associate editor, Yao Yao, whose engaged feedback and constructive suggestions substantially strengthened this manuscript.</p>
</sec>
<sec>
<title>Funding information</title>
<p>This work was supported by the Social Sciences and Humanities Research Council of Canada [435-2024-0996] grant to Meghan Clayards, as well as the Fonds de recherche du Qu&#233;bec &#8211; Soci&#233;t&#233; et culture [373124] grant, the Arts Graduate Research Enhancement and Travel Awards program at McGill, and the CRBLM Travel Award to Jeanne Brown.</p>
</sec>
<sec>
<title>Competing interests</title>
<p>Meghan Clayards is a voluntary member of the editorial board for <italic>Laboratory Phonology</italic>. All other authors declare that they have no competing interests.</p>
</sec>
<sec>
<title>Author contributions</title>
<p>Jeanne Brown and Meghan Clayards jointly conceptualized and designed the study and interpreted the results. Jeanne Brown created the stimuli, conducted the experiment, led the data analysis, and drafted the manuscript. Meghan Clayards supervised the project and provided funding. Both authors revised, read, and approved the submitted version of the manuscript.</p>
</sec>
<ref-list>
<ref id="B1"><mixed-citation publication-type="journal"><string-name><surname>Abdelli-Beruh</surname>, <given-names>N. B.</given-names></string-name>, <string-name><surname>Wolk</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Slavin</surname>, <given-names>D.</given-names></string-name> (<year>2014</year>). <article-title>Prevalence of vocal fry in young adult male American English speakers</article-title>. <source>Journal of Voice</source>, <volume>28</volume>(<issue>2</issue>), <fpage>185</fpage>&#8211;<lpage>190</lpage>. <pub-id pub-id-type="doi">10.1016/j.jvoice.2013.08.011</pub-id></mixed-citation></ref>
<ref id="B2"><mixed-citation publication-type="book"><string-name><surname>Abercrombie</surname>, <given-names>D.</given-names></string-name> (<year>1967</year>). <source>Elements of general phonetics</source>. <publisher-name>Edinburgh University Press</publisher-name>.</mixed-citation></ref>
<ref id="B3"><mixed-citation publication-type="journal"><string-name><surname>Alderton</surname>, <given-names>R.</given-names></string-name> (<year>2020</year>). <article-title>Speaker gender and salience in sociolinguistic speech perception: Goose-fronting in Standard Southern British English</article-title>. <source>Journal of English Linguistics</source>, <volume>48</volume>(<issue>1</issue>), <fpage>72</fpage>&#8211;<lpage>96</lpage>. <pub-id pub-id-type="doi">10.1177/0075424219896400</pub-id></mixed-citation></ref>
<ref id="B4"><mixed-citation publication-type="journal"><string-name><surname>Anderson</surname>, <given-names>R. C.</given-names></string-name>, <string-name><surname>Klofstad</surname>, <given-names>C. A.</given-names></string-name>, <string-name><surname>Mayew</surname>, <given-names>W. J.</given-names></string-name>, &amp; <string-name><surname>Venkatachalam</surname>, <given-names>M.</given-names></string-name> (<year>2014</year>). <article-title>Vocal fry may undermine the success of young women in the labor market</article-title>. <source>PLoS ONE</source>, <volume>9</volume>(<issue>5</issue>), <elocation-id>e97506</elocation-id>. <pub-id pub-id-type="doi">10.1371/journal.pone.0097506</pub-id></mixed-citation></ref>
<ref id="B5"><mixed-citation publication-type="journal"><string-name><surname>Anwyl-Irvine</surname>, <given-names>A. L.</given-names></string-name>, <string-name><surname>Massonni&#233;</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Flitton</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Kirkham</surname>, <given-names>N.</given-names></string-name>, &amp; <string-name><surname>Evershed</surname>, <given-names>J. K.</given-names></string-name> (<year>2020</year>). <article-title>Gorilla in our midst: An online behavioral experiment builder</article-title>. <source>Behaviour Research Methods</source>, <volume>52</volume>(<issue>1</issue>), <fpage>388</fpage>&#8211;<lpage>407</lpage>. <pub-id pub-id-type="doi">10.3758/s13428-019-01237-x</pub-id></mixed-citation></ref>
<ref id="B6"><mixed-citation publication-type="book"><string-name><surname>Auer</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Hinskens</surname>, <given-names>F.</given-names></string-name> (<year>2005</year>). <chapter-title>The role of interpersonal accommodation in a theory of language change</chapter-title>. In <string-name><given-names>P.</given-names> <surname>Auer</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Hinskens</surname></string-name>, &amp; <string-name><given-names>P.</given-names> <surname>Kerswill</surname></string-name> (Eds.), <source>Dialect change: Convergence and divergence in European languages</source> (pp. <fpage>335</fpage>&#8211;<lpage>357</lpage>). <publisher-name>Cambridge University Press</publisher-name>. <pub-id pub-id-type="doi">10.1017/CBO9780511486623</pub-id></mixed-citation></ref>
<ref id="B7"><mixed-citation publication-type="journal"><string-name><surname>Babel</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Russell</surname>, <given-names>J.</given-names></string-name> (<year>2015</year>). <article-title>Expectations and speech intelligibility</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>137</volume>(<issue>5</issue>), <fpage>2823</fpage>&#8211;<lpage>2833</lpage>. <pub-id pub-id-type="doi">10.1121/1.4919317</pub-id></mixed-citation></ref>
<ref id="B8"><mixed-citation publication-type="journal"><string-name><surname>Becker</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Khan</surname>, <given-names>S. u. D.</given-names></string-name>, &amp; <string-name><surname>Zimman</surname>, <given-names>L.</given-names></string-name> (<year>2022</year>). <article-title>Beyond binary gender: Creaky voice, gender, and the variationist enterprise</article-title>. <source>Language Variation and Change</source>, <volume>34</volume>(<issue>2</issue>), <fpage>215</fpage>&#8211;<lpage>238</lpage>. <pub-id pub-id-type="doi">10.1017/S0954394522000138</pub-id></mixed-citation></ref>
<ref id="B9"><mixed-citation publication-type="journal"><string-name><surname>Blomgren</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Ng</surname>, <given-names>M. L.</given-names></string-name>, &amp; <string-name><surname>Gilbert</surname>, <given-names>H. R.</given-names></string-name> (<year>1998</year>). <article-title>Acoustic, aerodynamic, physiologic, and perceptual properties of modal and vocal fry registers</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>103</volume>(<issue>5</issue>), Pt 1, <fpage>2649</fpage>&#8211;<lpage>2658</lpage>. <pub-id pub-id-type="doi">10.1121/1.422785</pub-id></mixed-citation></ref>
<ref id="B10"><mixed-citation publication-type="webpage"><string-name><surname>Boersma</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Weenink</surname>, <given-names>D.</given-names></string-name> (<year>1992&#8211;2025</year>). <article-title>Praat: Doing phonetics by computer [Computer program]</article-title>. Version 6.1.16, retrieved 12 August 2020. <uri>https://www.praat.org</uri></mixed-citation></ref>
<ref id="B11"><mixed-citation publication-type="book"><string-name><surname>Bouavichith</surname>, <given-names>D. A.</given-names></string-name>, <string-name><surname>Calloway</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Craft</surname>, <given-names>J. T.</given-names></string-name>, <string-name><surname>Hildebrandt</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Tobin</surname>, <given-names>S. J.</given-names></string-name>, &amp; <string-name><surname>Beddor</surname>, <given-names>P. S.</given-names></string-name> (<year>2019</year>). <chapter-title>Perceptual influences of social and linguistic priming are bidirectional</chapter-title>. In <string-name><given-names>S.</given-names> <surname>Calhoun</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Escudero</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Tabain</surname></string-name> &amp; <string-name><given-names>P.</given-names> <surname>Warren</surname></string-name> (Eds.), <source>Proceedings of the 19th International Congress of Phonetic Sciences</source>, <fpage>1039</fpage>&#8211;<lpage>1043</lpage>. <publisher-name>Australasian Speech Science and Technology Association and International Phonetic Association</publisher-name>.</mixed-citation></ref>
<ref id="B12"><mixed-citation publication-type="webpage"><string-name><surname>Brown</surname>, <given-names>J.</given-names></string-name> (<year>2025</year>). <chapter-title>Acoustic correlates in the production of creaky voice: Mediation by f0 [Oral presentation]</chapter-title>. <source>72nd Annual CLA Conference</source>. <publisher-loc>Montreal, Canada</publisher-loc>. <uri>https://cla-acl.ca/programmes/congres-de-2025-meeting.html</uri></mixed-citation></ref>
<ref id="B13"><mixed-citation publication-type="journal"><string-name><surname>Brown</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Sonderegger</surname>, <given-names>M.</given-names></string-name> (<year>2025</year>). <article-title>A sociophonetic study of creaky voice across language, gender and age in Canadian English-French bilinguals</article-title>. <source>Journal of Phonetics</source>, <volume>112</volume>, <elocation-id>101431</elocation-id>. <pub-id pub-id-type="doi">10.1016/j.wocn.2025.101431</pub-id></mixed-citation></ref>
<ref id="B14"><mixed-citation publication-type="webpage"><string-name><surname>Brubaker</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Whitfield</surname>, <given-names>J. A.</given-names></string-name> &amp; <string-name><surname>Schoonmaker Rodgers</surname>, <given-names>J.</given-names></string-name> (<year>2016</year>). <chapter-title>Fundamental frequency characteristics of modal and vocal fry registers</chapter-title>. <source>Honors Projects, 267</source>. <publisher-name>Bowling Green State University</publisher-name>. <uri>https://scholarworks.bgsu.edu/honorsprojects/267</uri></mixed-citation></ref>
<ref id="B15"><mixed-citation publication-type="thesis"><string-name><surname>Burin</surname>, <given-names>L.</given-names></string-name> (<year>2022</year>). <source>Perception and accommodation among French learners of English: An acoustic and electroglottographic study of creaky voice</source>. [Doctoral dissertation, <publisher-name>Universit&#233; Paris Cit&#233;</publisher-name>].</mixed-citation></ref>
<ref id="B16"><mixed-citation publication-type="journal"><string-name><surname>B&#252;rkner</surname>, <given-names>P.</given-names></string-name> (<year>2017</year>). <article-title>brms: An R package for Bayesian multilevel models using Stan</article-title>. <source>Journal of Statistical Software</source>, <volume>80</volume>(<issue>1</issue>), <fpage>1</fpage>&#8211;<lpage>28</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v080.i01</pub-id>.</mixed-citation></ref>
<ref id="B17"><mixed-citation publication-type="journal"><string-name><surname>Calhoun</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>White</surname>, <given-names>H.</given-names></string-name> (<year>2025</year>). <article-title>What makes iconic pitch associations &#8220;natural&#8221;: The effect of age on affective meanings of uptalk and creak</article-title>. <source>Language and Speech</source>, <volume>238309251314863</volume>. <pub-id pub-id-type="doi">10.1177/00238309251314863</pub-id></mixed-citation></ref>
<ref id="B18"><mixed-citation publication-type="thesis"><string-name><surname>Callier</surname>, <given-names>P.</given-names></string-name> (<year>2013</year>). <source>Linguistic context and the social meaning of voice quality variation</source>. [Doctoral dissertation, <publisher-name>Georgetown University</publisher-name>].</mixed-citation></ref>
<ref id="B19"><mixed-citation publication-type="journal"><string-name><surname>Campanella</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Belin</surname>, <given-names>P.</given-names></string-name> (<year>2007</year>). <article-title>Integrating face and voice in person perception</article-title>. <source>Trends in Cognitive Sciences</source>, <volume>11</volume>(<issue>12</issue>), <fpage>535</fpage>&#8211;<lpage>543</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2007.10.001</pub-id></mixed-citation></ref>
<ref id="B20"><mixed-citation publication-type="journal"><string-name><surname>Campbell-Kibler</surname>, <given-names>K.</given-names></string-name> (<year>2009</year>). <article-title>The nature of sociolinguistic perception</article-title>. <source>Language Variation and Change</source>, <volume>21</volume>(<issue>1</issue>), <fpage>135</fpage>&#8211;<lpage>156</lpage>. <pub-id pub-id-type="doi">10.1017/S0954394509000052</pub-id></mixed-citation></ref>
<ref id="B21"><mixed-citation publication-type="book"><string-name><surname>Catford</surname>, <given-names>J. C.</given-names></string-name> (<year>1964</year>). <chapter-title>Phonation types: The classification of some laryngeal components of speech production</chapter-title>. In <string-name><given-names>D.</given-names> <surname>Abercrombie</surname></string-name>, <string-name><given-names>D. B.</given-names> <surname>Fry</surname></string-name>, <string-name><given-names>P. A. D.</given-names> <surname>MacCarthy</surname></string-name>, <string-name><given-names>N. C.</given-names> <surname>Scott</surname></string-name>, <string-name><given-names>J. L. M.</given-names> <surname>Trim</surname></string-name> (Eds.), <source>In honour of Daniel Jones: Papers contributed on the occasion of his eightieth birthday</source>, <day>12</day> <month>September</month> 1961 (pp. <fpage>26</fpage>&#8211;<lpage>37</lpage>). <publisher-name>Longmans</publisher-name>.</mixed-citation></ref>
<ref id="B22"><mixed-citation publication-type="webpage"><string-name><surname>Chattopadhyay</surname>, <given-names>P.</given-names></string-name> (<year>2015</year>, <month>July</month> <day>28</day>). <chapter-title>&#8216;Vocal Fry&#8217; undermines empowered young women, says Naomi Wolf</chapter-title>. <source>The Current</source> [Radio broadcast]. <publisher-name>CBC Radio</publisher-name>. <uri>https://www.cbc.ca/radio/thecurrent/the-current-for-july-28-2015-1.3170502/vocal-fry-undermines-empowered-young-women-says-naomi-wolf-1.3170511</uri></mixed-citation></ref>
<ref id="B23"><mixed-citation publication-type="journal"><string-name><surname>Crowhurst</surname>, <given-names>M. J.</given-names></string-name> (<year>2018</year>). <article-title>The influence of varying vowel phonation and duration on rhythmic grouping biases among Spanish and English speakers</article-title>. <source>Journal of Phonetics</source>, <volume>66</volume>, <fpage>82</fpage>&#8211;<lpage>99</lpage>. <pub-id pub-id-type="doi">10.1016/j.wocn.2017.09.001</pub-id></mixed-citation></ref>
<ref id="B24"><mixed-citation publication-type="journal"><string-name><surname>Dallaston</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name><surname>Docherty</surname>, <given-names>G.</given-names></string-name> (<year>2020</year>). <article-title>The quantitative prevalence of creaky voice (vocal fry) in varieties of English: A systematic review of the literature</article-title>. <source>PLoS ONE</source>, <volume>15</volume>(<issue>3</issue>). <elocation-id>e0229960</elocation-id>. <pub-id pub-id-type="doi">10.1371/journal.pone.0229960</pub-id></mixed-citation></ref>
<ref id="B25"><mixed-citation publication-type="journal"><string-name><surname>Davidson</surname>, <given-names>L.</given-names></string-name> (<year>2019a</year>). <article-title>The effects of pitch, gender, and prosodic context on the identification of creaky voice</article-title>. <source>Phonetica</source>, <volume>76</volume>(<issue>4</issue>), <fpage>235</fpage>&#8211;<lpage>262</lpage>. <pub-id pub-id-type="doi">10.1159/000490948</pub-id></mixed-citation></ref>
<ref id="B26"><mixed-citation publication-type="book"><string-name><surname>Davidson</surname>, <given-names>L.</given-names></string-name> (<year>2019b</year>). <chapter-title>Perceptual coherence of creaky voice qualities</chapter-title>. In <string-name><given-names>S.</given-names> <surname>Calhoun</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Escudero</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Tabain</surname></string-name> &amp; <string-name><given-names>P.</given-names> <surname>Warren</surname></string-name> (Eds.), <source>Proceedings of the 19th International Congress of Phonetic Sciences</source>, <fpage>147</fpage>&#8211;<lpage>151</lpage>. <publisher-name>Australasian Speech Science and Technology Association and International Phonetic Association</publisher-name>.</mixed-citation></ref>
<ref id="B27"><mixed-citation publication-type="journal"><string-name><surname>Davidson</surname>, <given-names>L.</given-names></string-name> (<year>2020</year>). <article-title>Contributions of modal and creaky voice to the perception of habitual pitch</article-title>. <source>Language</source>, <volume>96</volume>(<issue>1</issue>), <fpage>e22</fpage>&#8211;<lpage>e37</lpage>. <pub-id pub-id-type="doi">10.1353/lan.2020.0013</pub-id></mixed-citation></ref>
<ref id="B28"><mixed-citation publication-type="journal"><string-name><surname>Davidson</surname>, <given-names>L.</given-names></string-name> (<year>2021</year>). <article-title>The versatility of creaky phonation: Segmental, prosodic, and sociolinguistic uses in the world&#8217;s languages</article-title>. <source>Wiley Interdisciplinary Reviews Cognitive Science</source>, <volume>12</volume>(<issue>3</issue>), <elocation-id>1547</elocation-id>. <pub-id pub-id-type="doi">10.1002/wcs.1547</pub-id></mixed-citation></ref>
<ref id="B29"><mixed-citation publication-type="journal"><string-name><surname>DeBruine</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Jones</surname>, <given-names>B.</given-names></string-name> (<year>2017</year>). <article-title>Face Research Lab London Set [Dataset]</article-title>. <pub-id pub-id-type="doi">10.6084/m9.figshare.5047666.v5</pub-id></mixed-citation></ref>
<ref id="B30"><mixed-citation publication-type="journal"><string-name><surname>Dilley</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Shattuck-Hufnagel</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Ostendorf</surname>, <given-names>M.</given-names></string-name> (<year>1996</year>). <article-title>Glottalization of word-initial vowels as a function of prosodic structure</article-title>. <source>Journal of Phonetics</source>, <volume>24</volume>, <fpage>423</fpage>&#8211;<lpage>444</lpage>. <pub-id pub-id-type="doi">10.1006/jpho.1996.0023</pub-id></mixed-citation></ref>
<ref id="B31"><mixed-citation publication-type="journal"><string-name><surname>D&#8217;Onofrio</surname>, <given-names>A.</given-names></string-name> (<year>2015</year>). <article-title>Persona-based information shapes linguistic perception: Valley Girls and California vowels</article-title>. <source>Journal of Sociolinguistics</source>, <volume>19</volume>(<issue>2</issue>), <fpage>241</fpage>&#8211;<lpage>256</lpage>. <pub-id pub-id-type="doi">10.1111/josl.12115</pub-id></mixed-citation></ref>
<ref id="B32"><mixed-citation publication-type="thesis"><string-name><surname>D&#8217;Onofrio</surname>, <given-names>A.</given-names></string-name> (<year>2016</year>). <source>Social meaning in linguistic perception</source>. [Doctoral dissertation, <publisher-name>Stanford University</publisher-name>].</mixed-citation></ref>
<ref id="B33"><mixed-citation publication-type="journal"><string-name><surname>D&#8217;Onofrio</surname>, <given-names>A.</given-names></string-name> (<year>2020</year>). <article-title>Personae in sociolinguistic variation</article-title>. <source>WIREs Cognition Science</source>, <volume>11</volume>(<issue>6</issue>), <elocation-id>e1543</elocation-id>. <pub-id pub-id-type="doi">10.1002/wcs.1543</pub-id></mixed-citation></ref>
<ref id="B34"><mixed-citation publication-type="journal"><string-name><surname>Drager</surname>, <given-names>K.</given-names></string-name> (<year>2010</year>). <article-title>Sociophonetic variation in speech perception</article-title>. <source>Language and Linguistics Compass</source>, <volume>4</volume>(<issue>7</issue>), <fpage>473</fpage>&#8211;<lpage>480</lpage>. <pub-id pub-id-type="doi">10.1111/j.1749-818X.2010.00210.x</pub-id></mixed-citation></ref>
<ref id="B35"><mixed-citation publication-type="journal"><string-name><surname>Drager</surname>, <given-names>K.</given-names></string-name> (<year>2011</year>). <article-title>Speaker age and vowel perception</article-title>. <source>Language and Speech</source>, <volume>54</volume>(<issue>1</issue>), <fpage>99</fpage>&#8211;<lpage>121</lpage>. <pub-id pub-id-type="doi">10.1177/0023830910388017</pub-id></mixed-citation></ref>
<ref id="B36"><mixed-citation publication-type="book"><string-name><surname>Drager</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name><surname>Kirtley</surname>, <given-names>J.</given-names></string-name> (<year>2016</year>). <chapter-title>Awareness, salience, and stereotypes in exemplar-based models of speech production and perception</chapter-title>. In <string-name><given-names>A.</given-names> <surname>Babel</surname></string-name> (Ed.), <source>Awareness and control in sociolinguistic research</source> (pp. <fpage>1</fpage>&#8211;<lpage>24</lpage>). <publisher-name>Cambridge University Press</publisher-name>. <pub-id pub-id-type="doi">10.1017/CBO9781139680448.003</pub-id></mixed-citation></ref>
<ref id="B37"><mixed-citation publication-type="journal"><string-name><surname>Duarte-Borquez</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Van Doren</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name> (<year>2024</year>). <article-title>Utterance-final voice quality in American English and Mexican Spanish bilinguals</article-title>. <source>Languages</source>, <volume>9</volume>, <elocation-id>70</elocation-id>. <pub-id pub-id-type="doi">10.3390/languages9030070</pub-id></mixed-citation></ref>
<ref id="B38"><mixed-citation publication-type="journal"><string-name><surname>Eckert</surname>, <given-names>P.</given-names></string-name> (<year>2008</year>). <article-title>Variation and the indexical field</article-title>. <source>Journal of Sociolinguistics</source>, <volume>12</volume>(<issue>4</issue>), <fpage>453</fpage>&#8211;<lpage>476</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-9841.2008.00374.x</pub-id></mixed-citation></ref>
<ref id="B39"><mixed-citation publication-type="book"><string-name><surname>Eckert</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Podesva</surname>, <given-names>R.</given-names></string-name> (<year>2021</year>). <chapter-title>Non-binary approaches to gender and sexuality</chapter-title>. In <string-name><given-names>J.</given-names> <surname>Angouri</surname></string-name> &amp; <string-name><given-names>J.</given-names> <surname>Baxter</surname></string-name> (Eds.), <source>The Routledge handbook of language, gender, and sexuality</source> (pp. <fpage>25</fpage>&#8211;<lpage>36</lpage>). <publisher-name>Routledge</publisher-name>. <pub-id pub-id-type="doi">10.4324/9781315514857</pub-id></mixed-citation></ref>
<ref id="B40"><mixed-citation publication-type="journal"><string-name><surname>Esling</surname>, <given-names>J.</given-names></string-name> (<year>1978</year>). <article-title>The identification of features of voice quality in social groups</article-title>. <source>Journal of the International Phonetic Association</source>, <volume>8</volume>(<issue>1&#8211;2</issue>), <fpage>18</fpage>&#8211;<lpage>23</lpage>. <pub-id pub-id-type="doi">10.1017/S0025100300001699</pub-id></mixed-citation></ref>
<ref id="B41"><mixed-citation publication-type="journal"><string-name><surname>Evans</surname>, <given-names>J. S.</given-names></string-name> (<year>2008</year>). <article-title>Dual-processing accounts of reasoning, judgment, and social cognition</article-title>. <source>Annual Review of Psychology</source>, <volume>59</volume>, <fpage>255</fpage>&#8211;<lpage>278</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.psych.59.103006.093629</pub-id></mixed-citation></ref>
<ref id="B42"><mixed-citation publication-type="webpage"><string-name><surname>Fessenden</surname>, <given-names>M.</given-names></string-name> (<year>2011</year>, <month>December</month> <day>9</day>). <article-title>&#8216;Vocal fry&#8217; creeping into U.S. speech</article-title>. <source>Science</source>. <uri>https://www.science.org/content/article/vocal-fry-creeping-us-speech#:~:text=Since%20the%201960s%2C%20vocal%20fry,or%20females%20varies%20among%20languages</uri></mixed-citation></ref>
<ref id="B43"><mixed-citation publication-type="book"><string-name><surname>Foulkes</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Hay</surname>, <given-names>J. B.</given-names></string-name> (<year>2015</year>). <chapter-title>The emergence of sociophonetic structure</chapter-title>. In <string-name><given-names>B.</given-names> <surname>MacWinney</surname></string-name> &amp; <string-name><given-names>W.</given-names> <surname>O&#8217;Grady</surname></string-name> (Eds.), <source>The handbook of language emergence</source> (pp. <fpage>292</fpage>&#8211;<lpage>313</lpage>). <publisher-name>John Wiley &amp; Sons</publisher-name>. <pub-id pub-id-type="doi">10.1002/9781118346136.ch13</pub-id></mixed-citation></ref>
<ref id="B44"><mixed-citation publication-type="journal"><string-name><surname>Gallena</surname>, <given-names>S. K.</given-names></string-name>, &amp; <string-name><surname>Pinto</surname>, <given-names>J. A.</given-names></string-name> (<year>2021</year>). <article-title>How graduate students with vocal fry are perceived by speech-language pathologists</article-title>. <source>Perspectives of the ASHA Special Interest Groups</source>, <volume>6</volume>(<issue>6</issue>), <fpage>1554</fpage>&#8211;<lpage>1565</lpage>. <pub-id pub-id-type="doi">10.1044/2021_PERSP-21-00083</pub-id></mixed-citation></ref>
<ref id="B45"><mixed-citation publication-type="journal"><string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name> (<year>2015</year>). <article-title>Perception of glottalization and phrase-final creak</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>137</volume>(<issue>2</issue>), <fpage>822</fpage>&#8211;<lpage>831</lpage>. <pub-id pub-id-type="doi">10.1121/1.4906155</pub-id></mixed-citation></ref>
<ref id="B46"><mixed-citation publication-type="book"><string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name> (<year>2019</year>). <chapter-title>The phonetics of voice</chapter-title>. In <string-name><given-names>W. F.</given-names> <surname>Katz</surname></string-name> &amp; <string-name><given-names>P. F.</given-names> <surname>Assmann</surname></string-name> (Eds.), <source>The Routledge handbook of phonetics</source> (pp. <fpage>75</fpage>&#8211;<lpage>106</lpage>). <publisher-name>Routledge</publisher-name>.</mixed-citation></ref>
<ref id="B47"><mixed-citation publication-type="journal"><string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name> (<year>2022</year>). <article-title>Theoretical achievements of phonetics in the 21st century: Phonetics of voice quality</article-title>. <source>Journal of Phonetics</source>, <volume>94</volume>, <fpage>1</fpage>&#8211;<lpage>22</lpage>. <pub-id pub-id-type="doi">10.1016/j.wocn.2022.101155</pub-id></mixed-citation></ref>
<ref id="B48"><mixed-citation publication-type="journal"><string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Esposito</surname>, <given-names>C. M.</given-names></string-name> (<year>2023</year>). <article-title>Phonetics of White Hmong vowel and tonal contrasts</article-title>. <source>Journal of the International Phonetic Association</source>, <volume>53</volume>(<issue>1</issue>), <fpage>213</fpage>&#8211;<lpage>232</lpage>. <pub-id pub-id-type="doi">10.1017/S0025100321000104</pub-id></mixed-citation></ref>
<ref id="B49"><mixed-citation publication-type="journal"><string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Keating</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Esposito</surname>, <given-names>C. M.</given-names></string-name>, &amp; <string-name><surname>Kreiman</surname>, <given-names>J.</given-names></string-name> (<year>2013</year>). <article-title>Voice quality and tone identification in White Hmong</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>133</volume>(<issue>2</issue>), <fpage>1078</fpage>&#8211;<lpage>1089</lpage>. <pub-id pub-id-type="doi">10.1121/1.4773259</pub-id></mixed-citation></ref>
<ref id="B50"><mixed-citation publication-type="journal"><string-name><surname>Gisladottir</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Helgason</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Halldorsson</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Helgason</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Borsky</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Chien</surname>, <given-names>Y.</given-names></string-name>, &amp; <string-name><surname>Stefansson</surname>, <given-names>K.</given-names></string-name> (<year>2023</year>). <article-title>Sequence variants affecting voice pitch in humans</article-title>. <source>Science Advances</source>, <volume>9</volume>(<issue>23</issue>). <pub-id pub-id-type="doi">10.1126/sciadv.abq2969eabq2969</pub-id></mixed-citation></ref>
<ref id="B51"><mixed-citation publication-type="journal"><string-name><surname>Gittelson</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Leemann</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Tomaschek</surname>, <given-names>F.</given-names></string-name> (<year>2021</year>). <article-title>Using crowd-sourced speech data to study socially constrained variation in nonmodal phonation</article-title>. <source>Frontiers in Artificial Intelligence</source>, <volume>3</volume>, <elocation-id>565682</elocation-id>. <pub-id pub-id-type="doi">10.3389/frai.2020.565682</pub-id></mixed-citation></ref>
<ref id="B52"><mixed-citation publication-type="journal"><string-name><surname>Gobl</surname>, <given-names>C.</given-names></string-name> &amp; <string-name><surname>N&#237; Chasaide</surname>, <given-names>A.</given-names></string-name> (<year>2003</year>). <article-title>The role of voice quality in communicating emotion, mood and attitude</article-title>. <source>Speech Communication</source>, <volume>40</volume>(<issue>1&#8211;2</issue>), <fpage>189</fpage>&#8211;<lpage>212</lpage>. <pub-id pub-id-type="doi">10.1016/S0167-6393(02)00082-1</pub-id></mixed-citation></ref>
<ref id="B53"><mixed-citation publication-type="journal"><string-name><surname>Goodine</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Johns</surname>, <given-names>A.</given-names></string-name> (<year>2014</year>). <article-title>&#8220;Would you like fries with thaaaat?&#8221; Investigating vocal fry in young female Canadian English speakers</article-title>. <source>Strathy Student Working Papers on Canadian English</source> <volume>2014</volume>, <fpage>1</fpage>&#8211;<lpage>15</lpage>.</mixed-citation></ref>
<ref id="B54"><mixed-citation publication-type="webpage"><collab>Google Books</collab>. (<year>2025</year>, <month>April</month> <day>10</day>). <source>Vocal fry</source> [Infographic]. <publisher-name>Google Books Ngram Viewer</publisher-name>. <uri>https://books.google.com/ngrams/graph?content=vocal+fry&amp;year_start=1800&amp;year_end=2022&amp;corpus=en&amp;smoothing=3</uri></mixed-citation></ref>
<ref id="B55"><mixed-citation publication-type="journal"><string-name><surname>Gordon</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Ladefoged</surname>, <given-names>P.</given-names></string-name> (<year>2001</year>). <article-title>Phonation types: A cross-linguistic overview</article-title>. <source>Journal of Phonetics</source>. <volume>29</volume>(<issue>4</issue>), <fpage>383</fpage>&#8211;<lpage>406</lpage>. <pub-id pub-id-type="doi">10.1006/jpho.2001.0147</pub-id></mixed-citation></ref>
<ref id="B56"><mixed-citation publication-type="webpage"><string-name><surname>Grim</surname>, <given-names>R.</given-names></string-name> (<year>2015</year>, <month>March</month> <day>31</day>). <article-title>My girlfriend went to a speech therapist to cure her vocal fry</article-title>. <source>Vice</source>. <uri>https://www.vice.com/en/article/i-took-my-girlfriend-to-a-speech-therapist-to-cure-her-annoying-vocal-fry-988/</uri></mixed-citation></ref>
<ref id="B57"><mixed-citation publication-type="journal"><string-name><surname>Hanson</surname>, <given-names>H. M.</given-names></string-name>, &amp; <string-name><surname>Chuang</surname>, <given-names>E. S.</given-names></string-name> (<year>1999</year>). <article-title>Glottal characteristics of male speakers: Acoustic correlates and comparison with female data</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>106</volume>(<issue>2</issue>), <fpage>1064</fpage>&#8211;<lpage>1077</lpage>. <pub-id pub-id-type="doi">10.1121/1.427116</pub-id></mixed-citation></ref>
<ref id="B58"><mixed-citation publication-type="journal"><string-name><surname>Hay</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Drager</surname>, <given-names>K.</given-names></string-name> (<year>2010</year>). <article-title>Stuffed toys and speech perception</article-title>. <source>Linguistics</source>, <volume>48</volume>(<issue>4</issue>), <fpage>865</fpage>&#8211;<lpage>892</lpage>. <pub-id pub-id-type="doi">10.1515/LING.2010.027</pub-id></mixed-citation></ref>
<ref id="B59"><mixed-citation publication-type="journal"><string-name><surname>Hay</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Nolan</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Drager</surname>, <given-names>K.</given-names></string-name> (<year>2006a</year>). <article-title>From fush to feesh: Exemplar priming in speech perception</article-title>. <source>Linguistic Review</source>, <volume>23</volume>(<issue>3</issue>), <fpage>351</fpage>&#8211;<lpage>379</lpage>. <pub-id pub-id-type="doi">10.1515/TLR.2006.014</pub-id></mixed-citation></ref>
<ref id="B60"><mixed-citation publication-type="journal"><string-name><surname>Hay</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Warren</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Drager</surname>, <given-names>K.</given-names></string-name> (<year>2006b</year>). <article-title>Factors influencing speech perception in the context of a merger-in-progress</article-title>. <source>Journal of Phonetics</source>, <volume>34</volume>, <fpage>458</fpage>&#8211;<lpage>484</lpage>. <pub-id pub-id-type="doi">10.1016/j.wocn.2005.10.001</pub-id></mixed-citation></ref>
<ref id="B61"><mixed-citation publication-type="book"><string-name><surname>Hebdige</surname>, <given-names>D.</given-names></string-name> (<year>1984</year>) <source>Subculture: The meaning of style</source>. <publisher-name>Methuen</publisher-name>.</mixed-citation></ref>
<ref id="B62"><mixed-citation publication-type="journal"><string-name><surname>Heiss</surname>, <given-names>A.</given-names></string-name> (<year>2021</year>). <article-title>A guide to modeling proportions with Bayesian beta and zero-inflated beta regression models</article-title>. <pub-id pub-id-type="doi">10.59350/7p1a4-0tw75</pub-id></mixed-citation></ref>
<ref id="B63"><mixed-citation publication-type="book"><string-name><surname>Henton</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name><surname>Bladon</surname>, <given-names>A.</given-names></string-name> (<year>1988</year>). <chapter-title>Creak as a sociophonetic marker</chapter-title>. In <string-name><given-names>L.</given-names> <surname>Hyman</surname></string-name> &amp; <string-name><given-names>C.</given-names> <surname>Li</surname></string-name> (Eds.), <source>Language, speech and mind: Studies in honor of Victoria A. Fromkin</source> (pp. <fpage>3</fpage>&#8211;<lpage>29</lpage>). <publisher-name>Routledge</publisher-name>. <pub-id pub-id-type="doi">10.4324/9781003629610-2</pub-id></mixed-citation></ref>
<ref id="B64"><mixed-citation publication-type="journal"><string-name><surname>Hollien</surname>, <given-names>H.</given-names></string-name> (<year>1974</year>). <article-title>On vocal registers</article-title>. <source>Journal of Phonetics</source>, <volume>2</volume>(<issue>2</issue>), <fpage>125</fpage>&#8211;<lpage>143</lpage>. <pub-id pub-id-type="doi">10.1016/S0095-4470(19)31188-X</pub-id></mixed-citation></ref>
<ref id="B65"><mixed-citation publication-type="journal"><string-name><surname>Hollien</surname>, <given-names>H.</given-names></string-name>, &amp; <string-name><surname>Michel</surname>, <given-names>J. F.</given-names></string-name> (<year>1968</year>). <article-title>Vocal fry as a phonational register</article-title>. <source>Journal of Speech and Hearing Research</source>, <volume>11</volume>(<issue>3</issue>), <fpage>600</fpage>&#8211;<lpage>604</lpage>. <pub-id pub-id-type="doi">10.1044/jshr.1103.600</pub-id></mixed-citation></ref>
<ref id="B66"><mixed-citation publication-type="book"><string-name><surname>Huang</surname>, <given-names>Y.</given-names></string-name> (<year>2019</year>). <chapter-title>The role of creaky voice attributes in Mandarin tonal perception</chapter-title>. In <string-name><given-names>S.</given-names> <surname>Calhoun</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Escudero</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Tabain</surname></string-name> &amp; <string-name><given-names>P.</given-names> <surname>Warren</surname></string-name> (Eds.), <source>Proceedings of the 19th International Congress of Phonetic Sciences</source>, <fpage>1465</fpage>&#8211;<lpage>1469</lpage>. <publisher-name>Australasian Speech Science and Technology Association and International Phonetic Association</publisher-name>.</mixed-citation></ref>
<ref id="B67"><mixed-citation publication-type="journal"><string-name><surname>Irons</surname>, <given-names>S. T.</given-names></string-name>, &amp; <string-name><surname>Alexander</surname>, <given-names>J. E.</given-names></string-name> (<year>2016</year>). <article-title>Vocal fry in realistic speech: Acoustic characteristics and perceptions of vocal fry in spontaneously produced and read speech</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>140</volume>(<issue>4</issue>), <fpage>3397</fpage>&#8211;<lpage>3397</lpage>. <pub-id pub-id-type="doi">10.1121/1.4970891</pub-id></mixed-citation></ref>
<ref id="B68"><mixed-citation publication-type="journal"><string-name><surname>Iseli</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Shue</surname>, <given-names>Y.-L.</given-names></string-name>, &amp; <string-name><surname>Alwan</surname>, <given-names>A.</given-names></string-name> (<year>2007</year>). <article-title>Age, sex, and vowel dependencies of acoustic measures related to the voice source</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>121</volume>(<issue>4</issue>), <fpage>2283</fpage>&#8211;<lpage>2295</lpage>. <pub-id pub-id-type="doi">10.1121/1.2697522</pub-id></mixed-citation></ref>
<ref id="B69"><mixed-citation publication-type="journal"><string-name><surname>Jaeger</surname>, <given-names>T. F.</given-names></string-name>, &amp; <string-name><surname>Weatherholtz</surname>, <given-names>K.</given-names></string-name> (<year>2016</year>). <article-title>What the heck is salience? How predictive language processing contributes to sociolinguistic perception</article-title>. <source>Frontiers in Psychology</source>, <volume>7</volume>(<issue>1115</issue>), <fpage>1</fpage>&#8211;<lpage>5</lpage>. <pub-id pub-id-type="doi">10.3389/fpsyg.2016.01115</pub-id></mixed-citation></ref>
<ref id="B70"><mixed-citation publication-type="webpage"><string-name><surname>Jaslow</surname>, <given-names>R.</given-names></string-name> (<year>2011</year>, <month>December</month> <day>16</day>). <article-title>Are &#8220;creaking&#8221; pop stars changing how young women speak?</article-title> <source>CBS News</source>. <uri>https://www.cbsnews.com/news/are-creaking-pop-stars-changing-how-young-women-speak/</uri></mixed-citation></ref>
<ref id="B71"><mixed-citation publication-type="journal"><string-name><surname>Jessee</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Calder</surname>, <given-names>J.</given-names></string-name> (<year>2025</year>). <article-title>The cisgender listening subject in sociolinguistic perception: Transgender identity affects sibilant categorization in American English</article-title>. <source>Journal of Sociolinguistics</source>, 0, <fpage>1</fpage>&#8211;<lpage>14</lpage>. <pub-id pub-id-type="doi">10.1111/josl.12702</pub-id></mixed-citation></ref>
<ref id="B72"><mixed-citation publication-type="book"><string-name><surname>Johnson</surname>, <given-names>F. L.</given-names></string-name> (<year>2000</year>). <source>Speaking culturally: Language diversity in the United States</source>. <publisher-name>Sage</publisher-name>. <pub-id pub-id-type="doi">10.4135/9781452220406</pub-id></mixed-citation></ref>
<ref id="B73"><mixed-citation publication-type="book"><string-name><surname>Johnson</surname>, <given-names>K.</given-names></string-name> (<year>1997</year>). <chapter-title>Speech perception without speaker normalization: An exemplar model</chapter-title>. In <string-name><given-names>K.</given-names> <surname>Johnson</surname></string-name> &amp; <string-name><given-names>J. W.</given-names> <surname>Mullennix</surname></string-name> (Eds.), <source>Talker Variability in Speech Processing</source> (pp. <fpage>145</fpage>&#8211;<lpage>165</lpage>). <publisher-name>Academic Press</publisher-name>.</mixed-citation></ref>
<ref id="B74"><mixed-citation publication-type="journal"><string-name><surname>Johnson</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name><surname>Babel</surname>, <given-names>M.</given-names></string-name> (<year>2023</year>). <article-title>The structure of acoustic voice variation in bilingual speech</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>153</volume>(<issue>6</issue>), <fpage>3221</fpage>&#8211;<lpage>3238</lpage>. <pub-id pub-id-type="doi">10.1121/10.0019659</pub-id></mixed-citation></ref>
<ref id="B75"><mixed-citation publication-type="journal"><string-name><surname>Johnson</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Strand</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>D&#8217;Imperio</surname>, <given-names>M.</given-names></string-name> (<year>1999</year>). <article-title>Auditory-visual integration of talker gender in vowel perception</article-title>. <source>Journal of Phonetics</source>, <volume>27</volume>(<issue>4</issue>), <fpage>359</fpage>&#8211;<lpage>384</lpage>. <pub-id pub-id-type="doi">10.1006/jpho.1999.0100</pub-id></mixed-citation></ref>
<ref id="B76"><mixed-citation publication-type="thesis"><string-name><surname>Juskan</surname>, <given-names>M.</given-names></string-name> (<year>2016</year>). <source>Production and perception of local variants in Liverpool English: Change, salience, exemplar priming</source>. [Doctoral dissertation, <publisher-name>Albert Ludwig&#8217;s University</publisher-name>].</mixed-citation></ref>
<ref id="B77"><mixed-citation publication-type="book"><string-name><surname>Keating</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Kreiman</surname>, <given-names>J.</given-names></string-name> (<year>2015</year>). <chapter-title>Acoustic properties of different kinds of creaky voice</chapter-title>. In <collab>The Scottish Consortium for ICPhS 2015</collab> (Ed.), <source>Proceedings of the 18th International Congress of Phonetic Sciences</source>, <fpage>1</fpage>&#8211;<lpage>5</lpage>. <publisher-name>University of Glasgow</publisher-name>.</mixed-citation></ref>
<ref id="B78"><mixed-citation publication-type="book"><string-name><surname>Keating</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Kreiman</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Chai</surname>, <given-names>Y.</given-names></string-name> (<year>2023b</year>, <month>May</month> <day>8&#8211;12</day>). <chapter-title>Acoustic properties of subtypes of creaky voice [Poster presentation]</chapter-title>. <source>The 184th Meeting of the Acoustical Society of America</source>. <publisher-loc>Chicago, IL, USA</publisher-loc>. <pub-id pub-id-type="doi">10.1121/10.0018918</pub-id></mixed-citation></ref>
<ref id="B79"><mixed-citation publication-type="journal"><string-name><surname>Keating</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Kuang</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Esposito</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name><surname>Khan</surname>, <given-names>S.</given-names></string-name> (<year>2023a</year>). <article-title>A cross-language acoustic space for vocalic phonation distinctions</article-title>, <source>Language</source>, <volume>99</volume>(<issue>2</issue>), <fpage>351</fpage>&#8211;<lpage>389</lpage>. <pub-id pub-id-type="doi">10.1353/lan.2023.a900607</pub-id></mixed-citation></ref>
<ref id="B80"><mixed-citation publication-type="journal"><string-name><surname>Keating</surname>, <given-names>P.</given-names></string-name>, &amp; <string-name><surname>Kuo</surname>, <given-names>G.</given-names></string-name> (<year>2012</year>). <article-title>Comparison of speaking fundamental frequency in English and Mandarin</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>132</volume>(<issue>2</issue>), <fpage>1050</fpage>&#8211;<lpage>1060</lpage>. <pub-id pub-id-type="doi">10.1121/1.4730893</pub-id></mixed-citation></ref>
<ref id="B81"><mixed-citation publication-type="journal"><string-name><surname>Khan</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Becker</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name><surname>Zimman</surname>, <given-names>L.</given-names></string-name> (<year>2015</year>). <article-title>The acoustics of perceived creaky voice in American English</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>138</volume>(<issue>3</issue>), <fpage>1809</fpage>&#8211;<lpage>1809</lpage>. <pub-id pub-id-type="doi">10.1121/1.4933741</pub-id></mixed-citation></ref>
<ref id="B82"><mixed-citation publication-type="webpage"><string-name><surname>Kirby</surname>, <given-names>J.</given-names></string-name> (<year>2018</year>). <article-title>Praatsauce: Praat-based tools for spectral analysis</article-title>. <uri>https://github.com/kirbyj/praatsauce</uri></mixed-citation></ref>
<ref id="B83"><mixed-citation publication-type="journal"><string-name><surname>Klatt</surname>, <given-names>D. H.</given-names></string-name>, &amp; <string-name><surname>Klatt</surname>, <given-names>L. C.</given-names></string-name> (<year>1990</year>). <article-title>Analysis, synthesis, and perception of voice quality variations among female and male talkers</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>87</volume>(<issue>2</issue>), <fpage>820</fpage>&#8211;<lpage>857</lpage>. <pub-id pub-id-type="doi">10.1121/1.398894</pub-id></mixed-citation></ref>
<ref id="B84"><mixed-citation publication-type="journal"><string-name><surname>Koops</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Gentry</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Pantos</surname>, <given-names>A.</given-names></string-name> (<year>2008</year>). <article-title>The effect of perceived speaker age on the perception of PIN and PEN vowels in Houston, Texas</article-title>. <source>University of Pennsylvania Working Papers in Linguistics</source>, <volume>14</volume>(<issue>2</issue>), Article <elocation-id>12</elocation-id>.</mixed-citation></ref>
<ref id="B85"><mixed-citation publication-type="journal"><string-name><surname>Kreiman</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Gerratt</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Samlan</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name><surname>Zhang</surname>, <given-names>Z.</given-names></string-name> (<year>2014</year>). <article-title>Toward a unified theory of voice production and perception</article-title>. <source>Loquens</source>, <volume>1</volume>(<issue>1</issue>), <elocation-id>e009</elocation-id>. <pub-id pub-id-type="doi">10.3989/loquens.2014.009</pub-id></mixed-citation></ref>
<ref id="B86"><mixed-citation publication-type="journal"><string-name><surname>Kreiman</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Lee</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Samlan</surname>, <given-names>R.</given-names></string-name>, &amp; <string-name><surname>Gerratt</surname>, <given-names>B.</given-names></string-name> (<year>2021</year>). <article-title>Validating a psychoacoustic model of voice quality</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>149</volume>, <fpage>457</fpage>&#8211;<lpage>465</lpage>. <pub-id pub-id-type="doi">10.1121/10.0003331</pub-id></mixed-citation></ref>
<ref id="B87"><mixed-citation publication-type="book"><string-name><surname>Kreiman</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>Sidtis</surname>, <given-names>D.</given-names></string-name> (<year>2011</year>). <source>Foundations of voice studies: An interdisciplinary approach to voice production and perception</source>. <publisher-name>Wiley-Blackwell</publisher-name>. <pub-id pub-id-type="doi">10.1002/9781444395068</pub-id></mixed-citation></ref>
<ref id="B88"><mixed-citation publication-type="journal"><string-name><surname>Kuang</surname>, <given-names>J.</given-names></string-name> (<year>2017</year>). <article-title>Covariation between voice quality and pitch: Revisiting the case of Mandarin creaky voice</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>142</volume>(<issue>3</issue>), <fpage>1693</fpage>&#8211;<lpage>1706</lpage>. <pub-id pub-id-type="doi">10.1121/1.5003649</pub-id></mixed-citation></ref>
<ref id="B89"><mixed-citation publication-type="journal"><string-name><surname>Kutlu</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Tiv</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Wulff</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Titone</surname>, <given-names>D.</given-names></string-name> (<year>2022</year>). <article-title>Does race impact speech perception? An account of accented speech in two different multilingual locales</article-title>. <source>Cognitive Research: Principles and Implications</source>, <volume>7</volume>, <elocation-id>7</elocation-id>. <pub-id pub-id-type="doi">10.1186/s41235-022-00354-0</pub-id></mixed-citation></ref>
<ref id="B90"><mixed-citation publication-type="book"><string-name><surname>Labov</surname>, <given-names>W.</given-names></string-name> (<year>1972</year>). <source>Sociolinguistic patterns</source>. <publisher-name>University of Pennsylvania Press</publisher-name>.</mixed-citation></ref>
<ref id="B91"><mixed-citation publication-type="book"><string-name><surname>Ladefoged</surname>, <given-names>P.</given-names></string-name> (<year>1971</year>). <source>Preliminaries to linguistic phonetics</source>. <publisher-name>University of Chicago Press</publisher-name>.</mixed-citation></ref>
<ref id="B92"><mixed-citation publication-type="journal"><string-name><surname>Lambert</surname>, <given-names>W. E.</given-names></string-name>, <string-name><surname>Hodgson</surname>, <given-names>R. C.</given-names></string-name>, <string-name><surname>Gardner</surname>, <given-names>R. C.</given-names></string-name>, &amp; <string-name><surname>Fillenbaum</surname>, <given-names>S.</given-names></string-name> (<year>1960</year>). <article-title>Evaluational reactions to spoken languages</article-title>. <source>Journal of Abnormal and Social Psychology</source>, <volume>60</volume>(<issue>1</issue>), <fpage>44</fpage>&#8211;<lpage>51</lpage>. <pub-id pub-id-type="doi">10.1037/h0044430</pub-id></mixed-citation></ref>
<ref id="B93"><mixed-citation publication-type="journal"><string-name><surname>Laver</surname>, <given-names>J. D. M.</given-names></string-name> (<year>1968</year>). <article-title>Voice quality and indexical information</article-title>. <source>International Journal of Language &amp; Communication Disorders</source>, <volume>3</volume>(<issue>1</issue>), <fpage>43</fpage>&#8211;<lpage>54</lpage>. <pub-id pub-id-type="doi">10.3109/13682826809011440</pub-id></mixed-citation></ref>
<ref id="B94"><mixed-citation publication-type="book"><string-name><surname>Lawrence</surname>, <given-names>D.</given-names></string-name> (<year>2015</year>). <chapter-title>Limited evidence for social priming in the perception of the BATH and STRUT vowels</chapter-title>. In <collab>The Scottish Consortium for ICPhS 2015</collab> (Ed.), <source>Proceedings of the 18th International Congress of Phonetic Sciences</source>, <fpage>1</fpage>&#8211;<lpage>5</lpage>. <publisher-name>University of Glasgow</publisher-name>.</mixed-citation></ref>
<ref id="B95"><mixed-citation publication-type="thesis"><string-name><surname>Lee</surname>, <given-names>K. E.</given-names></string-name> (<year>2016</year>). <source>The perception of creaky voice: Does speaker gender affect our judgments?</source> [Master&#8217;s thesis, <publisher-name>University of Kentucky</publisher-name>]. <pub-id pub-id-type="doi">10.13023/ETD.2017.032</pub-id></mixed-citation></ref>
<ref id="B96"><mixed-citation publication-type="journal"><string-name><surname>Lenth</surname>, <given-names>R.</given-names></string-name> (<year>2023</year>). <article-title>emmeans: Estimated marginal means, aka least-squares means</article-title>. R package version 1.8.8. <pub-id pub-id-type="doi">10.32614/CRAN.package.emmeans</pub-id></mixed-citation></ref>
<ref id="B97"><mixed-citation publication-type="journal"><string-name><surname>Leung</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Oates</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Papp</surname>, <given-names>V.</given-names></string-name>, &amp; <string-name><surname>Chan</surname>, <given-names>S.-P.</given-names></string-name> (<year>2022</year>). <article-title>Speaking fundamental frequencies of adult speakers of Australian English and effects of sex, age, and geographical location</article-title>. <source>Journal of Voice</source>, <volume>36</volume>(<issue>3</issue>), <fpage>434.e1</fpage>&#8211;<lpage>434.e15</lpage>. <pub-id pub-id-type="doi">10.1016/j.jvoice.2020.06.014</pub-id></mixed-citation></ref>
<ref id="B98"><mixed-citation publication-type="book"><string-name><surname>L&#233;vi-Strauss</surname>, <given-names>C.</given-names></string-name> (<year>1962</year>). <source>La pens&#233;e sauvage</source>. <publisher-name>Plon</publisher-name>.</mixed-citation></ref>
<ref id="B99"><mixed-citation publication-type="journal"><string-name><surname>Li</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Lai</surname>, <given-names>W.</given-names></string-name>, &amp; <string-name><surname>Kuang</surname>, <given-names>J.</given-names></string-name> (<year>2023</year>). <article-title>Creaky voice identification in Mandarin: The effects of prosodic position, tone, pitch range and creak locality</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>154</volume>(<issue>1</issue>), <fpage>126</fpage>&#8211;<lpage>140</lpage>. <pub-id pub-id-type="doi">10.1121/10.0019941</pub-id></mixed-citation></ref>
<ref id="B100"><mixed-citation publication-type="journal"><string-name><surname>Ligon</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Rountrey</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Rank</surname>, <given-names>N. V.</given-names></string-name>, <string-name><surname>Hull</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Khidr</surname>, <given-names>A.</given-names></string-name> (<year>2019</year>). <article-title>Perceived desirability of vocal fry among female speech communication disorders graduate students</article-title>. <source>Journal of Voice</source>, <volume>33</volume>(<issue>5</issue>), <fpage>805.e21</fpage>&#8211;<lpage>805.e35</lpage>. <pub-id pub-id-type="doi">10.1016/j.jvoice.2018.03.010</pub-id></mixed-citation></ref>
<ref id="B101"><mixed-citation publication-type="webpage"><string-name><surname>Lindel&#248;v</surname>, <given-names>J. K.</given-names></string-name> (<year>2019</year>). <article-title>Reaction time distributions: An interactive overview</article-title>. Retrieved <month>June</month> <day>7</day>, 2025 from <uri>https://lindeloev.github.io/shiny-rt/</uri></mixed-citation></ref>
<ref id="B102"><mixed-citation publication-type="journal"><string-name><surname>Lindvall-&#214;stling</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Deutschmann</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Steinvall</surname>, <given-names>A.</given-names></string-name> (<year>2020</year>). <article-title>An exploratory study on linguistic gender stereotypes and their effects on perception</article-title>. <source>Open Linguistics</source>, <volume>6</volume>(<issue>1</issue>), <fpage>567</fpage>&#8211;<lpage>583</lpage>. <pub-id pub-id-type="doi">10.1515/opli-2020-0033</pub-id></mixed-citation></ref>
<ref id="B103"><mixed-citation publication-type="book"><string-name><surname>Lippi-Green</surname>, <given-names>R.</given-names></string-name> (<year>1997</year>). <source>English with an accent: Language, ideology, and discrimination in the United States</source>. <publisher-name>Routledge</publisher-name>.</mixed-citation></ref>
<ref id="B104"><mixed-citation publication-type="book"><string-name><surname>Liu</surname>, <given-names>X.</given-names></string-name>, &amp; <string-name><surname>Xu</surname>, <given-names>Y.</given-names></string-name> (<year>2011</year>). <chapter-title>What makes a female voice attractive?</chapter-title> In <string-name><given-names>W.-S.</given-names> <surname>Lee</surname></string-name> &amp; <string-name><given-names>E.</given-names> <surname>Zee</surname></string-name> (Eds.), <source>Proceedings of the 17th International Congress of Phonetic Sciences</source>, <fpage>1274</fpage>&#8211;<lpage>1277</lpage>. <publisher-name>City University of Hong Kong</publisher-name>.</mixed-citation></ref>
<ref id="B105"><mixed-citation publication-type="journal"><string-name><surname>Loakes</surname>, <given-names>D.</given-names></string-name>, &amp; <string-name><surname>Gregory</surname>, <given-names>A.</given-names></string-name> (<year>2022</year>). <article-title>Voice quality in Australian English</article-title>. <source>JASA Express Letters</source>, <volume>2</volume>(<issue>8</issue>), <fpage>1</fpage>&#8211;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1121/10.0012994</pub-id></mixed-citation></ref>
<ref id="B106"><mixed-citation publication-type="journal"><string-name><surname>Lortie</surname>, <given-names>C. L.</given-names></string-name>, <string-name><surname>Thibeault</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Guitton</surname>, <given-names>M. J.</given-names></string-name>, &amp; <string-name><surname>Tremblay</surname>, <given-names>P.</given-names></string-name> (<year>2015</year>). <article-title>Effects of age on the amplitude, frequency and perceived quality of voice</article-title>. <source>AGE</source>, <volume>37</volume>, Article <elocation-id>117</elocation-id>. <pub-id pub-id-type="doi">10.1007/s11357-015-9854-1</pub-id></mixed-citation></ref>
<ref id="B107"><mixed-citation publication-type="journal"><string-name><surname>Mack</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Munson</surname>, <given-names>B.</given-names></string-name> (<year>2012</year>). <article-title>The influence of /s/ quality on ratings of men&#8217;s sexual orientation: Explicit and implicit measures of the &#8220;gay lisp&#8221; stereotype</article-title>. <source>Journal of Phonetics</source>, <volume>40</volume>(<issue>1</issue>), <fpage>198</fpage>&#8211;<lpage>212</lpage>. <pub-id pub-id-type="doi">10.1016/j.wocn.2011.10.002</pub-id></mixed-citation></ref>
<ref id="B108"><mixed-citation publication-type="book"><string-name><surname>McElreath</surname>, <given-names>R.</given-names></string-name> (<year>2018</year>). <source>Statistical rethinking: A Bayesian course with examples in R and Stan</source>. <publisher-name>Chapman and Hall/CRC</publisher-name>. <pub-id pub-id-type="doi">10.1201/9781315372495</pub-id></mixed-citation></ref>
<ref id="B109"><mixed-citation publication-type="journal"><string-name><surname>McGurk</surname>, <given-names>H.</given-names></string-name>, &amp; <string-name><surname>MacDonald</surname>, <given-names>J.</given-names></string-name> (<year>1976</year>). <article-title>Hearing lips and seeing voices</article-title>. <source>Nature</source>, <volume>264</volume>(<issue>5588</issue>), <fpage>746</fpage>&#8211;<lpage>748</lpage>. <pub-id pub-id-type="doi">10.1038/264746a0</pub-id></mixed-citation></ref>
<ref id="B110"><mixed-citation publication-type="book"><string-name><surname>Melvin</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Clopper</surname>, <given-names>C. G.</given-names></string-name> (<year>2015</year>). <chapter-title>Gender variation in creaky voice and fundamental frequency</chapter-title>. In <collab>The Scottish Consortium for ICPhS 2015</collab> (Ed.), <source>Proceedings of the 18th International Congress of Phonetic Sciences</source>, <fpage>1</fpage>&#8211;<lpage>5</lpage>. <publisher-name>University of Glasgow</publisher-name>.</mixed-citation></ref>
<ref id="B111"><mixed-citation publication-type="journal"><string-name><surname>Mendoza-Denton</surname>, <given-names>N.</given-names></string-name> (<year>2011</year>). <article-title>The semiotic hitchhiker&#8217;s guide to creaky voice: Circulation and gendered hardcore in a Chicana/o gang persona</article-title>. <source>Journal of Linguistic Anthropology</source>, <volume>21</volume>(<issue>2</issue>), <fpage>261</fpage>&#8211;<lpage>280</lpage>. <pub-id pub-id-type="doi">10.1111/j.1548-1395.2011.01110.x</pub-id></mixed-citation></ref>
<ref id="B112"><mixed-citation publication-type="journal"><string-name><surname>Munson</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Edwards</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Schellinger</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Beckman</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Meyer</surname>, <given-names>M.</given-names></string-name> (<year>2010</year>). <article-title>Deconstructing phonetic transcription: Covert contrast, perceptual bias, and an extraterrestrial view of Vox Humana</article-title>. <source>Clinical Linguistics &amp; Phonetics</source>, <volume>24</volume>(<issue>4&#8211;5</issue>), <fpage>245</fpage>&#8211;<lpage>260</lpage>. <pub-id pub-id-type="doi">10.3109/02699200903532524</pub-id></mixed-citation></ref>
<ref id="B113"><mixed-citation publication-type="journal"><string-name><surname>Munson</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Ryherd</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name><surname>Kemper</surname>, <given-names>S.</given-names></string-name> (<year>2017</year>). <article-title>Implicit and explicit gender priming in English lingual sibilant fricative perception</article-title>. <source>Linguistics</source>, <volume>55</volume>(<issue>5</issue>), <fpage>1073</fpage>&#8211;<lpage>1107</lpage>. <pub-id pub-id-type="doi">10.1515/ling-2017-0021</pub-id></mixed-citation></ref>
<ref id="B114"><mixed-citation publication-type="journal"><string-name><surname>Niedzielski</surname>, <given-names>N.</given-names></string-name> (<year>1999</year>). <article-title>The effect of social information on the perception of sociolinguistic variables</article-title>. <source>Journal of Language and Social Psychology</source>, <volume>18</volume>(<issue>1</issue>), <fpage>62</fpage>&#8211;<lpage>85</lpage>. <pub-id pub-id-type="doi">10.1177/0261927X99018001005</pub-id></mixed-citation></ref>
<ref id="B115"><mixed-citation publication-type="journal"><string-name><surname>Palan</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Schitter</surname>, <given-names>C.</given-names></string-name> (<year>2018</year>). <article-title>Prolific.ac&#8212;A subject pool for online experiments</article-title>. <source>Journal of Behavioral and Experimental Finance</source>, <volume>17</volume>, <fpage>22</fpage>&#8211;<lpage>27</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbef.2017.12.004</pub-id></mixed-citation></ref>
<ref id="B116"><mixed-citation publication-type="book"><string-name><surname>Pierrehumbert</surname>, <given-names>J.</given-names></string-name> (<year>1995</year>). <chapter-title>Prosodic effects on glottal allophones</chapter-title>. In <string-name><given-names>O.</given-names> <surname>Fujimura</surname></string-name> &amp; <string-name><given-names>M.</given-names> <surname>Hirano</surname></string-name> (Eds.), <source>Vocal fold physiology 8: Voice quality control</source> (pp. <fpage>39</fpage>&#8211;<lpage>60</lpage>). <publisher-name>Singular Press</publisher-name>.</mixed-citation></ref>
<ref id="B117"><mixed-citation publication-type="journal"><string-name><surname>Pillot-Loiseau</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Horgues</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Scheuer</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Kamiyama</surname>, <given-names>T.</given-names></string-name> (<year>2019</year>). <article-title>The evolution of creaky voice use in read speech by native-French and native-English speakers in tandem: A pilot study</article-title>. <source>Anglophonia</source>, <volume>27</volume>. <pub-id pub-id-type="doi">10.4000/anglophonia.2005</pub-id></mixed-citation></ref>
<ref id="B118"><mixed-citation publication-type="journal"><string-name><surname>Pittam</surname>, <given-names>J.</given-names></string-name> (<year>1987</year>). <article-title>Listeners&#8217; evaluations of voice quality in Australian English speakers</article-title>. <source>Language and Speech</source>, <volume>30</volume>(<issue>2</issue>), <fpage>99</fpage>&#8211;<lpage>113</lpage>. <pub-id pub-id-type="doi">10.1177/002383098703000201</pub-id></mixed-citation></ref>
<ref id="B119"><mixed-citation publication-type="journal"><string-name><surname>Podesva</surname>, <given-names>R. J.</given-names></string-name> (<year>2007</year>). <article-title>Phonation type as a stylistic variable: The use of falsetto in constructing a persona</article-title>. <source>Journal of Sociolinguistics</source>, <volume>11</volume>(<issue>4</issue>), <fpage>478</fpage>&#8211;<lpage>504</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-9841.2007.00334.x</pub-id></mixed-citation></ref>
<ref id="B120"><mixed-citation publication-type="book"><string-name><surname>Podesva</surname>, <given-names>R. J.</given-names></string-name> (<year>2013</year>). <chapter-title>Gender and the social meaning of non-modal phonation types</chapter-title>. In <string-name><given-names>C.</given-names> <surname>Cathcart</surname></string-name>, <string-name><given-names>I-H.</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Finley</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Kang</surname></string-name>, <string-name><given-names>C. S.</given-names> <surname>Sandy</surname></string-name>, &amp; <string-name><given-names>E.</given-names> <surname>Stickles</surname></string-name> (Eds.), <source>Proceedings of the Annual Meeting of the Berkeley Linguistics Society</source>, <volume>37</volume>(<issue>1</issue>), <fpage>427</fpage>&#8211;<lpage>448</lpage>. <pub-id pub-id-type="doi">10.3765/bls.v37i1.832</pub-id></mixed-citation></ref>
<ref id="B121"><mixed-citation publication-type="webpage"><string-name><surname>Podesva</surname>, <given-names>R. J.</given-names></string-name> (<year>2018</year>). <chapter-title>The affective roots of gender patterns in the use of creaky voice [Invited presentation]</chapter-title>. <source>Experimental and Theoretical Approaches to Prosody 4</source>. <publisher-name>University of Massachusetts</publisher-name>, <publisher-loc>Amherst</publisher-loc>. <uri>https://www.youtube.com/watch?v=ZtqHpia7Iy8</uri></mixed-citation></ref>
<ref id="B122"><mixed-citation publication-type="thesis"><string-name><surname>Pratt</surname>, <given-names>T. C.</given-names></string-name> (<year>2018</year>). <source>Affective sociolinguistic style: An ethnography of embodied linguistic variation in an arts high school</source> [Doctoral dissertation, <publisher-name>Stanford University</publisher-name>].</mixed-citation></ref>
<ref id="B123"><mixed-citation publication-type="webpage"><string-name><surname>Quenqua</surname>, <given-names>D.</given-names></string-name> (<year>2012</year>, <month>February</month> <day>27</day>). <article-title>They&#8217;re, like, way ahead of the linguistic currrrve</article-title>. <source>New York Times</source>. <uri>https://www.nytimes.com/2012/02/28/science/young-women-often-trendsetters-in-vocal-patterns.html?searchResultPosition=1</uri></mixed-citation></ref>
<ref id="B124"><mixed-citation publication-type="journal"><string-name><surname>Redi</surname>, <given-names>L.</given-names></string-name>, &amp; <string-name><surname>Shattuck-Hufnagel</surname>, <given-names>S.</given-names></string-name> (<year>2001</year>). <article-title>Variation in the realization of glottalization in normal speakers</article-title>. <source>Journal of Phonetics</source>, <volume>29</volume>(<issue>4</issue>), <fpage>407</fpage>&#8211;<lpage>429</lpage>. <pub-id pub-id-type="doi">10.1006/jpho.2001.0145</pub-id></mixed-citation></ref>
<ref id="B125"><mixed-citation publication-type="book"><string-name><surname>Sebregts</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Vriesendorp</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Quen&#233;</surname>, <given-names>H.</given-names></string-name>, &amp; <string-name><surname>White</surname>, <given-names>Y.</given-names></string-name> (<year>2023</year>). <chapter-title>Creaky voice in L2 English and L1 Dutch</chapter-title>. In <string-name><given-names>R.</given-names> <surname>Skarnitzl</surname></string-name> &amp; <string-name><given-names>J.</given-names> <surname>Vol&#237;n</surname></string-name> (Eds.), <source>Proceedings of the 20th International Congress of Phonetic Sciences</source>, <fpage>1841</fpage>&#8211;<lpage>1845</lpage>. <publisher-name>International Phonetic Association</publisher-name>.</mixed-citation></ref>
<ref id="B126"><mixed-citation publication-type="journal"><string-name><surname>Seyfarth</surname>, <given-names>S.</given-names></string-name>, &amp; <string-name><surname>Garellek</surname>, <given-names>M.</given-names></string-name> (<year>2018</year>). <article-title>Plosive voicing acoustics and voice quality in Yerevan Armenian</article-title>. <source>Journal of Phonetics</source>, <volume>71</volume>, <fpage>425</fpage>&#8211;<lpage>450</lpage>. <pub-id pub-id-type="doi">10.1016/j.wocn.2018.09.001</pub-id></mixed-citation></ref>
<ref id="B127"><mixed-citation publication-type="journal"><string-name><surname>Sicoli</surname>, <given-names>M.</given-names></string-name> (<year>2010</year>). <article-title>Shifting voices with participant roles: Voice qualities and speech registers in Mesoamerica</article-title>. <source>Language in Society</source>, <volume>39</volume>(<issue>4</issue>), <fpage>521</fpage>&#8211;<lpage>553</lpage>. <pub-id pub-id-type="doi">10.1017/S0047404510000436</pub-id></mixed-citation></ref>
<ref id="B128"><mixed-citation publication-type="book"><string-name><surname>Sonderegger</surname>, <given-names>M.</given-names></string-name> (<year>2023</year>). <source>Regression modeling for linguistic data</source>. <publisher-name>MIT Press</publisher-name>.</mixed-citation></ref>
<ref id="B129"><mixed-citation publication-type="journal"><string-name><surname>Sonderegger</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>S&#243;skuthy</surname>, <given-names>M.</given-names></string-name> (<year>2025</year>). <article-title>Advancements of phonetics in the 21st century: Quantitative data analysis</article-title>. <source>Journal of Phonetics</source>, <volume>111</volume>, <elocation-id>101415</elocation-id>. <pub-id pub-id-type="doi">10.1016/j.wocn.2025.101415</pub-id></mixed-citation></ref>
<ref id="B130"><mixed-citation publication-type="journal"><string-name><surname>Squires</surname>, <given-names>L.</given-names></string-name> (<year>2013</year>). <article-title>It don&#8217;t go both ways: Limited bidirectionality in sociolinguistic perception</article-title>. <source>Journal of Sociolinguistics</source>, <volume>17</volume>(<issue>2</issue>), <fpage>200</fpage>&#8211;<lpage>237</lpage>. <pub-id pub-id-type="doi">10.1111/josl.12025</pub-id></mixed-citation></ref>
<ref id="B131"><mixed-citation publication-type="journal"><string-name><surname>Staum Casasanto</surname>, <given-names>L.</given-names></string-name> (<year>2010</year>). <article-title>What do listeners know about sociolinguistic variation?</article-title> <source>University of Pennsylvania Working Papers in Linguistics</source>, <volume>15</volume>(<issue>2</issue>), Article <elocation-id>6</elocation-id>.</mixed-citation></ref>
<ref id="B132"><mixed-citation publication-type="webpage"><string-name><surname>Steinmetz</surname>, <given-names>K.</given-names></string-name> (<year>2011</year>, <month>December</month> <day>15</day>). <article-title>Get your creak on: Is &#8220;vocal fry&#8221; a female fad?</article-title> <source>Time</source>. <uri>https://healthland.time.com/2011/12/15/get-your-creak-on-is-vocal-fry-a-female-fad/</uri></mixed-citation></ref>
<ref id="B133"><mixed-citation publication-type="journal"><string-name><surname>Stewart</surname>, <given-names>C. F.</given-names></string-name>, <string-name><surname>Kling</surname>, <given-names>I.</given-names></string-name>, &amp; <string-name><surname>D&#8217;Agosto</surname>, <given-names>A.</given-names></string-name> (<year>2024</year>). <article-title>Modal register, vocal fry, and uptalk: Identification and perceptual judgments of inexperienced listeners</article-title>. <source>Journal of Voice</source>. Advance online publication. <pub-id pub-id-type="doi">10.1016/j.jvoice.2024.02.028</pub-id></mixed-citation></ref>
<ref id="B134"><mixed-citation publication-type="journal"><string-name><surname>Strand</surname>, <given-names>E.</given-names></string-name> (<year>1999</year>). <article-title>Uncovering the role of gender stereotypes in speech perception</article-title>. <source>Journal of Language and Social Psychology</source>, <volume>18</volume>(<issue>1</issue>), <fpage>86</fpage>&#8211;<lpage>100</lpage>. <pub-id pub-id-type="doi">10.1177/0261927X99018001006</pub-id></mixed-citation></ref>
<ref id="B135"><mixed-citation publication-type="journal"><string-name><surname>Strand</surname>, <given-names>E.</given-names></string-name>, &amp; <string-name><surname>Johnson</surname>, <given-names>K.</given-names></string-name> (<year>1996</year>). <article-title>Gradient and visual speaker normalization in the perception of fricatives</article-title>. <source>Natural Language Processing and Speech Technology: Results of the 3rd KONVENS Conference</source>, <fpage>14</fpage>&#8211;<lpage>26</lpage>. <pub-id pub-id-type="doi">10.1515/9783110821895-003</pub-id></mixed-citation></ref>
<ref id="B136"><mixed-citation publication-type="book"><string-name><surname>Stuart-Smith</surname>, <given-names>J.</given-names></string-name> (<year>1999</year>). <chapter-title>Voice quality in Glaswegian</chapter-title>. In <string-name><given-names>J. J.</given-names> <surname>Ohala</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Hasegawa</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Ohala</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Granville</surname></string-name> &amp; <string-name><given-names>A. C.</given-names> <surname>Bailey</surname></string-name> (Eds.), <source>Proceedings of the 14th International Congress of Phonetic Sciences</source>, <fpage>2553</fpage>&#8211;<lpage>2556</lpage>. <publisher-name>Edward Arnold</publisher-name>.</mixed-citation></ref>
<ref id="B137"><mixed-citation publication-type="journal"><string-name><surname>Syrdal</surname>, <given-names>A. K.</given-names></string-name> (<year>1996</year>). <article-title>Acoustic variability in spontaneous conversational speech of American English talkers</article-title>. <source>Proceedings of the 4th International Conference on Spoken Language Processing</source>, <fpage>438</fpage>&#8211;<lpage>441</lpage>. <pub-id pub-id-type="doi">10.1109/ICSLP.1996.607148</pub-id></mixed-citation></ref>
<ref id="B138"><mixed-citation publication-type="journal"><string-name><surname>Szakay</surname>, <given-names>A.</given-names></string-name> (<year>2012</year>). <article-title>Voice quality as a marker of ethnicity in New Zealand: From acoustics to perception</article-title>. <source>Journal of Sociolinguistics</source>, <volume>16</volume>(<issue>3</issue>), <fpage>382</fpage>&#8211;<lpage>397</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-9841.2012.00537.x</pub-id></mixed-citation></ref>
<ref id="B139"><mixed-citation publication-type="book"><string-name><surname>Szakay</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Torgersen</surname>, <given-names>E.</given-names></string-name> (<year>2015</year>). <chapter-title>An acoustic analysis of voice quality in London English: The effect of gender, ethnicity and f0</chapter-title>. In <collab>The Scottish Consortium for ICPhS 2015</collab> (Ed.), <source>Proceedings of the 18th International Congress of Phonetic Sciences</source>, <fpage>1</fpage>&#8211;<lpage>4</lpage>. <publisher-name>University of Glasgow</publisher-name>.</mixed-citation></ref>
<ref id="B140"><mixed-citation publication-type="book"><string-name><surname>Szakay</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Torgersen</surname>, <given-names>E.</given-names></string-name> (<year>2019</year>). <chapter-title>A re-analysis of f0 in ethnic varieties of London English using REAPER</chapter-title>. In <string-name><given-names>S.</given-names> <surname>Calhoun</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Escudero</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Tabain</surname></string-name>, &amp; <string-name><given-names>P.</given-names> <surname>Warren</surname></string-name> (Eds.), <source>Proceedings of the 19th International Congress of Phonetic Sciences</source>, <fpage>1675</fpage>&#8211;<lpage>1678</lpage>. <publisher-name>Australasian Speech Science and Technology Association and International Phonetic Association</publisher-name>.</mixed-citation></ref>
<ref id="B141"><mixed-citation publication-type="journal"><collab>The Institute of Electrical and Electronics Engineers</collab>. (<year>1969</year>). <article-title>IEEE recommended practice for speech quality measurements</article-title>. <source>IEEE Transactions on Audio and Electroacoustics</source>, <volume>17</volume>(<issue>3</issue>), <fpage>225</fpage>&#8211;<lpage>246</lpage>. <pub-id pub-id-type="doi">10.1109/IEEESTD.1969.7405210</pub-id></mixed-citation></ref>
<ref id="B142"><mixed-citation publication-type="journal"><string-name><surname>Urberg-Carlson</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Munson</surname>, <given-names>B.</given-names></string-name>, &amp; <string-name><surname>Kaiser</surname>, <given-names>E.</given-names></string-name> (<year>2008</year>). <article-title>Gradient measures of children&#8217;s speech production: Visual analogue scale and equal appearing interval scale measures of fricative goodness</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>125</volume>(<issue>4</issue>), <elocation-id>2529</elocation-id>. <pub-id pub-id-type="doi">10.1121/1.4783533</pub-id></mixed-citation></ref>
<ref id="B143"><mixed-citation publication-type="journal"><string-name><surname>Uusitalo</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Nyberg</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Laukkanen</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Waaramaa</surname>, <given-names>T.</given-names></string-name>, &amp; <string-name><surname>Rantala</surname>, <given-names>L.</given-names></string-name> (<year>2024</year>). <article-title>Has the prevalence of creaky voice increased among Finnish university students from the 1990&#8217;s to the 2010&#8217;s?</article-title> <source>Journal of Voice</source>, <volume>38</volume>(<issue>3</issue>), <fpage>697</fpage>&#8211;<lpage>702</lpage>. <pub-id pub-id-type="doi">10.1016/j.jvoice.2021.12.006</pub-id></mixed-citation></ref>
<ref id="B144"><mixed-citation publication-type="journal"><string-name><surname>Vaughn</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name><surname>Kendall</surname>, <given-names>T.</given-names></string-name> (<year>2019</year>). <article-title>Stylistically coherent variants: Cognitive representation of social meaning</article-title>. <source>Revista de estudos da linguagem</source>, <volume>27</volume>(<issue>4</issue>), <fpage>1787</fpage>&#8211;<lpage>1830</lpage>. <pub-id pub-id-type="doi">10.17851/2237-2083.0.0.1787-1830</pub-id></mixed-citation></ref>
<ref id="B145"><mixed-citation publication-type="journal"><string-name><surname>Wade</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Embick</surname>, <given-names>D.</given-names></string-name>, &amp; <string-name><surname>Tamminga</surname>, <given-names>M.</given-names></string-name> (<year>2023</year>). <article-title>Dialect experience modulates cue reliance in sociolinguistic convergence</article-title>. <source>Glossa Psycholinguistics</source>, <volume>2</volume>(<issue>1</issue>), <fpage>1</fpage>&#8211;<lpage>30</lpage>. <pub-id pub-id-type="doi">10.5070/G6011187</pub-id></mixed-citation></ref>
<ref id="B146"><mixed-citation publication-type="journal"><string-name><surname>Walker</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Hay</surname>, <given-names>J.</given-names></string-name> (<year>2011</year>). <article-title>Congruence between &#8220;word age&#8221; and &#8220;voice age&#8221; facilitates lexical access</article-title>. <source>Laboratory Phonology</source>, <volume>2</volume>(<issue>1</issue>), <fpage>219</fpage>&#8211;<lpage>237</lpage>. <pub-id pub-id-type="doi">10.1515/labphon.2011.007</pub-id></mixed-citation></ref>
<ref id="B147"><mixed-citation publication-type="journal"><string-name><surname>Walker</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Szakay</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Cox</surname>, <given-names>F.</given-names></string-name> (<year>2019</year>). <article-title>Can kiwis and koalas as cultural primes induce perceptual bias in Australian English-speaking listeners?</article-title> <source>Laboratory Phonology</source>, <volume>10</volume>(<issue>1</issue>), <fpage>1</fpage>&#8211;<lpage>29</lpage>. <pub-id pub-id-type="doi">10.5334/labphon.90</pub-id></mixed-citation></ref>
<ref id="B148"><mixed-citation publication-type="book"><string-name><surname>Weatherholtz</surname>, <given-names>K.</given-names></string-name>, &amp; <string-name><surname>Jaeger</surname>, <given-names>T.</given-names></string-name> (<year>2016</year>). <chapter-title>Speech perception and generalization across talkers and accents</chapter-title>. <source>Oxford research encyclopedia of linguistics</source>. <publisher-name>Oxford University Press</publisher-name>. <pub-id pub-id-type="doi">10.1093/acrefore/9780199384655.013.95</pub-id></mixed-citation></ref>
<ref id="B149"><mixed-citation publication-type="webpage"><string-name><surname>Weber</surname>, <given-names>M. M.</given-names></string-name> (<year>2017</year>, <month>May</month> <day>10</day>). <article-title>Top five most annoying vocal habits</article-title>. <source>Voice Empowerment</source>. <uri>https://www.voiceempowerment.com/voice-empowerment-blog/2017/5/1/ten-most-annoying-vocal-habits-or-5</uri></mixed-citation></ref>
<ref id="B150"><mixed-citation publication-type="journal"><string-name><surname>White</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Penney</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Gibson</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Szakay</surname>, <given-names>A.</given-names></string-name>, &amp; <string-name><surname>Cox</surname>, <given-names>F.</given-names></string-name> (<year>2024</year>). <article-title>Influence of pitch and speaker gender on perception of creaky voice</article-title>. <source>Journal of Phonetics</source>, <volume>102</volume>, <elocation-id>101293</elocation-id>. <pub-id pub-id-type="doi">10.1016/j.wocn.2023.101293</pub-id></mixed-citation></ref>
<ref id="B151"><mixed-citation publication-type="journal"><string-name><surname>Wolk</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Abdelli-Beruh</surname>, <given-names>N.</given-names></string-name>, &amp; <string-name><surname>Slavin</surname>, <given-names>D.</given-names></string-name> (<year>2012</year>). <article-title>Habitual use of vocal fry in young adult female speakers</article-title>. <source>Journal of Voice</source>, <volume>26</volume>(<issue>3</issue>), <fpage>e111</fpage>&#8211;<lpage>e116</lpage>. <pub-id pub-id-type="doi">10.1016/j.jvoice.2011.04.007</pub-id></mixed-citation></ref>
<ref id="B152"><mixed-citation publication-type="journal"><string-name><surname>Woods</surname>, <given-names>K. J. P.</given-names></string-name>, <string-name><surname>Siegel</surname>, <given-names>M. H.</given-names></string-name>, <string-name><surname>Traer</surname>, <given-names>J.</given-names></string-name>, &amp; <string-name><surname>McDermott</surname>, <given-names>J. H.</given-names></string-name> (<year>2017</year>). <article-title>Headphone screening to facilitate web-based auditory experiments</article-title>. <source>Attention, Perception &amp; Psychophysics</source>, <volume>79</volume>(<issue>7</issue>), <fpage>2064</fpage>&#8211;<lpage>2072</lpage>. <pub-id pub-id-type="doi">10.3758/s13414-017-1361-2</pub-id></mixed-citation></ref>
<ref id="B153"><mixed-citation publication-type="journal"><string-name><surname>Wright</surname>, <given-names>R.</given-names></string-name>, <string-name><surname>Mansfield</surname>, <given-names>C.</given-names></string-name>, &amp; <string-name><surname>Panfili</surname>, <given-names>L.</given-names></string-name> (<year>2019</year>). <article-title>Voice quality types and uses in North American English</article-title>. <source>Anglophonia</source>, <volume>27</volume>, <elocation-id>1952</elocation-id>. <pub-id pub-id-type="doi">10.4000/anglophonia.1952</pub-id></mixed-citation></ref>
<ref id="B154"><mixed-citation publication-type="journal"><string-name><surname>Yu</surname>, <given-names>A. C. L.</given-names></string-name> (<year>2022</year>). <article-title>Perceptual cue weighting is influenced by the listener&#8217;s gender and subjective evaluations of the speaker: The case of English stop voicing</article-title>. <source>Frontiers in Psychology</source>, <volume>13</volume>, <elocation-id>840291</elocation-id>. <pub-id pub-id-type="doi">10.3389/fpsyg.2022.840291</pub-id></mixed-citation></ref>
<ref id="B155"><mixed-citation publication-type="journal"><string-name><surname>Yu</surname>, <given-names>A. C. L.</given-names></string-name>, &amp; <string-name><surname>Zellou</surname>, <given-names>G.</given-names></string-name> (<year>2019</year>). <article-title>Individual differences in language processing: Phonology</article-title>. <source>Annual Review of Linguistics</source>, <volume>5</volume>(<issue>1</issue>), <fpage>131</fpage>&#8211;<lpage>150</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-linguistics-011516-033815</pub-id></mixed-citation></ref>
<ref id="B156"><mixed-citation publication-type="journal"><string-name><surname>Yuasa</surname>, <given-names>I.</given-names></string-name> (<year>2010</year>). <article-title>Creaky voice: A new feminine voice quality for young urban-oriented upwardly mobile American women?</article-title> <source>American Speech</source>, <volume>85</volume>(<issue>3</issue>), <fpage>315</fpage>&#8211;<lpage>337</lpage>. <pub-id pub-id-type="doi">10.1215/00031283-2010-018</pub-id></mixed-citation></ref>
<ref id="B157"><mixed-citation publication-type="journal"><string-name><surname>Zellou</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Barreda</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Lahrouchi</surname>, <given-names>M.</given-names></string-name>, &amp; <string-name><surname>Smiljani&#263;</surname>, <given-names>R.</given-names></string-name> (<year>2024</year>). <article-title>Learning a language with vowelless words</article-title>. <source>Cognition</source>, <volume>251</volume>, <elocation-id>105909</elocation-id>. <pub-id pub-id-type="doi">10.1016/j.cognition.2024.105909</pub-id></mixed-citation></ref>
<ref id="B158"><mixed-citation publication-type="journal"><string-name><surname>Zhang</surname>, <given-names>Z.</given-names></string-name> (<year>2021</year>). <article-title>Contribution of laryngeal size to differences between male and female voice production</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>150</volume>(<issue>6</issue>), <fpage>4511</fpage>&#8211;<lpage>4521</lpage>. <pub-id pub-id-type="doi">10.1121/10.0009033</pub-id></mixed-citation></ref>
<ref id="B159"><mixed-citation publication-type="journal"><string-name><surname>Zhang</surname>, <given-names>Z.</given-names></string-name>, &amp; <string-name><surname>Kirby</surname>, <given-names>J.</given-names></string-name> (<year>2020</year>). <article-title>The role of <italic>F<sub>0</sub></italic> and phonation cues in Cantonese low tone perception</article-title>. <source>Journal of the Acoustical Society of America</source>, <volume>148</volume>(<issue>1</issue>), <fpage>EL40</fpage>&#8211;<lpage>EL45</lpage>. <pub-id pub-id-type="doi">10.1121/10.0001523</pub-id></mixed-citation></ref>
</ref-list>
</back>
</article>