1. Introduction

1.1. General background: Sound symbolism

Most modern linguistic theories in the twentieth century generally assumed that the relationship between sounds and meanings is in principle arbitrary. This thesis of arbitrariness is famously attributed to Saussure (1916) as well as to Hockett (1959), the latter of whom proposed that arbitrariness is one of the design features that distinguish human languages from other animals’ communication systems.1 On the other hand, the idea that there can be systematic and iconic mappings between sounds and meanings is also old, going back to Plato’s Cratylus. There are several instances of such systematic sound-meaning connections that seem to hold across languages. For example, the seminal experimental work by Sapir (1929) demonstrated that English speakers feel nonce words containing low back vowels (e.g., mal) to be larger than those containing high front vowels (e.g., mil), a finding that has been replicated multiple times across many different languages (e.g., Berlin, 1994, 2006; Blasi, Wichman, Hammarström, Stadler, & Christianson, 2016; Johansson, 2017; Johansson, Anikin, Carling, & Holmer, 2020; Newman, 1933; Shinohara & Kawahara, 2016; Thompson & Estes, 2011; Ultan, 1978; C. Westbury, Hollis, Sidhu, & Pexman, 2018, though cf. Diffloth, 1994). Another well-known example is that some sounds (e.g., voiceless obstruents) tend to be associated with angular shapes, whereas other sounds (e.g., sonorant consonants) tend to be associated with round shapes (e.g., Ahlner & Zlatev, 2010; Aveyard, 2012; Bremner et al., 2013; D’Onofrio, 2014; Kawahara & Shinohara, 2012; Köhler, 1929; Maurer, Pathman, & Mondloch, 2006; Nielsen & Rendall, 2013; Ramachandran & Hubbard, 2001; C. Westbury et al., 2018, though cf. Rogers & Ross, 1975 and Styles & Gawne, 2017). Such systematic associations between sounds and meanings are generally referred to as ‘sound symbolism’ (Hinton, Nichols, & Ohala, 1994).

In recent years, the field has witnessed a rapidly growing body of renewed interest in sound symbolism, which is actively studied from a wide range of perspectives. The rise of interest in sound symbolism is partly evidenced by the fact that there have been so many overview articles written recently on this phenomenon, each with different areas of focus and coverage (Akita, 2015; Akita & Dingemanse, 2019; Cuskley & Kirby, 2013; Dingemanse, 2018; Dingemanse, Blasi, Lupyan, Christiansen, & Monaghan, 2015; Kawahara, 2020b; Lockwood & Dingemanse, 2015; Nielsen & Dingemanse, 2020; Nuckolls, 1999; Perniss, Thompson, & Vigiliocco, 2010; Perniss & Vigiliocco, 2014; Schmidtke, Conrad, & Jacobs, 2014; Sidhu & Pexman, 2018; Spence, 2011; Svantesson, 2017; C. Westbury et al., 2018).2 There are several reasons why sound symbolism is currently attracting intensive attention from scholars in different branches of language science in particular, and from those in the cognitive science community more generally. For example, it has been argued that sound symbolism plays a non-negligible role in language acquisition, both in the context of first and second language acquisition (Asano et al., 2015; Imai & Kita, 2014; Maurer et al., 2006; Nielsen & Dingemanse, 2020; Nygaard, Cook, & Namy, 2009; Perniss et al., 2010; Perniss & Vigiliocco, 2014; Perry, Perlman, Winter, Massaro, & Lupyan, 2018) as well as in speech processing (Asano et al., 2015; Aveyard, 2012; Perniss & Vigiliocco, 2014; C. Westbury, 2005). Sound symbolism may have played a significant role in the origin of human languages and how they have evolved (Blasi, Christiansen, Wichman, & Hammarström, 2014; Cabrera, 2012; Cuskley & Kirby, 2013; Haiman, 2018; Perlman & Lupyan, 2018; Ramachandran & Hubbard, 2001). 
Studies on sound symbolism may also shed light on the issue of the place of linguistic knowledge within the larger cognitive system, since sound symbolism may be considered as an instance of a more general cross-modal synesthetic correspondence between multiple modalities (Bankieris & Simner, 2015; Cuskley & Kirby, 2013; Spence, 2011). The potential application of sound-symbolism for marketing research and sports science is also actively explored (Klink, 2000, 2009; Konstantina & Friedemann, 2020; Shinohara, Kawahara, & Tanaka, 2020; Shinohara, Yamauchi, Kawahara, & Tanaka, 2016; Yorkston & Menon, 2004).

1.2. Questions addressed in this study: Cumulativity

Research on sound symbolism has been flourishing, and these studies have revealed various intriguing aspects of sound symbolic patterns in natural languages (see the overview papers cited in Section 1.1). However, one issue that has been markedly under-explored is whether sound symbolism shows cumulative patterns or not (Kawahara, 2020b), a gap that the current experiments attempt to address. We elaborate on the notion of cumulativity in further detail below, but put most simply for the sake of exposition, the question is this: when two or more sounds with the same sound symbolic meaning co-occur, do they yield a greater combined effect than each of them does in isolation?

The issue of cumulativity in sound symbolism is important to study since it bears upon the general question of how speakers make a linguistic decision when multiple sources of evidence are available. Broadly speaking, there are two general strategies actively discussed in the decision-making literature (Gigerenzer & Gaissmaier, 2011). One is the well-known regression-based model, in which a decision is made based on all pieces of evidence that are available. Each piece of evidence is associated with some weight (or cogency), and the final decision is based on some form of weighted sum of the evidence. The other strategy is known as ‘fast-and-frugal’ decision making, in which a decision is made solely in terms of the most important evidence, and other less important evidence is disregarded. The regression-based model predicts cumulative patterns, whereas the fast-and-frugal model predicts non-cumulative patterns.
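The contrast between the two strategies can be made concrete with a minimal Python sketch. The cue weights and function names below are hypothetical and chosen only for illustration, and the fast-and-frugal function is a deliberately simplified stand-in (only the heaviest cue that is present counts) rather than a full implementation of any model from the decision-making literature.

```python
from typing import Sequence


def weighted_sum_decision(cues: Sequence[int], weights: Sequence[float]) -> float:
    """Regression-style strategy: every cue that is present adds its weight."""
    return sum(w * c for w, c in zip(weights, cues))


def fast_and_frugal_decision(cues: Sequence[int], weights: Sequence[float]) -> float:
    """Simplified frugal strategy: only the heaviest cue that is present counts."""
    present = [w for w, c in zip(weights, cues) if c]
    return max(present, default=0.0)


# Hypothetical weights for three cues (not estimated from any data).
WEIGHTS = (2.0, 1.5, 1.0)

one_cue = (1, 0, 0)    # only the strongest cue is present
two_cues = (1, 1, 0)   # the strongest cue plus a weaker one

print(weighted_sum_decision(one_cue, WEIGHTS), weighted_sum_decision(two_cues, WEIGHTS))
print(fast_and_frugal_decision(one_cue, WEIGHTS), fast_and_frugal_decision(two_cues, WEIGHTS))
```

Under these assumed weights, adding the second cue raises the weighted-sum score but leaves the frugal score unchanged, which is the cumulative versus non-cumulative contrast at issue.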

The difference between these two decision-making strategies (that is, whether cumulativity holds or not) is particularly actively explored in the theoretical phonology literature, since it bears on the question of whether the phonological optimization algorithm should be based on rankings (as in Optimality Theory: Prince & Smolensky, 1993/2004) or on weights (as in Harmonic Grammar: Legendre, Miyata, & Smolensky, 1990). The latter framework predicts that cumulativity is the norm, whereas the former approach explicitly disallows it. In addition, whether linguistic patterns are cumulative or not is a topic that is extensively discussed in several other areas of language science, including the laboratory phonology tradition. The empirical patterns discussed from this perspective include sociolinguistic variation (C.-J. N. Bailey, 1973; Guy & Boberg, 1994, 1997), phonotactic judgment patterns (T. Bailey & Hahn, 2001; Coleman & Pierrehumbert, 1997; Hay, Pierrehumbert, & Beckman, 2003), speech errors (Rose & King, 2007), phonological alternations (e.g., Blust, 2012; Smith & Pater, 2020; Zuraw & Hayes, 2017), as well as diachronic changes in syntax (Kroch, 1989; Zimmermann, 2017) (see Breiss, 2020 and Kawahara, 2020a for recent overviews of these studies). Inspired by this body of work, the current experiments address the general issue of cumulativity in the context of sound symbolism.
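The difference between ranking-based and weight-based evaluation can be illustrated with a toy Python sketch. The constraint names (C1, C2, C3), the violation profiles, and the weights are all hypothetical; the sketch only shows how strict ranking blocks a ganging-up effect that weighting allows, and it is not meant as an implementation of either framework in full.

```python
from typing import Dict, Sequence

Violations = Dict[str, int]

# Hypothetical candidates: A violates the top constraint once,
# B violates two lower constraints once each.
CANDIDATES: Dict[str, Violations] = {
    "A": {"C1": 1, "C2": 0, "C3": 0},
    "B": {"C1": 0, "C2": 1, "C3": 1},
}


def ot_winner(candidates: Dict[str, Violations], ranking: Sequence[str]) -> str:
    """Strict ranking: compare violation vectors lexicographically, top constraint first."""
    return min(candidates, key=lambda name: tuple(candidates[name][c] for c in ranking))


def hg_winner(candidates: Dict[str, Violations], weights: Dict[str, float]) -> str:
    """Weighted evaluation: the candidate with the smallest weighted violation sum wins."""
    return min(
        candidates,
        key=lambda name: sum(weights[c] * v for c, v in candidates[name].items()),
    )


RANKING = ("C1", "C2", "C3")               # C1 >> C2 >> C3 (assumed)
WEIGHTS = {"C1": 3.0, "C2": 2.0, "C3": 2.0}  # assumed weights

print(ot_winner(CANDIDATES, RANKING))  # B: one C1 violation is fatal for A
print(hg_winner(CANDIDATES, WEIGHTS))  # A: C2 and C3 together (4.0) outweigh C1 (3.0)
```

With these assumed values, the two lower constraints gang up under weighting to overturn the outcome that strict ranking enforces.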

It has been conventional to distinguish two types of cumulativity (Jäger, 2007; Jäger & Rosenbach, 2006): counting cumulativity and ganging-up cumulativity. In the context of sound symbolism, counting cumulativity holds if two occurrences of the same sound evoke a stronger sound symbolic image than one occurrence. Take the famous case of [a] generally being judged to be larger than [i] (Sapir, 1929 et seq.). Counting cumulativity holds if a form like [CaCa] is judged to be larger than [CaCi], which itself is judged to be larger than [CiCi]. In order for ganging-up cumulativity to hold, there must be two different sounds, [A] and [B], which evoke the same sound symbolic image. Ganging-up cumulativity holds when the simultaneous occurrence of [A] and [B] evokes a stronger image than a single occurrence of [A] or [B]. To take a more concrete example, both labial consonants and rounded vowels are known to be associated with round images (D’Onofrio, 2014). Ganging-up cumulativity holds if the combination of these sounds (e.g., [bu]) evokes a stronger image than a single occurrence of a labial consonant (e.g., [bi]) or a rounded vowel alone (e.g., [du]). This paper contributes two new experiments which explore whether and how these two types of cumulativity hold in sound symbolic patterns.
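The two definitions can be restated as simple checks over judgment data. The following Python sketch assumes that each form has already been summarized as a single mean rating; the function names and all numerical values are hypothetical and serve only to make the two definitions explicit.

```python
from typing import Mapping


def shows_counting_cumulativity(rating_by_count: Mapping[int, float]) -> bool:
    """True if ratings strictly increase with the number of occurrences of a sound."""
    ordered = [rating_by_count[n] for n in sorted(rating_by_count)]
    return all(lower < higher for lower, higher in zip(ordered, ordered[1:]))


def shows_ganging_up(a_only: float, b_only: float, both: float) -> bool:
    """True if two different sounds together outscore either sound on its own."""
    return both > max(a_only, b_only)


# Hypothetical mean 'large' ratings by number of [a] vowels:
# 0 = [CiCi], 1 = [CaCi], 2 = [CaCa].
size_by_a_count = {0: 0.30, 1: 0.50, 2: 0.70}

# Hypothetical mean 'round' ratings for [bi], [du], and [bu].
roundness = {"bi": 0.55, "du": 0.60, "bu": 0.80}

print(shows_counting_cumulativity(size_by_a_count))
print(shows_ganging_up(roundness["bi"], roundness["du"], roundness["bu"]))
```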

1.3. Previous studies and the current research questions

As stated above, the issue of cumulativity did not receive much attention in the research on sound symbolism until very recently. For counting cumulativity, there are two impressionistic reports which suggest that sound symbolism may function in a cumulative fashion. One example comes from the ideophone system of the Tsugaru dialect of Japanese, in which two voiced obstruents evoke stronger sound symbolic meanings than one voiced obstruent (Hamano, 2013). The other example comes from the ideophone system in Korean (Martin, 1962, cited by McCarthy, 1983), in which tense consonants signal intensification, and the degree of intensification is stronger when there are two segments than when there is only one segment. However, both of these patterns are based on impressionistic reports without robust quantitative support.

Recent experiments by Kawahara and Kumagai (in press) build upon these observations and show that two voiced obstruents evoke stronger sound symbolic images than one voiced obstruent in Japanese. Kawahara (2020d) studied how name lengths, measured in terms of mora counts, affect the judgment of Pokémon evolvedness (see Section 1.4 below for further details) and showed that each additional mora increased the probability of nonce names being judged as evolved Pokémon characters. Kawahara, Suzuki, and Kumagai (2020) demonstrated a similar effect of mora counts on judged attack values in Pokémon move names. Finally, Kawahara, Katsuda, and Kumagai (2019) analyzed the set of Japanese Takarazuka actress names, and showed that the number of sonorants contained in the names positively correlates with the probability of those names being used as female names.3

There are a handful of previous studies on ganging-up cumulativity in sound symbolism. Thompson and Estes (2011) experimentally investigated the well-known observation that some sets of sounds (back vowels, sonorants, and voiced stops) are associated with large referents (e.g., Sapir, 1929). The results show that the larger the object, the more likely it is that the assigned names contain such sounds that symbolically signal largeness. D’Onofrio (2014) studied the bouba-kiki effect, in which some sounds tend to be associated with round figures, whereas other sounds tend to be associated with angular figures (Ramachandran & Hubbard, 2001). She found that consonant voicing, consonant place of articulation, and vowel backness cumulatively affect this round/angular judgment pattern. Ahlner and Zlatev (2010) likewise argue for the ganging-up cumulative pattern using the bouba-kiki paradigm with Swedish speakers, although their results are not very clear because of a ceiling effect (Kawahara, 2020a). Kawahara, Suzuki, and Kumagai (2020) and Kawahara (2020d) showed that in addition to the effects of name length, the presence of a voiced obstruent also plays a role in the judgment of Pokémon names, suggesting that counting cumulativity and ganging-up cumulativity can coexist within a single sound symbolic pattern. See Kawahara (2020a) for a few other potential cases of cumulative patterns in sound symbolism.

The studies reviewed in this subsection seem to suggest that sound symbolic patterns are generally cumulative, supporting the idea that some regression-based mechanism, rather than a fast-and-frugal decision-making mechanism, governs the sound symbolic patterns in natural languages. However, the number and the scope of existing case studies, especially those based on quantitative evidence, are limited, and we believe that this issue of cumulativity in sound symbolism should be explored with more case studies. In addition, while this body of research seems to have established that cumulativity generally holds in sound symbolic patterns, it has at the same time opened up several new questions regarding precisely how cumulativity works in sound symbolism. These how-questions have recently started receiving attention from some theoretical phonologists, in order to further our understanding of how cumulativity works in linguistic patterns in general (Breiss, 2020; Breiss & Albright, 2020; Hayes, 2020; McPherson & Hayes, 2016; Zuraw & Hayes, 2017). Against the backdrop of these theoretical developments, the current experiments address the four specific questions summarized in 1.

1. Specific research questions addressed in the current study

  1. Can more than two factors interact cumulatively?

  2. Can counting cumulativity and ganging-up cumulativity coexist in a single system?

  3. In the case of ganging-up cumulativity, are the results linearly cumulative or sub/super-linearly cumulative?

  4. In the case of counting cumulativity, are the effects linearly cumulative or sub/super-linearly cumulative?

The first question is important to address, partly because of the lack of empirical studies that directly explored this question—to the best of our knowledge, D’Onofrio (2014) is the only study which clearly showed cumulative interactions of three different factors in sound symbolism.4 A regression-based model predicts that three factors can cumulatively affect one sound symbolic pattern, whereas a fast-and-frugal model predicts that the single factor with the most salient sound symbolic meaning will determine the sound symbolic image of a particular word, predicting non-cumulative patterns. The latter model predicts that the cumulative interaction of two factors is impossible, let alone that of three factors.

The second question is also under-explored in the context of sound symbolism in particular, and in other domains of linguistic patterns in general. The only studies which addressed this question in the context of sound symbolism are Kawahara, Suzuki, and Kumagai (2020) and Kawahara (2020d), whose experiments show that counting cumulativity (the effect of name length) and ganging-up cumulativity (the additive effects of name length and voiced obstruents) hold in the same sound symbolic pattern of Pokémon names in Japanese. In the regression-based model, the natural prediction is that if we manipulate the relevant factors in a certain way, both counting cumulativity and ganging-up cumulativity will show up together within a single pattern, again contrasting with the fast-and-frugal model, which predicts neither type of cumulative pattern. We find this question to be particularly interesting, because some recent work on probabilistic phonology has started revealing patterns in which counting cumulativity and ganging-up cumulativity coexist (Breiss, 2020; Hayes, 2020; McPherson & Hayes, 2016; Zuraw & Hayes, 2017).

The last two questions in 1 delve into the nature of cumulativity in further detail; just because the effects of two factors cumulatively add up, it does not have to be the case that the result is the linear sum of the contributions of the two factors. Instead, cumulativity can manifest itself as a sub-linear or super-linear pattern. A schematic, simplified way to think about this question is that if 1 + 1 = 2 holds, that is a case of linear cumulativity. It is conceivable, however, that 1 + 1 could result in 2.5, which is a case of super-linear cumulativity, or that 1 + 1 could result in 1.7, which would be a case of sub-linear cumulativity. Recent studies which tested cumulativity effects in phonotactic learning experiments show that non-linear cumulative patterns are indeed possible (Breiss & Albright, 2020). In part inspired by this study, the last two questions in 1 are geared toward addressing precisely how cumulativity manifests itself in quantitative patterns of sound symbolism. To the best of our knowledge, this issue—the (non-)linearity of cumulative patterns—has never been rigorously addressed in the literature on sound symbolism, and we hope that the current study offers a first stepping stone toward more extensive research on this issue.
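The schematic arithmetic above can be written down as a small classification routine. This Python sketch assumes that the individual and combined effects have already been measured on a common scale; the choice of scale (for instance, raw proportions versus log-odds) is itself an analytical assumption that the sketch does not settle, and the function name is hypothetical.

```python
import math


def classify_cumulativity(effect_a: float, effect_b: float, effect_both: float,
                          tol: float = 1e-9) -> str:
    """Compare a combined effect with the sum of the two individual effects."""
    if effect_both <= max(effect_a, effect_b):
        return "non-cumulative"   # the combination adds nothing beyond the stronger factor
    expected = effect_a + effect_b
    if math.isclose(effect_both, expected, abs_tol=tol):
        return "linear"
    return "super-linear" if effect_both > expected else "sub-linear"


# The three schematic cases from the text.
print(classify_cumulativity(1, 1, 2))     # linear
print(classify_cumulativity(1, 1, 2.5))   # super-linear
print(classify_cumulativity(1, 1, 1.7))   # sub-linear
```

In practice the comparison would be made statistically rather than by exact arithmetic, since measured effects carry sampling error.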

1.4. Pokémonastics

The main purpose of the current paper is to address the nature of cumulativity in sound symbolic patterns in natural languages; however, the current studies can also be understood as case studies of Pokémonastics, a research paradigm in which researchers explore the nature of sound symbolism in human languages using Pokémon names (Kawahara, Noto, & Kumagai, 2018; Shih et al., 2019). This subsection briefly summarizes what we take to be advantages of this research paradigm. We make it clear at this point, however, that while the current study is a case study of Pokémonastics, Pokémonastics itself arose building on the large body of research on sound symbolism (see Section 2.1). In this sense, the current studies should also be understood as case studies of general sound symbolism research.

One advantage of Pokémonastics is the fact that, as of 2020, there are more than 800 Pokémon characters, which allows for quantitative analyses of sound symbolism using real, albeit made-up, names. This N is larger than what is usually analyzed in the cross-linguistic studies of sound symbolism in real words; e.g., 40 basic vocabulary items (Blasi et al., 2016; Wichmann, Holman, & Brown, 2010), 245 vocabulary items (Johansson et al., 2020), 28 antonym pairs (Johansson & Zlatev, 2013), and 112 male names and 151 female names (Pitcher, Mesoudi, & McElligott, 2013). We hasten to add, however, that many of these studies cover a wide range of languages (e.g., thousands of languages were analyzed by Blasi et al., 2016). Thus, we do not at all intend to claim that studying Pokémon names is inherently superior (see also footnote 5).

Additionally, Pokémon characters are specified for various attributes, such as weight, height, strengths, evolution levels, and types. This nature of the Pokémon characters allows researchers to address the general question of which semantic concept can be symbolically expressed in natural languages (e.g., Lupyan & Winter, 2018; Sidhu, Deschamps, Bourdage, & Pexman, 2019; C. Westbury et al., 2018). For example, Kawahara and Kumagai (in press) show that in Japanese, voiced obstruents can symbolically express weight, height, evolution levels as well as strengths, demonstrating the multi-dimensionality of sound symbolism (Winter, Pérez-Sobrino, & Lucien, 2019). Other recent studies have revealed that concepts that are as complex as Pokémon types (such as flying, dark, and fairy) can be symbolically represented in several languages (Godoy, Gomes, Kumagai, & Kawahara, 2020; Kawahara, Godoy, & Kumagai, 2020; Kawahara, Kumagai, & Godoy, 2020; Uno et al., 2020).

Another advantage is the fact that the set of denotations that are assigned a name is fixed across all languages in the Pokémon universe. In this respect, the Pokémonastics project has important precedents, i.e., Berlin (1994, 2006), who compared the names of the same animals in different languages. In his studies too, the set of semantic denotations is fixed and constant across all the languages that are examined. This feature does not necessarily hold in other domains of natural languages, because languages can differ in terms of which real world object to name. For instance, Japanese needs to distinguish ‘older brother’ and ‘younger brother’ and does not have a word that corresponds to the English word ‘brother.’ Neither does it have a gender neutral term ‘siblings.’ This cross-linguistic difference can present an analytical complexity for a cross-linguistic comparison of real words in the studies of sound symbolism. The following quote from Johansson et al. (2020) illustrates the complication, as well as how they overcame it:

For example, when concepts were found to have multiple forms (e.g., gender inflections), only the unmarked form was selected to ensure comparability across languages, as long as relevant information about the meaning was provided through the lexical entries or grammatical descriptions, i.e., in the singular nominative for accusative systems, in the singular absolutive for ergative systems, and so forth. In many languages, the same concept can have a number of different roots or versions…which makes it difficult to know which form of a group of words is the unmarked one. Likewise, throughout languages, most concepts also have several synonyms. Therefore, all phonemes from all forms in these cases were combined into a single string rather than selecting only one of the forms to represent the concept in question. For example, the three English forms of the third person singular personal pronoun (he, she and it) were analyzed as a single word with six phonemes… (p. 268)

We do not wish to imply that this complication is insurmountable, as the quote above shows, but it does present an additional layer of analytical complexity, possibly increasing researcher degrees of freedom (Roettger, 2019). On the other hand, this sort of complication does not arise when studying Pokémon names, since the set of denotations is fixed across languages (Shih et al., 2019). The Pokémon universe therefore has the potential to provide a well-controlled testing ground for cross-linguistic comparisons of sound symbolism.5

In short, the Pokémon universe makes it possible to conduct a quantitative study of sound symbolic patterns in an ecologically realistic setting. In this spirit, Shih et al. (2019) report a cross-linguistic study of Pokémon names in Cantonese, English, Japanese, Korean, Mandarin, and Russian. Even if Pokémon names are not available in a particular language, one can run an elicitation study to explore how Pokémon creatures would be named in that language. Godoy, de Souza Filho, Marques de Souza, Alves, and Kawahara (2020) report a study of this sort with native speakers of Brazilian Portuguese.

In Pokémonastics, it is also possible to conduct experiments to explore which Pokémon properties are symbolically expressed, how they are expressed, and in which languages. For example, how evolution levels are symbolically expressed has been explored in Japanese (Kawahara & Kumagai, 2019), Brazilian Portuguese (Godoy, de Souza Filho, et al., 2020), and English (Kawahara & Moore, 2021). Evolved Pokémon characters are generally larger, heavier, and stronger (see Figure 1 for an example). These studies have revealed interesting cross-linguistic commonalities as well as differences. For instance, in all three of these languages, nonce names with [a] tend to be judged to be more suitable for post-evolution Pokémon characters than nonce names with [i]. The effects of voiced obstruents are detectable in all three languages (larger, post-evolution characters tend to be associated with names with voiced obstruents), but the effect size is evidently larger for Japanese than for English and Brazilian Portuguese.

Figure 1

The pair of pictures used to illustrate pre-evolution versus post-evolution Pokémon characters, drawn by the digital artist toto-mame. These pictures are generally judged to be authentic by Pokémon practitioners, and were used in the experiment with permission from the artist. Her website, where one can view other pictures of original Pokémon characters, can be found at https://t0t0mo.jimdofree.com (last access, December 2020).

The majority of the experimental Pokémonastics studies, however, is still limited to those targeting Japanese speakers. In order for the Pokémonastic paradigm to provide a useful resource for cross-linguistic studies of sound symbolism in general, more studies targeting languages other than Japanese are hoped for. In addition to the issue of cumulativity in sound symbolism, this is another gap in the Pokémonastics literature that the current experiments are intended to address.

2. Experiment 1

2.1. Introduction

In order to address the theoretical and empirical issues outlined above, the experiment manipulated three linguistic factors: (i) vowel quality ([a] versus [i]), (ii) voicing of obstruents (voiced versus voiceless), and (iii) name length (short versus long). The main purpose of the experiment was to examine whether these three factors interact cumulatively or not. This design also allows us to address another question regarding the nature of cumulativity—whether the cumulative effects are linear or sub/super-linear (Breiss & Albright, 2020).

In addition to addressing the nature of cumulativity in sound symbolism, each of the sound symbolic associations that is tested in the experiment has a plausible phonetic or psycholinguistic basis (Kawahara, 2020b). The experiment thus serves an additional purpose of testing the robustness of sound symbolic effects that are grounded in our speech behavior, even if the sound symbolic effects at issue are not evidently observable in the Pokémon lexicon.

The first factor, the vowel quality difference ([a] versus [i]), instantiates a well-known sound symbolic effect, in which the vowel [a] is associated with large images, whereas the vowel [i] is associated with small images (e.g., Berlin, 2006; Jespersen, 1922; Newman, 1933; Sapir, 1929; Thompson & Estes, 2011; Ultan, 1978). One plausible phonetic basis of this sound symbolic principle is the difference in oral aperture size: The mouth is much wider open for [a] than for [i], and this difference may be projected onto the different size judgments (Jespersen, 1922; Sapir, 1929). Another plausible phonetic explanation is their difference in F2: [i] has very high F2, whereas [a] has low F2. Given that formant frequency is inversely proportional to the size of a resonating chamber, sounds with high frequency energy are generally associated with small images (Ohala, 1983b, 1994).
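The inverse relationship between resonator size and resonant frequency can be illustrated with the textbook idealization of a uniform tube closed at one end, whose resonances are F_n = (2n - 1)c / 4L. The Python sketch below is only that idealization: the tube lengths and the speed of sound (35,000 cm/s) are illustrative assumptions, and real vowel formants depend on the full vocal tract shape rather than on a single length.

```python
from typing import List


def tube_resonances(length_cm: float, n_formants: int = 3,
                    speed_of_sound_cm_s: float = 35000.0) -> List[float]:
    """Resonances (Hz) of an idealized uniform tube closed at one end."""
    return [
        (2 * n - 1) * speed_of_sound_cm_s / (4 * length_cm)
        for n in range(1, n_formants + 1)
    ]


# A 17.5 cm tube versus a tube half that length (both lengths are illustrative).
print(tube_resonances(17.5))   # [500.0, 1500.0, 2500.0]
print(tube_resonances(8.75))   # [1000.0, 3000.0, 5000.0]
```

Halving the length of the idealized tube doubles every resonance, which is the sense in which higher-frequency energy can signal a smaller resonating body.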

These sound symbolic associations ([a]=big, [i]=small) have been shown to hold for English speakers in previous experiments on sound symbolism (Newman, 1933; Sapir, 1929; Shinohara & Kawahara, 2016). Within Pokémonastics, a previous experiment has shown that English speakers indeed associate names containing [a] with post-evolution names and names containing [i] with pre-evolution names (Kawahara & Moore, 2021), even though these sound symbolic associations do not seem to be inferrable from the existing English Pokémon lexicon (Shih et al., 2019). A similar sound symbolic effect is observed in other Pokémonastics experiments targeting Japanese speakers (Kumagai & Kawahara, 2019) and Brazilian Portuguese speakers (Godoy, de Souza Filho, et al., 2020).

The second factor that is manipulated in the experiment is obstruent voicing. Newman (1933) found that English speakers tend to judge nonce words with voiced obstruents to be larger than those with voiceless obstruents (see also C. Westbury et al., 2018). Articulatorily speaking, the production of voiced obstruents requires expansion of the supralaryngeal cavity (Ohala, 1983a; Proctor, Shadle, & Iskarous, 2010; J. R. Westbury, 1983); this expansion occurs because it is necessary to keep the intraoral air pressure sufficiently low with respect to the subglottal air pressure level in order to sustain vocal fold vibrations (Ohala, 1983a). Acoustically speaking, voiced obstruents involve low frequency energy in three respects: (1) they are characterized by low f0 as well as low F1 in surrounding vowels (Kingston & Diehl, 1994, 1995), (2) burst energies are lower for voiced obstruents than for voiceless obstruents (Chodroff & Wilson, 2014), and (3) at least intervocalically, voiced obstruents are characterized by low frequency energy which reflects vocal fold vibration (a ‘voice bar’) (Stevens & Blumstein, 1981). These low frequency properties, which are demonstrably integrated into one perceptual property (Kingston & Diehl, 1995; Kingston, Diehl, Kirk, & Castleman, 2008), can be mapped onto large images, because of the general inverse relationship between the size of a resonator and its resonating frequency (Ohala, 1983b, 1994).

Shih et al. (2019) did not identify a correlation between evolution levels and the number of voiced obstruents contained in the names in the set of existing Pokémon names in English. The first Pokémonastics experiment by Kawahara and Kumagai (2019) showed that English speakers tend to associate nonce names containing voiced obstruents with post-evolution characters, whereas they tend to associate nonce names containing voiceless obstruents with pre-evolution characters, although the size of the difference found in the experiment was not very large. The primary target of that experiment, moreover, was Japanese speakers; the stimuli were therefore Japanese-sounding words consisting solely of CV syllables, which could have sounded unnatural to English speakers. A follow-up experiment by Kawahara and Moore (2021) identified a similar trend for English speakers to associate names having voiced obstruents with larger post-evolution characters, although the effect of voicing was not statistically significant in that study. The current experiment therefore addresses, with a larger number of test items and participants, whether we can identify the effects of obstruent voicing on the judgment of evolution in Pokémon names.

The third factor, phonological length, was first identified as an active sound symbolic principle in the existing set of Japanese Pokémon names (Kawahara et al., 2018). They found that those Pokémon characters with longer names are generally larger, heavier, stronger, and more evolved. They attribute this observation to a previously known sound symbolic principle, ‘the iconicity of quantity,’ in which larger quantity is expressed via phonological length (Haiman, 1980, 1984).6 A follow-up cross-linguistic study of existing Pokémon names by Shih et al. (2019) targeting Cantonese, English, Japanese, Korean, Mandarin, and Russian shows that the iconicity of quantity is the sound symbolic principle that most robustly holds across these languages, including English. Two experimental studies confirmed the productivity of this principle using nonce names with English speakers (Kawahara & Kumagai, 2019; Kawahara & Moore, 2021).

To recap, building upon the previous studies on Pokémonastics, which themselves are inspired by the general studies of sound symbolism, the current experiment manipulated three phonological dimensions (vowel quality, obstruent voicing, and name length) to examine whether each of these factors impacts the judgment of evolvedness in Pokémon names. More importantly, to the extent that these factors impact the judgment of evolvedness, an ensuing question was whether they do so cumulatively, and if so, how.

2.2. Methods

2.2.1. Stimuli

Experiment 1 had three factors which were fully crossed; six items were included in each cell. The full list of the stimuli is given in Table 1. The stimuli had either two voiceless stops or two voiced stops in onset position; word-initially, two items had labial stops, two items had coronal stops, and two items had dorsal stops. The short names were of the form CVC.CVC, where coda consonants were sonorants so that there was a sonority fall across the syllable boundaries (Vennemann, 1988). Long names were of the form CrVC.ClVC; the first complex onset was created using an additional [r], because this is the only consonant that can form a complex cluster with any preceding stop in English (Massaro & Cohen, 1983). The second complex onset was created using [l] in order to avoid two occurrences of [r] within the same word, which can lead to misidentification of the correct number of [r]s (Hall, 2012). The vowels were either [i] or [a].

Table 1

The list of stimuli for Experiment 1.

Voiceless stops
[i] short   [i] long    [a] short   [a] long
pinkin      prinklin    pankan      pranklan
pintil      prinslim    pantal      pranslam
tinsin      trinslin    tansan      translan
timpim      trimplim    tampam      tramplam
kimpin      krimplin    kampan      kramplan
kintil      krinslin    kantal      kranslan

Voiced stops
[i] short   [i] long    [a] short   [a] long
bingin      bringlin    bangan      branglan
bindil      brinzlim    bandal      branslam
dinzin      drinzlin    danzan      dranzlan
dimbim      drimblim    dambam      dramblam
gimbin      grimblin    gamban      gramblan
gindil      grinzlin    gandal      granzlan

2.2.2. Participants

The experiment was distributed online via SurveyMonkey. The responses were collected using the ‘buy response’ function of SurveyMonkey. A total of 150 participants, who passed all the inclusion criteria (see Section 2.2.3), completed the experiment.

2.2.3. Procedure

The first page of the experiment was a consent form, which was approved by the first author’s institute. The second page presented the qualification questions, and only those who fulfilled all four of the following conditions were allowed to proceed: (1) they were a native speaker of English, (2) they were familiar with Pokémon, (3) they were not already familiar with sound symbolism, and (4) they had not participated in a Pokémonastics experiment before.

In the instructions, the participants were told that the experiment was about how they would name new Pokémon characters. They were also told that two aspects of Pokémon are crucial: (1) Pokémon characters undergo evolution, and when they do, they are called by a different name, and (2) when Pokémon characters undergo evolution, they generally become larger, heavier, and stronger. The participants were provided with an example that illustrates the difference between a pre-evolution character and a post-evolution character using a pair of non-existing Pokémon characters, shown in Figure 1.

Within each trial, the participants were given one nonce name and asked to judge whether that name was better for a pre-evolution character or a post-evolution character, i.e., the task was to make a binary decision. The stimuli were presented in English orthography, although the participants were asked to read each stimulus in their head before making their responses.7 They were asked to base their decision on their intuition, without thinking too much about ‘right’ or ‘wrong’ answers. The order of the stimuli was uniquely randomized for each participant.

2.2.4. Statistical analyses

The results for this experiment, as well as those for Experiment 2 below, were analyzed using hierarchical Bayesian mixed effects logistic regression using the brms R package (Bürkner, 2017). The dependent variable was evolvedness (0 = pre-evolution, 1 = post-evolution). The model included binary fixed effects of length, vowel, and consonant voicing, with all two- and three-way interactions, plus random intercepts for participant and item, and random slopes of all fixed effects and interactions by participant. We ran four chains of 2,000 iterations each, retaining for analysis the second 1,000 samples from each chain. Weakly informative, ‘default’ priors were used. All R̂ values were between 1 and 1.01, indicating that the chains mixed successfully.
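
To make the fixed-effects structure concrete, the following minimal sketch enumerates the eight cells of the fully crossed design and builds the corresponding fixed-effect predictors (an intercept, three main effects, three two-way interactions, and one three-way interaction). It assumes 0/1 treatment coding, which the text does not specify, and it leaves out the random effects entirely. The factor names and the helper function are illustrative and are not taken from the supplementary R markdown file.

```python
from itertools import combinations, product

FACTORS = ("length", "vowel", "voicing")  # illustrative names, not from the paper


def fixed_effect_row(cell):
    """Return the fixed-effect predictors for one design cell.

    `cell` maps each factor name to 0 or 1 (assumed treatment coding).
    """
    row = {"intercept": 1}
    # Main effects (order 1), then two- and three-way interactions as products.
    for order in (1, 2, 3):
        for combo in combinations(FACTORS, order):
            value = 1
            for name in combo:
                value *= cell[name]
            row[":".join(combo)] = value
    return row


# The 2 x 2 x 2 fully crossed design: eight cells.
cells = [dict(zip(FACTORS, levels)) for levels in product((0, 1), repeat=3)]
design = [fixed_effect_row(cell) for cell in cells]

for cell, row in zip(cells, design):
    print(cell, row)
```

Under this coding, the eight cells are described by eight fixed-effect coefficients, and an interaction predictor is non-zero only when all of the factors it involves are at their second level.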

Bayesian models yield a distribution of possible values for each parameter of interest, which can be interpreted by examining the middle 95% of these values, called the 95% Credible Interval (abbreviated as ‘95% CI,’ followed by bracketed lower and upper bounds). We can interpret these values directly as our degree of belief in the estimate of the role of the factor in explaining the data (see e.g., Franke & Roettger, 2019; Kruschke & Liddell, 2018; Vasishth, Nicenboim, Beckman, Li, & Jong Kong, 2018), with positive coefficients (β’s) indicating that the factor increases post-evolution responses.
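
As an illustration of how such an interval is read off a set of posterior samples, the sketch below computes the middle 95% of a vector of draws. The draws are simulated stand-ins (a normal distribution with arbitrary mean and spread), not the posterior from the model reported here, and the function name is hypothetical.

```python
import numpy as np


def credible_interval(samples, mass=0.95):
    """Return the (lower, upper) bounds of the central `mass` interval."""
    tail = (1.0 - mass) / 2.0
    lower, upper = np.quantile(samples, [tail, 1.0 - tail])
    return float(lower), float(upper)


# Simulated stand-in for posterior draws of one coefficient
# (4 chains x 1,000 retained samples = 4,000 draws).
rng = np.random.default_rng(seed=1)
draws = rng.normal(loc=0.9, scale=0.2, size=4000)

lower, upper = credible_interval(draws)
print(f"95% CI [{lower:.2f}, {upper:.2f}]")
print("interval excludes 0:", lower > 0 or upper < 0)
```

An interval that excludes 0 corresponds to the situation described in the text as a factor meaningfully predicting the responses.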

Taking a Bayesian approach has two advantages in the current context. First, this method generally allows us to fit the complex model with multiple interaction terms justified by the experimental design without convergence issues. The second advantage is that the Bayesian approach allows us to directly assess how meaningful the interaction terms are in the explanation of the data, rather than merely telling us whether we can reject a null hypothesis or not, as in frequentist (that is, non-Bayesian) analyses. These two advantages are important because linearity of cumulative interaction can be examined in light of how meaningful the interaction terms in question are.

2.3. Results

The results are graphically represented in Figure 2, in which each dot represents the ‘evolved response’ for each item averaged across all the participants. We observe that each phonological factor affected the judgment of evolvedness in the expected direction. Long names were more likely to be associated with evolved characters than short names (left panels versus right panels). Names with [a] were more likely to be associated with evolved characters than names with [i] (top panels versus bottom panels). The names with voiced obstruents were more likely to be associated with evolved characters than names with voiceless consonants (comparisons within each panel).

Figure 2

The results of Experiment 1. Each dot represents the ‘evolved response’ for each item, averaged over all the participants.

The results of the hierarchical Bayesian logistic mixed effects model show that name length (long versus short, β = 0.90, 95% CI [0.53, 1.26]), vowel quality ([a] versus [i], β = 1.85, 95% CI [1.41, 2.29]), and consonant voicing (voiced versus voiceless, β = 0.60, 95% CI [0.28, 0.92]) all meaningfully predicted participants’ responses: each factor increased ‘evolved’ responses, and none of their CIs includes 0. This result indicates that each linguistic factor cumulatively affected the judgment of evolution status of Pokémon characters for English speakers. On the other hand, the CIs for all of the interaction terms were more or less centered around 0, suggesting that there is no strong evidence that the interaction terms are playing a substantial role (see the R markdown file in the supplementary material for complete model details, including their ROPE analyses: Kruschke & Liddell, 2018; Makowski, Ben-Shachar, & Lüdecke, 2019). For the case at hand, it seems plausible to infer that the cumulativity is linear, i.e., the effects of the three factors appear to be additive.

2.4. Discussion

The current results first of all show that three phonological factors can independently impact the judgment of evolvedness in naming new Pokémon characters. Further, the fact that each factor exerts its own effect regardless of the presence of other factors is evidence that cumulativity of three factors is possible in judgment concerning sound symbolism. In other words, the results instantiate a clear case of ganging-up cumulativity of three factors. We submit that this is an interesting, if not entirely novel, result—recall that D’Onofrio (2014) is the only study in the existing literature which clearly showed this three-way cumulative pattern in sound symbolism.

The current results show that there is no strong evidence that the interaction terms play a clearly substantial role in predicting the participants’ judgments for the case at hand. In other words, it appears that when two or three factors are relevant, the probabilities of the outcomes can plausibly be predicted based on the summed contribution (in log-odds) of each factor at play. For the case at hand, then, the cumulative effects appear to be linear (although see the R markdown file for complete details).
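
As a worked illustration of what additivity in log-odds implies, the sketch below sums the posterior mean coefficients reported in Section 2.3 and converts them to odds ratios. It assumes that each β is the change in log-odds between the two levels of its factor (for example, under 0/1 coding) and that the interaction terms are treated as zero. The intercept is not reported in the text, so only relative odds, not absolute probabilities, are computed.

```python
import math

# Posterior means reported in Section 2.3 (log-odds scale).
betas = {"length": 0.90, "vowel": 1.85, "voicing": 0.60}

# Additivity in log-odds means the coefficients simply sum ...
summed_log_odds = sum(betas.values())

# ... which is equivalent to multiplying the individual odds ratios.
odds_ratios = {name: math.exp(beta) for name, beta in betas.items()}
combined_odds_ratio = math.exp(summed_log_odds)

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.2f}")
print(f"all three combined: log-odds = {summed_log_odds:.2f}, "
      f"odds ratio = {combined_odds_ratio:.1f}")
```

Under these assumptions, a long name with [a] and voiced obstruents would have roughly 28 times the odds of an ‘evolved’ response relative to a short name with [i] and voiceless obstruents.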

Finally, as discussed in section 2.1, the sound symbolic effects of vowels and voiced obstruents on evolution levels are not observed in the existing English Pokémon lexicon (Shih et al., 2019), while these sound symbolic effects are observed in other experimental settings (Kawahara & Moore, 2021; Kumagai & Kawahara, 2019). The current results lend further support to the thesis that English speakers are able to apply these sound symbolic associations to new Pokémon names, even when these associations are not evidently apparent in the existing Pokémon patterns.

3. Experiment 2

In order to further our understanding of the nature of cumulativity in sound symbolic patterns, Experiment 2 tested counting cumulativity by varying name lengths in three degrees (short versus medium versus long). To test whether counting cumulativity and ganging-up cumulativity can co-exist, this factor was crossed with the binary vowel quality difference tested in Experiment 1.

3.1. Methods

The experimental procedures were almost identical to those of Experiment 1, so we only highlight the important differences.

3.1.1. Stimuli

The list of the stimuli in Experiment 2 is shown in Table 2. Short forms were disyllabic CV.CV words, and the vowels were either [i] or [a]; the word-initial consonants were voiceless stops (three items for [p], [t], [k] each), and the second consonants were sonorants. Medium forms were of the form CVC.CVC, in which onset consonants were voiceless stops and coda consonants were sonorants, which guaranteed a sonority fall across the syllable boundary (Vennemann, 1988). Long forms were of the form CCVC.CCVC. Each onset consonant cluster is an attested sequence in English phonotactics, and there was a sonority fall across the syllable boundaries.

Table 2

The list of stimuli for Experiment 2.

[i] short   [i] medium   [i] long
pini        pinkin       prinklin
pimi        pimpil       primplim
pili        piltim       prilslim
tini        tinsin       trinkrin
timi        timpim       trimplim
tili        tilpil       trilspil
kimi        kimpin       krimplin
kini        kintil       krinslin
kili        kiltim       kriltrim

[a] short   [a] medium   [a] long
pana        pankan       pranklan
pama        pampal       pramplam
pala        paltam       pralslam
tana        tansan       trankran
tama        tampam       tramplam
tala        talpal       tralspal
kama        kampan       kramplan
kana        kantal       kranslan
kala        kaltam       kraltram

3.1.2. Participants

A total of 147 native speakers of English participated in this experiment. They were recruited online from two universities in the United States (University of Toledo and UCLA), as well as from the ‘Psychological research on the net’ website.8 Nine participants were excluded from the subsequent analysis because they failed to satisfy one or more of the inclusion criteria: (1) they were a native speaker of English, (2) they were familiar with Pokémon, (3) they were not familiar with sound symbolism, and (4) they had not participated in a Pokémonastics experiment before.

3.1.3. Statistical analyses

Taking the theoretical quantity of length as a continuous scale, we coded the length factor numerically as 1, 2, and 3. Other aspects of the analysis were identical to those of Experiment 1, although we report an additional analysis to examine the question of whether the counting cumulativity pattern is linear or sub/super-linear in Section 3.2.2.

3.2. Results

3.2.1. General results

Figure 3 illustrates the general results by presenting the ‘evolved response’ for each item, averaged over all the participants. We observe that as the names got longer, they were more likely to be judged as names for post-evolution characters (the left panel versus middle panel versus right panel). Within each panel, we observe that names with [a] were judged to be more likely for post-evolution characters than names with [i].

Figure 3

The results of Experiment 2, showing the cumulative effect of name length modulated by the vowel quality difference. Each dot represents average ‘evolved response’ for each item.

The mixed-effects Bayesian modeling analysis shows that the binary variable of vowel ([a] versus [i], β = 2.04, 95% CI [0.69, 3.50]) as well as the continuous variable of length (1-unit increase, β = 2.67, 95% CI [2.15, 3.17]) both meaningfully predicted participants’ judgments of evolvedness, while the interaction of the two factors did not (β = 0.15, 95% CI [-0.51, 0.77]). Again, see the R markdown file for complete details. We conclude that both counting (length 1 versus 2 versus 3) and ganging-up (vowel + length) cumulativity obtained. Since there was no strong evidence that the interaction played a clearly meaningful role, the ganging-up cumulativity between vowel quality and name length may be considered to be linear, just as in Experiment 1.

3.2.2. Probing the linearity of counting cumulativity

To assess whether the counting cumulativity was linear or not, we re-fit the model above with length as a three-level unordered factor. We then used the posterior samples returned by the Bayesian model to calculate the distributions of probable values of the log-odds of being judged evolved for each combination of length and vowel quality. The results are plotted in Figure 4.

Figure 4

Estimates of the log-odds of being judged evolved for each level of length, divided by vowel quality.

Since we are interested in whether the change in log-odds when moving from short to medium is different from that of moving from medium to long, we subtracted the adjacent categories from each other, yielding distributions over differences in Figure 5.

Figure 5

Differences in effect of length by category, divided by vowel quality.

Finally, we can use these distributions to answer the question of whether counting cumulativity is linear, sub-linear, or super-linear. Linear cumulativity means that the log-odds of being judged evolved increases by the same amount for each adjacent pair of levels. If this were the case in Experiment 2, we would expect the pink and blue distributions to be entirely overlapping; to the degree that they are not, the cumulativity is sub-linear (pink below blue) or super-linear (blue below pink).

To more directly visualize the linearity of counting cumulativity, we can simply examine whether the difference between these two distributions is positive, negative, or centered on zero. Further, we can average across the two vowel qualities, since our hypothesis in (1d) above does not concern whether the linearity of counting cumulativity itself differs by what it is ganged with; such a question is interesting, but beyond the scope of conclusions that can be reasonably drawn using this experiment.9 This yields the difference in differences in log-odds of being judged evolved between the two levels of counting cumulativity, plotted in Figure 6. We find that the vast majority of credible values for this difference are above zero: β = 1.44, 95% CI [0.45, 2.45]. We therefore conclude that the counting cumulativity observed in Experiment 2 is sub-linear: The increase in likelihood of perceived evolvedness going from medium to long length is less than that associated with going from short to medium.
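
The logic of this difference-in-differences check can be sketched as follows. The draws below are simulated stand-ins for the posterior log-odds of each length level (with arbitrary means and spreads), not the samples in the supplementary materials, and the sketch skips the averaging across vowel qualities; the function name is hypothetical.

```python
import numpy as np


def difference_in_differences(short, medium, long_):
    """Per-draw (medium - short) minus (long - medium), in log-odds."""
    first_step = medium - short
    second_step = long_ - medium
    return first_step - second_step


# Simulated stand-ins for posterior draws of the log-odds at each length.
rng = np.random.default_rng(seed=2)
n_draws = 4000
short = rng.normal(-2.0, 0.2, n_draws)
medium = rng.normal(0.5, 0.2, n_draws)
long_ = rng.normal(1.5, 0.2, n_draws)

did = difference_in_differences(short, medium, long_)
lower, upper = np.quantile(did, [0.025, 0.975])
print(f"mean = {did.mean():.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
# An interval entirely above 0 points to sub-linear cumulativity,
# entirely below 0 to super-linear, and centered on 0 to linear.
```

Computing the difference draw by draw, rather than from summary statistics, keeps whatever correlation exists between the cell estimates within each posterior sample.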

Figure 6

Differences in effect of length by category.

3.3. Discussion

Experiment 2 demonstrated that counting and ganging-up cumulativity simultaneously obtain in the domain of sound symbolism (Kawahara, 2020d; Kawahara, Suzuki, & Kumagai, 2020), paralleling recent findings from other areas of probabilistic phonological patterns (Breiss, 2020; Hayes, 2020; McPherson & Hayes, 2016; Zuraw & Hayes, 2017). The result also seems to suggest that the multiple levels of length intersect with vowel without much complication, i.e., can likely be modeled without positing a substantial interaction term.

Going beyond the question of whether cumulativity holds in sound symbolism or not, we found that ganging-up cumulativity (the interaction between the length factor and the vowel factor) seems to be linearly cumulative, while there is strong evidence that counting cumulativity (the gradual increase along the continuous length dimension) is sub-linear. Where this difference comes from is an interesting question, but it is beyond the scope of this paper to provide an answer. Nevertheless, the current results open an opportunity for future investigation on cumulativity in sound symbolism and other linguistic patterns to address how cumulativity manifests itself in which contexts (Breiss & Albright, 2020).

4. General discussion

4.1. Summary of the results

The two experiments reported in this paper have shown that sound symbolic effects operate cumulatively when English speakers are provided with new names for Pokémon characters and are asked to judge their evolution status. One may ask if the observed cumulative patterns are surprising at all; i.e., they could have been otherwise. Our answer is positive. Going back to the general issue of how speakers make linguistic decisions (Section 1.2), the participants could have resorted to a fast-and-frugal decision-making strategy (Gigerenzer & Gaissmaier, 2011); for example, they could have assigned all names with [i] to pre-evolution characters and those with [a] to post-evolution characters, especially given that these sound symbolic effects are so robust (Jespersen, 1922; Sapir, 1929 et seq). Or, given that the iconicity of quantity (Haiman, 1980, 1984) is such a robust principle in the Pokémon universe (Shih et al., 2019), they could have assigned all long names to post-evolution characters, and could have made their decision solely based on that criterion. However, the current results show that English speakers did not resort to such fast-and-frugal decision-making strategies: They instead probabilistically took all factors (vowel quality, consonant voicing, and different degrees of length) into consideration when they decided whether each name belonged to a pre-evolution character or a post-evolution character. As such, we do not take the current results to be trivial—they provide evidence that speakers take into account multiple sources of information when they make sound symbolic judgments, as predicted by a regression-based model.

As discussed in Section 1.2, the issue of how cumulativity works in sound symbolism has been relatively understudied (Kawahara, 2020b), especially in light of the recent dramatic rise of interest in the phenomena (Nielsen & Dingemanse, 2020). Partly to address this gap in the literature and partly inspired by the increasing body of work on cumulativity in other linguistic patterns reviewed in Section 1.2, we have found that (1) three factors can interact cumulatively (Experiment 1), (2) counting cumulativity and ganging-up cumulativity can co-exist within the same system (Experiment 2), (3) the ganging-up cumulativity patterns appear to be linear in general (Experiments 1 and 2), and (4) at least in the case of name length studied in the current experiments, there is fairly strong evidence that the counting cumulativity is sub-linear (Experiment 2). We do not at all pretend that the current experiments offer a final answer to the question of how cumulativity works in sound symbolism in general; neither do we intend to argue that these results should generalize to all other cases of sound symbolism, let alone other linguistic patterns. Nevertheless, we submit that only through case studies of this kind will we understand how cumulativity functions in sound symbolism in particular, and other linguistic patterns more generally.

4.2. Remaining issues

To expand on this last point, while our results demonstrated that the sound symbolic pattern in Pokémon names shows cumulative properties, the current results do not necessarily entail that all sound symbolic patterns have to operate this way. There are several issues that can and should be addressed building on what we already know, and what we have now learned, about how cumulativity works in sound symbolism. For example, the semantic notion that was studied in the current experiments is evolvedness, which is closely related to the gradable, scalar notions of size, weight, and strength. The previous studies which addressed cumulativity in sound symbolism (reviewed in Section 1.2) also tend to target gradable and scalar notions such as size (Kawahara & Kumagai, in press; Thompson & Estes, 2011), intensification (Martin, 1962; McCarthy, 1983), and roundness/angularity (Ahlner & Zlatev, 2010; D’Onofrio, 2014). Thus, an interesting question which should be addressed in future research is whether non-scalar semantic notions (such as dead versus alive) show cumulative sound symbolic patterns.10

Since the issue of cumulativity in sound symbolism is generally understudied, we are unable to offer a full-fledged answer to this question in this paper. The only study that we know of which may bear on this question is Kawahara et al. (2019), who analyzed the names of Takarazuka actresses and found that the number of sonorants in their names positively correlates with the probability of those names being used as female names. This pattern instantiates a case in which we observe cumulative effects for a semantic notion that is not gradable (i.e., Takarazuka gender). More case studies are needed to fully address the question of whether non-gradable semantic dimensions can show cumulative sound symbolic effects.

A related question is whether the difference between social and referential meanings matters with respect to whether, and how, cumulativity holds in sound symbolic patterns. The same question can be asked with respect to the difference between propositional meanings and attitudinal meanings. These questions too are interesting ones, although they can only be answered with different sets of quantitative studies. While the range of semantic dimensions that can be studied in Pokémonastics is fairly wide (size, weight, strength, type, etc.), we obviously need to go beyond Pokémonastics to address all of these questions.

While cumulativity seems to be the norm in sound symbolism, as the studies reviewed in Section 1.2 as well as the current results show, there may also be cases in which one segment has such a strong sound symbolic meaning that one occurrence deterministically conveys that meaning, in which case cumulativity is unexpected. Palatalization found in Japanese baby-talk, which symbolically expresses ‘childishness,’ may instantiate such an example, where one instance of palatalization makes the whole utterance undoubtedly ‘child-like’ (Kawahara, 2020a; Sawada, 2013), although quantitative examination of this phenomenon is also yet to be conducted.

All in all, we hope that our paper stimulates more research on this question—how cumulativity manifests itself for what kinds of semantic meanings in what contexts—not only in sound symbolism but also in other domains of our speech behavior.

4.3. Contributions to the studies of sound symbolism

In addition to addressing the nature of cumulativity in sound symbolism, the current experiments have contributed toward expanding available data on Pokémonastics, a resource that can be used for cross-linguistic comparisons of sound symbolic patterns (Shih et al., 2019). As reviewed in Section 2.1, vowel quality was known to affect the judgment of evolution status for Japanese speakers (Kumagai & Kawahara, 2019) and Brazilian Portuguese speakers (Godoy, de Souza Filho, et al., 2020). While this effect was also shown to be productive among English speakers (Kawahara & Moore, 2021), the current replication of the effect lends further support for the robustness of this sound symbolic pattern across languages.

The fact that we found an effect of voiced obstruents in Experiment 1 is also encouraging, as in one of the previous studies, the effect was not significant (Kawahara & Moore, 2021). The current experiment shows that with a sufficient number of items and speakers, we can, with a reasonable amount of confidence, identify a sound symbolic effect of voiced obstruents even among English speakers. This sound symbolic effect too was previously identified as active among Japanese speakers (Kumagai & Kawahara, 2019) and Brazilian Portuguese speakers (Godoy, de Souza Filho, et al., 2020), a cross-linguistic parallel that we find intriguing and important. It may be the case that sound symbolic values of voiced obstruents are grounded in the articulatory/acoustic properties of these sounds (see Section 2.1), and hence may be available to speakers of different languages (Shinohara & Kawahara, 2016). Again, we do not pretend that studying three languages suffices, but it points to a hypothesis that phonetically-motivated sound symbolic patterns are universally available to speakers of different languages (see e.g., Bremner et al., 2013; Imai & Kita, 2014; Kawahara, 2020b; Ohala, 1994; Shinohara & Kawahara, 2016 for relevant discussion).

This hypothesis is further supported by our results which showed clear effects of name length, identified both in Experiments 1 and 2. The iconicity of quantity is a well-known sound symbolic principle (Dingemanse et al., 2015; Haiman, 1980, 1984), which has been shown to hold in the existing names of Pokémon characters in various languages (Shih et al., 2019). Its productivity has been confirmed with experimentation for Japanese speakers (Kumagai & Kawahara, 2019) and Brazilian Portuguese speakers (Godoy, de Souza Filho, et al., 2020). This robustness may have to do with the fact that this sound symbolic principle also has a clear cognitive basis, in which the quantity of a vector in one perceptual domain can be iconically mapped onto the quantity in another perceptual domain (Marks, 1978).

4.4. Formal phonology and research on sound symbolism

We would like to emphasize at this point that our investigation of the (non-)linearity of cumulativity in sound symbolism is inspired by several studies on this topic conducted in the formal phonology community (Breiss, 2020; Hayes, 2020; McPherson & Hayes, 2016; Zuraw & Hayes, 2017, especially Breiss & Albright, 2020). If not for this research program, we would not have addressed the same issue in sound symbolism. Neither might we have realized that cumulativity is an understudied topic in sound symbolism in the first place. In this sense, we maintain that studies in formal phonology can inform research on sound symbolism by potentially identifying aspects of sound symbolism that are understudied.

Recall that the exploration of the cumulative nature of linguistic patterns is an actively debated topic in the theoretical phonology literature, because it bears on the question of choosing between Optimality Theory (Prince & Smolensky, 1993/2004) and other theories of grammar with weighted constraints such as Harmonic Grammar (Legendre et al., 1990). Viewed more generally, exploring cumulativity allows us to address the question of what sort of decision-making strategy (regression-based versus fast-and-frugal heuristics; Gigerenzer & Gaissmaier, 2011) is best suited to model our linguistic behavior. We hope to have shown by way of case studies that it is useful to study linguistic patterns, including sound symbolic patterns, from this general perspective of decision-making strategies.

As briefly touched upon in Section 1.2, and more extensively reviewed in Breiss (2020) and Kawahara (2020a), it seems to be the case that the regression-based framework, which predicts cumulative patterns, is better suited to model our speech behavior, including phonological alternations and surface phonotactic judgment patterns. To the extent that cumulativity is a general property of these phonological patterns as well as that of sound symbolic patterns, it raises the possibility that they may share a non-trivial property. Sound symbolism has long been considered as residing outside the purview of ‘the core’ phonological knowledge (Alderete & Kochetov, 2017; Kawahara, 2020b). However, the current results point to a hypothesis that sound symbolism may not be as irrelevant to formal phonology as was once believed, a hypothesis that has recently been put forth by several researchers (Alderete & Kochetov, 2017; Jang, 2020; Kawahara, 2020b; Kawahara et al., 2019; Kumagai, 2019; Shih, 2020). Some of these studies even argue that phonological systems and sound symbolic principles are integrated so tightly that sound symbolic principles can trigger phonological alternations (Alderete & Kochetov, 2017; Jang, 2020; Kumagai, 2019).

The current results lend further support to the general hypothesis that phonological patterns and sound-symbolic patterns share non-trivial properties, and hence can and should be studied in tandem with each other. Recall that Experiment 2 revealed a sub-linear cumulativity pattern, and a recent study showed that such a non-linear pattern is also possible in phonotactic judgment patterns (Breiss & Albright, 2020). The fact that we are discovering such non-linear cumulative patterns both in sound symbolism and other linguistic patterns is intriguing. Recall also that Experiment 2 found the co-existence of counting cumulativity and ganging-up cumulativity, paralleling some recent findings in probabilistic phonological patterns (McPherson & Hayes, 2016; Zuraw & Hayes, 2017). These new findings give further credence to the possibility that a similar sort of mechanism may lie behind phonological knowledge and sound symbolic knowledge.11

To summarize, we believe that formal phonology can inform research on sound symbolism. We further hope that the relationship can be a mutually beneficial one. To the extent that there are non-trivial parallels between sound symbolic patterns and other phonological patterns, we may be able to study sound symbolic patterns to explore the general nature of linguistic patterns as well (Kawahara, 2020b). To conclude, we hope to have demonstrated, by way of case studies, the potential for formal phonology and studies on sound symbolism to inform one another.

Data Accessibility Statement

The experimental data from the current experiments as well as the R markdown files are available as supplementary materials at https://osf.io/7phjv/.


  1. Although these two writers are often cited as the influential figures who had established the arbitrariness thesis at the center of modern linguistic theories, there are many precedents who have made similar statements, including Locke (1689), Whitney (1867), as well as Hermogenes in Cratylus (Plato).
  2. Nielsen and Dingemanse (2020) offer some quantitative evidence for this trend, based on the number of relevant publications found in Web of Science.
  3. The sound symbolic connection between sonorants and femaleness holds in languages other than Japanese. See Sidhu and Pexman (2019) for a recent review.
  4. Although Thompson and Estes (2011) examined three types of sounds (i.e., back vowels, sonorants, and voiced stops), they collapsed them together in one group, so it is not clear how the effects of the three types of sounds interacted in their results.
  5. We hasten to add that studies of sound symbolism using real words, such as basic vocabularies, are no less important than the Pokémonastics studies, as they have been revealing important aspects of sound symbolism (Berlin, 1994, 2006; Blasi et al., 2016; Johansson, 2017; Johansson & Zlatev, 2013; Johansson et al., 2020; Pitcher et al., 2013; Ultan, 1978; Wichmann et al., 2010). These studies tend to target many different languages, and in that sense, the scope is much wider than the current scope of the Pokémonastics studies. Moreover, while we can study various semantic dimensions in Pokémonastics, it is not the case that we can study all semantic dimensions that are of interest. In short, these different studies have different advantages, and aim to address different, albeit related, questions, although ultimately, we are all interested in explicating the nature of sound symbolism in natural languages. In other words, Pokémonastic studies and those studies using real words should complement each other. Other potential benefits of the Pokémonastics approach, not discussed in further depth in this paper, include their potential application to teaching and public outreach (Kawahara, 2020b, 2020c; MacKenzie, 2018). Since Pokémon is a widely known game franchise, it can attract attention from students and non-experts.
  6. This sound symbolic principle may arguably be grounded in the domain-general iconic mapping between the length of a vector in one modality and the size of a vector in another modality (Marks, 1978). See also Dingemanse et al. (2015) for how this principle may manifest itself in other linguistic patterns.
  7. Since Pokémon names are often communicated in written forms, and since the previous Pokémonastics experiments used orthographic stimuli, the current experiment followed that methodology (Kawahara & Kumagai, 2019; Kawahara & Moore, 2021). It is possible that orthography has some effects on sound symbolic patterns (Cuskley, Simner, & Kirby, 2017), but it has also been shown that sound symbolism holds beyond the influences of orthography (Sidhu, Pexman, & Saint-Aubin, 2016).
  8. https://psych.hanover.edu/research/exponnet.html (last accessed December 2020).
  9. An interested reader can find all samples from the posterior distribution yielded by the Bayesian model, which underlie the data presented here and the assessment of linearity, in the supplementary materials.
  10. An anonymous reviewer offered a very interesting example which can be tested to address this specific question. To quote: “[m]eanings like ‘big,’ ‘small,’ ‘evolved,’ etc. are arguably linear and open-ended in scale. But meanings like ‘androgynous,’ for example, might not be linear in the same way, such that, e.g., pitch raising up to a certain point of a male voice can reliably signal it, but beyond a certain threshold, not so much.” We are not in a position to offer an explicit answer to this specific question; it must await another quantitative study. We also note, however, that even if we found the pattern described by the reviewer, it would still be a case of counting cumulativity, albeit possibly a (strongly) sub-linear cumulative pattern.
  11. Yu Tanaka (p.c.) pointed out a potential challenge to this thesis in the context of Experiment 2. In that experiment, long forms had complex onsets (i.e., more segments) while medium and short forms did not (i.e., fewer segments), and the sound symbolic effect at issue was sensitive to that difference. However, it was long believed in the formal phonology literature that it is coda consonants, not onset consonants, that contribute to metrical weight in phonology (e.g., Hayes, 1989), possibly pointing to a divergence between phonology and sound symbolic patterns. It is possible, though, that metrical weight is better defined in terms of V-to-V intervals (Ryan, 2016; Steriade, 2008). Since the long forms in our stimuli do have longer V-to-V intervals than the medium or short forms, to the extent that metrical weight is defined in terms of V-to-V intervals, this would be a case of convergence rather than divergence.

Ethics and Consent

Experiments 1 and 2 were conducted under the ethical approval granted by the first author’s institution. A subset of the participants for Experiment 2 was recruited from the UCLA experiment participant pool, which was approved by the second author’s institution. A consent form was provided to the participants before the experiments.


Acknowledgements

We are grateful to two anonymous reviewers for Laboratory Phonology and the associate editor, who provided constructive comments on the previous version of this paper, as well as to Caroline Menezes, who helped us with participant recruitment for Experiment 2. We would also like to thank the participants at the Annual Meeting on Phonology 2020, the Phonological Society of Japan, and the NINJAL prosody meeting. All remaining errors are ours.

Funding Information

This project is supported by the JSPS grants #17K13448 and #18H03579, the research money granted to the Keio Institute of Cultural and Linguistic Studies and NINJAL collaborative research project ‘Cross-linguistic Studies of Japanese Prosody and Grammar’ to Shigeto Kawahara. This work is also supported in part by NSF Graduate Research Fellowship DGE-1650604 to Canaan Breiss.

Competing Interests

The authors have no competing interests to declare.


References

Ahlner, F., & Zlatev, J. (2010). Cross-modal iconicity: A cognitive semiotic approach to sound symbolism. Sign Systems Studies, 38(1/4), 298–348. DOI:  http://doi.org/10.12697/SSS.2010.38.1-4.11

Akita, K. (2015). Sound symbolism. In J.-O. Östman & J. Verschueren (Eds.), Handbook of pragmatics, installment 2015. Amsterdam and Philadelphia: John Benjamins. DOI:  http://doi.org/10.1075/hop.19.sou1

Akita, K., & Dingemanse, M. (2019). Ideophones (mimetics, expressives). In M. Aronoff (Ed.), Oxford research encyclopedia of linguistics. Oxford: Oxford University Press. DOI:  http://doi.org/10.1075/ill.16

Alderete, J., & Kochetov, A. (2017). Integrating sound symbolism with core grammar: The case of expressive palatalization. Language, 93, 731–766. DOI:  http://doi.org/10.1353/lan.2017.0056

Asano, M., Imai, M., Kita, S., Kitaji, K., Okada, H., & Thierry, G. (2015). Sound symbolism scaffolds language development in preverbal infants. Cortex, 63, 196–205. DOI:  http://doi.org/10.1016/j.cortex.2014.08.025

Aveyard, M. E. (2012). Some consonants sound curvy: Effects of sound symbolism on object recognition. Memory & Cognition, 40(1), 83–92. DOI:  http://doi.org/10.3758/s13421-011-0139-3

Bailey, C.-J. N. (1973). Variation and linguistic theory. Arlington: Center for Applied Linguistics.

Bailey, T., & Hahn, U. (2001). Determinants of wordlikeness: Phonotactics or lexical neighborhoods? Journal of Memory and Language, 44, 568–591. DOI:  http://doi.org/10.1006/jmla.2000.2756

Bankieris, K., & Simner, J. (2015). What is the link between synaesthesia and sound symbolism? Cognition, 136, 186–195. DOI:  http://doi.org/10.1016/j.cognition.2014.11.013

Berlin, B. (1994). Evidence for pervasive synesthetic sound symbolism in ethnozoological nomenclature. In L. Hinton, J. Nichols, & J. Ohala (Eds.), Sound symbolism (pp. 76–93). Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511751806.006

Berlin, B. (2006). The first congress of ethnozoological nomenclature. Journal of the Royal Anthropological Institute, 12, 23–44. DOI:  http://doi.org/10.1111/j.1467-9655.2006.00271.x

Blasi, D. E., Wichmann, S., Hammarström, H., Stadler, P. F., & Christiansen, M. H. (2016). Sound-meaning association biases evidenced across thousands of languages. Proceedings of the National Academy of Sciences, 113(39), 10818–10823. DOI:  http://doi.org/10.1073/pnas.1605782113

Blasi, D. E., Christiansen, M. H., Wichmann, S., & Hammarström, H. (2014). Sound symbolism and the origins of language. The Evolution of Language (pp. 391–392). DOI:  http://doi.org/10.1142/9789814603638_0059

Blust, R. (2012). One mark per word? Some patterns of dissimilation in Austronesian and Australian languages. Phonology, 29(3), 355–381. DOI:  http://doi.org/10.1017/S0952675712000206

Breiss, C. (2020). Constraint cumulativity in phonotactics: Evidence from artificial grammar learning studies. Phonology, 37(4), 1–26. DOI:  http://doi.org/10.1017/S0952675720000275

Breiss, C., & Albright, A. (2020). Cumulative markedness effects and (non-)linearity in phonotactics. Unpublished manuscript. UCLA and MIT.

Bremner, A. J., Caparos, S., Davidoff, J., de Fockert, J., Linnell, K. J., & Spence, C. (2013). “Bouba” and “Kiki” in Namibia? A remote culture make similar shape-sound matches, but different shape-taste matches to Westerners. Cognition, 126, 165–172. DOI:  http://doi.org/10.1016/j.cognition.2012.09.007

Bürkner, P.-C. (2017). brms: An R Package for Bayesian Multilevel Models using Stan. Journal of Statistical Software, 80(1), 1–28. DOI:  http://doi.org/10.18637/jss.v080.i01

Cabrera, J. C. M. (2012). The role of sound symbolism in protolanguage: Some linguistic and archaeological speculations. Theoria et Historia Scientiarum, 9, 115–130. DOI:  http://doi.org/10.12775/v10235-011-0007-0

Chodroff, E., & Wilson, C. (2014). Burst spectrum as a cue for the stop voicing contrast in American English. Journal of the Acoustical Society of America, 136(5), 2762–2772. DOI:  http://doi.org/10.1121/1.4896470

Coleman, J., & Pierrehumbert, J. (1997). Stochastic phonological grammars and acceptability. In Computational phonology: Third meeting of the ACL special interest group in computational phonology (pp. 49–56). Somerset: Association for Computational Linguistics.

Cuskley, C., & Kirby, S. (2013). Synesthesia, cross-modality, and language evolution. In J. Simner & E. Hubbard (Eds.), Oxford handbook of synesthesia. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oxfordhb/9780199603329.013.0043

Cuskley, C., Simner, J., & Kirby, S. (2017). Phonological and orthographic influences in the bouba-kiki effect. Psychological Research, 81(1), 119–130. DOI:  http://doi.org/10.1007/s00426-015-0709-2

Diffloth, G. (1994). i: big, a: small. In L. Hinton, J. Nichols, & J. Ohala (Eds.), Sound symbolism (pp. 107–114). Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511751806.008

Dingemanse, M. (2018). Redrawing the margins of language: Lessons from research on ideophones. Glossa, 3(1), 4. DOI:  http://doi.org/10.5334/gjgl.444

Dingemanse, M., Blasi, D. E., Lupyan, G., Christiansen, M. H., & Monaghan, P. (2015). Arbitrariness, iconicity and systematicity in language. Trends in Cognitive Sciences, 19(10), 603–615. DOI:  http://doi.org/10.1016/j.tics.2015.07.013

D’Onofrio, A. (2014). Phonetic detail and dimensionality in sound-shape correspondences: Refining the bouba-kiki paradigm. Language and Speech, 57(3), 367–393. DOI:  http://doi.org/10.1177/0023830913507694

Franke, M., & Roettger, T. B. (2019). Bayesian regression modeling (for factorial designs): A tutorial. Unpublished manuscript. DOI:  http://doi.org/10.31234/osf.io/cdxv3

Gigerenzer, G., & Gaissmaier, W. (2011). Heuristic decision making. Annual Review of Psychology, 62, 451–482. DOI:  http://doi.org/10.1146/annurev-psych-120709-145346

Godoy, M. C., de Souza Filho, N. S., Marques de Souza, J. G., Alves, H., & Kawahara, S. (2020). Gotta name’em all: An experimental study on the sound symbolism of Pokémon names in Brazilian Portuguese. Journal of Psycholinguistic Research, 49, 717–740. DOI:  http://doi.org/10.1007/s10936-019-09679-2

Godoy, M. C., Gomes, A. L. M., Kumagai, G., & Kawahara, S. (2020). Sound symbolism in Brazilian Portuguese Pokémon names: Evidence for cross-linguistic similarities and differences. Unpublished manuscript. Federal University of Rio Grande do Norte, Meikai University and Keio University.

Guy, G., & Boberg, C. (1994). The Obligatory Contour Principle and sociolinguistic variation. Toronto Working Papers in Linguistics.

Guy, G., & Boberg, C. (1997). Inherent variability and the obligatory contour principle. Language Variation and Change, 9, 149–164. DOI:  http://doi.org/10.1017/S095439450000185X

Haiman, J. (1980). The iconicity of grammar: Isomorphism and motivation. Language, 56(3), 515–540. DOI:  http://doi.org/10.2307/414448

Haiman, J. (1984). Natural syntax: Iconicity and erosion. Cambridge: Cambridge University Press.

Haiman, J. (2018). Ideophones and the evolution of language. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/9781107706897

Hall, N. (2012). Perceptual errors or deliberate avoidance? Types of English /r/-dissimilation. Proceedings of the Thirty-Fourth Annual Meeting of the Berkeley Linguistics Society, 133–144. DOI:  http://doi.org/10.3765/bls.v34i1.3563

Hamano, S. (2013). Hoogen-ni okeru giongo-gitaigo-no taikeiteki-kenkyuu-no igi. In K. Shinohara & R. Uno (Eds.), Chikazuku oto-to imi: Onomatope kenkyuu-no shatei (pp. 133–147). Tokyo: Hitsuzi Syobo.

Hay, J., Pierrehumbert, J., & Beckman, M. (2003). Speech perception, well-formedness, and the statistics of the lexicon. In J. Local, R. Ogden, & R. Temple (Eds.), Papers in laboratory phonology VI: Phonetic interpretation (pp. 58–74). Cambridge: Cambridge University Press.

Hayes, B. (1989). Compensatory lengthening in moraic phonology. Linguistic Inquiry, 20, 253–306.

Hayes, B. (2020). Deriving the wug-shaped curve: A criterion for assessing formal theories of linguistic variation. Unpublished manuscript. UCLA.

Hinton, L., Nichols, J., & Ohala, J. (1994). Sound symbolism. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511751806

Hockett, C. (1959). Animal “languages” and human language. Human Biology, 31, 32–39.

Imai, M., & Kita, S. (2014). The sound symbolism bootstrapping hypothesis for language acquisition and language evolution. Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1651). DOI:  http://doi.org/10.1098/rstb.2013.0298

Jäger, G. (2007). Maximum entropy models and stochastic Optimality Theory. In J. Grimshaw, J. Maling, C. Manning, J. Simpson, & A. Zaenen (Eds.), Architectures, rules and preferences: A festschrift for Joan Bresnan (pp. 467–479). Stanford: CSLI.

Jäger, G., & Rosenbach, A. (2006). The winner takes it all—almost: Cumulativity in grammatical variation. Linguistics, 44(5), 937–971. DOI:  http://doi.org/10.1515/LING.2006.031

Jang, H. (2020). How cute do I sound to you?: Gender and age effects in the use and evaluation of Korean baby-talk register, Aegyo. Language Sciences. DOI:  http://doi.org/10.1016/j.langsci.2020.101289

Jespersen, O. (1922). Symbolic value of the vowel i. In Linguistica: Selected papers in English, French and German, 1, 283–303. Copenhagen: Levin and Munksgaard.

Johansson, N. E. (2017). Tracking linguistic primitives: The phonosemantic realization of fundamental oppositional pairs. In Dimensions of iconicity. Iconicity in language and literature 15 (pp. 39–62). John Benjamins. DOI:  http://doi.org/10.1075/ill.15.03joh

Johansson, N. E., & Zlatev, J. (2013). Motivations for sound symbolism in spatial deixis: A typological study of 101 languages. The Public Journal of Semiotics, 5(1), 1–20. DOI:  http://doi.org/10.37693/pjos.2013.5.9668

Johansson, N. E., Anikin, A., Carling, G., & Holmer, A. (2020). The typology of sound symbolism: Defining macro-concepts via their semantic and phonetic features. Linguistic Typology, 24(2), 253–310. DOI:  http://doi.org/10.1515/lingty-2020-2034

Kawahara, S. (2020a). Cumulative effects in sound symbolism. Unpublished manuscript. Keio University.

Kawahara, S. (2020b). Sound symbolism and theoretical phonology. Language and Linguistics Compass, 14(8), e12372. DOI:  http://doi.org/10.1111/lnc3.12372

Kawahara, S. (2020c). Teaching and learning guide for “Sound symbolism and theoretical phonology”. Language and Linguistics Compass, 14(8), e12376. DOI:  http://doi.org/10.1111/lnc3.12376

Kawahara, S. (2020d). A wug-shaped curve in sound symbolism: The case of Japanese Pokémon names. Phonology, 37(3). DOI:  http://doi.org/10.1017/S0952675720000202

Kawahara, S., Godoy, M. C., & Kumagai, G. (2020). Do sibilants fly? Evidence from a sound symbolic pattern in Pokémon names. Open Linguistics, 6(1), 386–400. DOI:  http://doi.org/10.1515/opli-2020-0027

Kawahara, S., Katsuda, H., & Kumagai, G. (2019). Accounting for the stochastic nature of sound symbolism using Maximum Entropy model. Open Linguistics, 5, 109–120. DOI:  http://doi.org/10.1515/opli-2019-0007

Kawahara, S., & Kumagai, G. (2019). Expressing evolution in Pokémon names: Experimental explorations. Journal of Japanese Linguistics, 35(1), 3–38. DOI:  http://doi.org/10.1515/jjl-2019-2002

Kawahara, S., & Kumagai, G. (in press). What voiced obstruents symbolically represent in Japanese: Evidence from the Pokémon universe. Journal of Japanese Linguistics, 37(1).

Kawahara, S., Kumagai, G., & Godoy, M. C. (2020). English speakers can infer Pokémon types based on sound symbolism. Unpublished manuscript. Keio University, Meikai University and Federal University of Rio Grande do Norte.

Kawahara, S., & Moore, J. (2021). How to express evolution in English Pokémon names. Linguistics.

Kawahara, S., Noto, A., & Kumagai, G. (2018). Sound symbolic patterns in Pokémon names. Phonetica, 75(3), 219–244. DOI:  http://doi.org/10.1159/000484938

Kawahara, S., & Shinohara, K. (2012). A tripartite trans-modal relationship between sounds, shapes and emotions: A case of abrupt modulation. Proceedings of CogSci, 2012, 569–574.

Kawahara, S., Suzuki, M., & Kumagai, G. (2020). Sound symbolic patterns in Pokémon move names in Japanese. ICU Working Papers in Linguistics 10. Festschrift for Prof. Junko Hibiya on the occasion of her retirement from ICU, 17–30.

Kingston, J., & Diehl, R. (1994). Phonetic knowledge. Language, 70, 419–454. DOI:  http://doi.org/10.1353/lan.1994.0023

Kingston, J., & Diehl, R. (1995). Intermediate properties in the perception of distinctive feature values. In B. Connell & A. Arvaniti (Eds.), Papers in laboratory phonology IV: Phonology and phonetic evidence (pp. 7–27). Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511554315.002

Kingston, J., Diehl, R., Kirk, C., & Castleman, W. (2008). On the internal perceptual structure of distinctive features: The [voice] contrast. Journal of Phonetics, 36, 28–54. DOI:  http://doi.org/10.1016/j.wocn.2007.02.001

Klink, R. R. (2000). Creating brand names with meaning: The use of sound symbolism. Marketing Letters, 11(1), 5–20. DOI:  http://doi.org/10.1023/A:1008184423824

Klink, R. R. (2009). Gender differences in new brand name response. Marketing Letters, 20, 313–326. DOI:  http://doi.org/10.1007/s11002-008-9066-x

Köhler, W. (1929). Gestalt psychology. New York: Liveright.

Konstantina, M., & Friedemann, P. (2020). Action sound–shape congruencies explain sound symbolism. Scientific Reports, 10(1). DOI:  http://doi.org/10.1038/s41598-020-69528-4

Kroch, A. (1989). Reflexes of grammar in patterns of language change. Language Variation and Change, 1(2), 199–244. DOI:  http://doi.org/10.1017/S0954394500000168

Kruschke, J. K., & Liddell, T. M. (2018). The Bayesian new statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective. Psychonomic Bulletin & Review, 25, 178–206. DOI:  http://doi.org/10.3758/s13423-016-1221-4

Kumagai, G. (2019). A sound-symbolic alternation to express cuteness and the orthographic Lyman’s Law in Japanese. Journal of Japanese Linguistics, 35(1), 39–74. DOI:  http://doi.org/10.1515/jjl-2019-2004

Kumagai, G., & Kawahara, S. (2019). Effects of vowels and voiced obstruents on Pokémon names: Experimental and theoretical approaches [in Japanese]. Journal of the Linguistic Society of Japan, 155, 65–99.

Legendre, G., Miyata, Y., & Smolensky, P. (1990). Harmonic grammar – a formal multilevel connectionist theory of linguistic well-formedness: Theoretical foundations. In Proceedings of the Twelfth Annual Conference of the Cognitive Science Society (pp. 388–395). Mahwah, NJ: Lawrence Erlbaum Associates.

Locke, J. (1689). An essay concerning human understanding. London: MDCC. DOI:  http://doi.org/10.1093/oseo/instance.00018020

Lockwood, G., & Dingemanse, M. (2015). Iconicity in the lab: A review of behavioral, developmental, and neuroimaging research into sound-symbolism. Frontiers in Psychology. DOI:  http://doi.org/10.3389/fpsyg.2015.01246

Lupyan, G., & Winter, B. (2018). Language is more abstract than you think, or, why aren’t languages more iconic? Philosophical Transactions of the Royal Society B, 373, 20170137. DOI:  http://doi.org/10.1098/rstb.2017.0137

MacKenzie, L. (2018). What’s in a name? Teaching linguistics using onomastic data. Language [Teaching Linguistics], 94, e1–e18. DOI:  http://doi.org/10.1353/lan.2018.0068

Makowski, D., Ben-Shachar, M. S., & Lüdecke, D. (2019). bayestestR: Describing effects and their uncertainty, existence and significance within the Bayesian framework. Journal of Open Source Software, 4(40), 1541. DOI:  http://doi.org/10.21105/joss.01541

Marks, L. (1978). The unity of the senses: Interrelations among the modalities. New York: Academic Press. DOI:  http://doi.org/10.1016/B978-0-12-472960-5.50011-1

Martin, S. (1962). Phonetic symbolism in Korean. In N. Poppe (Ed.), American studies in Uralic and Altaic linguistics. Indiana University Press.

Massaro, D., & Cohen, M. (1983). Phonological context in speech perception. Perception & Psychophysics, 34, 338–348. DOI:  http://doi.org/10.3758/BF03203046

Maurer, D., Pathman, T., & Mondloch, C. J. (2006). The shape of boubas: Sound-shape correspondences in toddlers and adults. Developmental Science, 9, 316–322. DOI:  http://doi.org/10.1111/j.1467-7687.2006.00495.x

McCarthy, J. J. (1983). Phonological features and morphological structure. In J. Richardson, M. Marks, & A. Chukerman (Eds.), Proceedings from the parasession on the interplay of phonology, morphology and syntax (pp. 135–161). Chicago: CLS.

McPherson, L., & Hayes, B. (2016). Relating application frequency to morphological structure: The case of Tommo So vowel harmony. Phonology, 33, 125–167. DOI:  http://doi.org/10.1017/S0952675716000051

Newman, S. (1933). Further experiments on phonetic symbolism. American Journal of Psychology, 45, 53–75. DOI:  http://doi.org/10.2307/1414186

Nielsen, A. K. S., & Dingemanse, M. (2020). Iconicity in word learning and beyond: A critical review. Language and Speech. DOI:  http://doi.org/10.1177/0023830920914339

Nielsen, A. K. S., & Rendall, D. (2013). Parsing the role of consonants versus vowels in the classic takete-maluma phenomenon. Canadian Journal of Experimental Psychology, 67(2), 153–163. DOI:  http://doi.org/10.1037/a0030553

Nuckolls, J. B. (1999). The case for sound symbolism. Annual Review of Anthropology, 28, 225–252. DOI:  http://doi.org/10.1146/annurev.anthro.28.1.225

Nygaard, L. C., Cook, A. E., & Namy, L. L. (2009). Sound to meaning correspondence facilitates word learning. Cognition, 112, 181–186. DOI:  http://doi.org/10.1016/j.cognition.2009.04.001

Ohala, J. (1983a). The origin of sound patterns in vocal tract constraints. In P. MacNeilage (Ed.), The production of speech (pp. 189–216). New York: Springer-Verlag. DOI:  http://doi.org/10.1007/978-1-4613-8202-7_9

Ohala, J. (1983b). The phonological end justifies any means. In S. Hattori & K. Inoue (Eds.), Proceedings of the 13th International Congress of Linguists (pp. 232–243). Tokyo: Sanseido.

Ohala, J. (1994). The frequency code underlies the sound symbolic use of voice pitch. In L. Hinton, J. Nichols, & J. Ohala (Eds.), Sound symbolism (pp. 325–347). Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511751806.022

Perlman, M., & Lupyan, G. (2018). The potential for iconicity in vocalization. Scientific Reports, 8(1). DOI:  http://doi.org/10.1038/s41598-018-20961-6

Perniss, P., Thompson, R. L., & Vigliocco, G. (2010). Iconicity as a general property of language: Evidence from spoken and signed languages. Frontiers in Psychology. DOI:  http://doi.org/10.3389/fpsyg.2010.00227

Perniss, P., & Vigliocco, G. (2014). The bridge of iconicity: From a world of experience to the experience of language. Philosophical Transactions of the Royal Society B, 369, 20130300. DOI:  http://doi.org/10.1098/rstb.2013.0300

Perry, L. K., Perlman, M., Winter, B., Massaro, D. W., & Lupyan, G. (2018). Iconicity in the speech of children and adults. Developmental Science, 21(3), e12572. DOI:  http://doi.org/10.1111/desc.12572

Pitcher, B. J., Mesoudi, A., & McElligott, A. G. (2013). Sex-based sound symbolism in English-language first names. PLOS ONE, 8(6), e64825. DOI:  http://doi.org/10.1371/journal.pone.0064825

Prince, A., & Smolensky, P. (1993/2004). Optimality Theory: Constraint interaction in generative grammar. Malden and Oxford: Blackwell.

Proctor, M. I., Shadle, C. H., & Iskarous, K. (2010). Pharyngeal articulation differences in voiced and voiceless fricatives. Journal of the Acoustical Society of America, 127(3), 1507–1518. DOI:  http://doi.org/10.1121/1.3299199

Ramachandran, V. S., & Hubbard, E. M. (2001). Synesthesia–a window into perception, thought, and language. Journal of Consciousness Studies, 8(12), 3–34.

Roettger, T. B. (2019). Researcher degrees of freedom in phonetic research. Laboratory Phonology, 10. DOI:  http://doi.org/10.5334/labphon.147

Rogers, S. K., & Ross, A. S. (1975). A cross-cultural test of the maluma-takete phenomenon. Perception, 4, 105–106. DOI:  http://doi.org/10.1068/p040105

Rose, S., & King, L. (2007). Speech error elicitation and co-occurrence restrictions in two Ethiopian Semitic languages. Language and Speech, 50(4), 451–504. DOI:  http://doi.org/10.1177/00238309070500040101

Ryan, K. (2016). Phonological weight. Language and Linguistics Compass, 10(12), 720–733. DOI:  http://doi.org/10.1111/lnc3.12229

Sapir, E. (1929). A study in phonetic symbolism. Journal of Experimental Psychology, 12, 225–239. DOI:  http://doi.org/10.1037/h0070931

Saussure, F. (1916). Cours de linguistique générale. Paris: Payot.

Sawada, O. (2013). The meanings of diminutive shifts in Japanese. Proceedings of North East Linguistic Society, 42, 163–176.

Schmidtke, D. S., Conrad, M., & Jacobs, A. M. (2014). Phonological iconicity. Frontiers in Psychology, 5(80). DOI:  http://doi.org/10.3389/fpsyg.2014.00080

Shih, S. S. (2020). Gradient categories in lexically-conditioned phonology: An example from sound symbolism. Proceedings of the 2019 Annual Meeting on Phonology. DOI:  http://doi.org/10.3765/amp.v8i0.4689

Shih, S. S., Ackerman, J., Hermalin, N., Inkelas, S., Jang, H., Johnson, J., … Yu, A. (2019). Cross-linguistic and language-specific sound symbolism: Pokémonastics. Unpublished manuscript. University of Southern California, University of California, Merced, University of California, Berkeley, Keio University, National University of Singapore and University of Chicago.

Shinohara, K., & Kawahara, S. (2016). A cross-linguistic study of sound symbolism: The images of size. In Proceedings of the Thirty Sixth Annual Meeting of the Berkeley Linguistics Society (pp. 396–410). Berkeley: Berkeley Linguistics Society. DOI:  http://doi.org/10.3765/bls.v36i1.3926

Shinohara, K., Kawahara, S., & Tanaka, H. (2020). Visual and proprioceptive perceptions evoke motion-sound symbolism: Different acceleration profiles are associated with different types of consonants. Frontiers in Psychology. DOI:  http://doi.org/10.3389/fpsyg.2020.589797

Shinohara, K., Yamauchi, N., Kawahara, S., & Tanaka, H. (2016). Takete and maluma in action: A cross-modal relationship between gestures and sounds. PLOS ONE. DOI:  http://doi.org/10.1371/journal.pone.0163525

Sidhu, D. M., & Pexman, P. M. (2018). Five mechanisms of sound symbolic association. Psychonomic Bulletin & Review, 25(5), 1619–1643. DOI:  http://doi.org/10.3758/s13423-017-1361-1

Sidhu, D. M., & Pexman, P. M. (2019). The sound symbolism of names. Current Directions in Psychological Science, 28(4), 398–402. DOI:  http://doi.org/10.1177/0963721419850134

Sidhu, D. M., Deschamps, K., Bourdage, J. S., & Pexman, P. M. (2019). Does the name say it all? Investigating phoneme-personality sound symbolism in first names. Journal of Experimental Psychology: General, 148(9), 1595–1614. DOI:  http://doi.org/10.1037/xge0000662

Sidhu, D. M., Pexman, P. M., & Saint-Aubin, J. (2016). From the Bob-Kirk effect to the Benoit-Éric effect: Testing the mechanism of name sound symbolism in two languages. Acta Psychologica, 169, 88–99. DOI:  http://doi.org/10.1016/j.actpsy.2016.05.011

Smith, B. W., & Pater, J. (2020). French schwa and gradient cumulativity. Glossa, 5(1), 24. DOI:  http://doi.org/10.5334/gjgl.583

Spence, C. (2011). Crossmodal correspondences: A tutorial review. Attention, Perception & Psychophysics, 73(4), 971–995. DOI:  http://doi.org/10.3758/s13414-010-0073-7

Steriade, D. (2008). Resyllabification in the quantitative meters of Ancient Greek: Evidence for an Interval Theory of Weight. Unpublished manuscript. MIT.

Stevens, K., & Blumstein, S. (1981). The search for invariant acoustic correlates of phonetic features. In P. Eimas & J. D. Miller (Eds.), Perspectives on the study of speech (pp. 1–38). New Jersey: Erlbaum.

Styles, S. J., & Gawne, L. (2017). When does maluma/takete fail? Two key failures and a meta-analysis suggest that phonology and phonotactics matter. i-Perception, 8(4), 1–17. DOI:  http://doi.org/10.1177/2041669517724807

Svantesson, J.-O. (2017). Sound symbolism: The role of word sound in meaning. WIREs Cognitive Science, 8, e01441. DOI:  http://doi.org/10.1002/wcs.1441

Thompson, P. D., & Estes, Z. (2011). Sound symbolic naming of novel objects is a graded function. Quarterly Journal of Experimental Psychology, 64(12), 2392–2404. DOI:  http://doi.org/10.1080/17470218.2011.605898

Ultan, R. (1978). Size-sound symbolism. In J. Greenberg (Ed.), Universals of human language II: Phonology (pp. 525–568). Stanford: Stanford University Press.

Uno, R., Shinohara, K., Hosokawa, Y., Atsumi, N., Kumagai, G., & Kawahara, S. (2020). What’s in a villain’s name? Sound symbolic values of voiced obstruents and bilabial consonants. Review of Cognitive Linguistics, 18(2), 428–457. DOI:  http://doi.org/10.1075/rcl.00066.uno

Vasishth, S., Nicenboim, B., Beckman, M., Li, F., & Jong Kong, E. (2018). Bayesian data analysis in the phonetic sciences: A tutorial introduction. Journal of Phonetics, 71, 147–161. DOI:  http://doi.org/10.1016/j.wocn.2018.07.008

Vennemann, T. (1988). Preference laws for syllable structure and the explanation of sound change: With special reference to German, Germanic, Italian, and Latin. Berlin: Mouton de Gruyter. DOI:  http://doi.org/10.1515/9783110849608

Westbury, C. (2005). Implicit sound symbolism in lexical access: Evidence from an interference task. Brain and Language, 93, 10–19. DOI:  http://doi.org/10.1016/j.bandl.2004.07.006

Westbury, C., Hollis, G., Sidhu, D. M., & Pexman, P. M. (2018). Weighing up the evidence for sound symbolism: Distributional properties predict cue strength. Journal of Memory and Language, 99, 122–150. DOI:  http://doi.org/10.1016/j.jml.2017.09.006

Westbury, J. R. (1983). Enlargement of the supraglottal cavity and its relation to stop consonant voicing. Journal of the Acoustical Society of America, 73, 1322–1336. DOI:  http://doi.org/10.1121/1.389236

Whitney, W. D. (1867). Language, and the study of language. Twelve lectures on the principles of linguistic science. C. Scribner & Company.

Wichmann, S., Holman, E. W., & Brown, C. H. (2010). Sound symbolism in basic vocabulary. Entropy, 12(4), 844–858. DOI:  http://doi.org/10.3390/e12040844

Winter, B., Pérez-Sobrino, P., & Lucien, B. (2019). The sound of soft alcohol: Crossmodal associations between interjections and liquor. PLOS ONE. DOI:  http://doi.org/10.1371/journal.pone.0220449

Yorkston, E., & Menon, G. (2004). A sound idea: Phonetic effects of brand names on consumer judgments. Journal of Consumer Research, 31, 43–51. DOI:  http://doi.org/10.1086/383422

Zimmermann, R. (2017). Formal and quantitative approaches to the study of syntactic change: Three case studies from the history of English (Doctoral Dissertation). University of Geneva.

Zuraw, K., & Hayes, B. (2017). Intersecting constraint families: An argument for Harmonic Grammar. Language, 93, 497–548. DOI:  http://doi.org/10.1353/lan.2017.0035