1 Introduction

The Korean language has a rich inventory of ‘ideophones,’ which are vivid sensory words that depict sounds (e.g., tʰəmpəŋ1 ‘a plopping sound’), movements (hɨntɨl ‘a motion of swaying’), textures (c’ontɨk ‘sticky’), inner feelings (təlkʰək ‘mental state of being shocked’), and so on (Dingemanse, 2012; Dingemanse et al., 2015). Being inherently expressive, ideophones provide the best locus to study iconicity (i.e., a perceptual resemblance between form and meaning) in spoken language (Dingemanse et al., 2016; Perry et al., 2015). Recent empirical support for the iconicity of ideophones is found in Dingemanse et al.’s (2016) study, where native Dutch speakers matched the meanings of existing ideophones (from five ideophone-rich languages that they did not speak) above the chance level of 50%. Cross-linguistically, such (canonically) iconic words often differentiate themselves from ordinary vocabulary by means of skewed phonotactic distributions (Childs, 2014; Dingemanse, 2012; Dingemanse et al., 2015). For instance, ideophones in Hausa (Afro-Asiatic) use obstruents in word-final position (e.g., túkúf ‘very old,’ tsit ‘in complete silence’) while most native Hausa words are either vowel- or sonorant-final (Newman, 2001, p. 252). Ideophones in Kisi (Niger-Congo) feature vowel harmony, which does not occur in any other lexical class in the language (Childs, 1988). Korean ideophones exhibit stem-internal vowel harmony, which does not occur in the prosaic lexicon.

This paper focuses on the Korean ideophonic vowel-harmony system, which contains so-called ‘dark’ and ‘light’ vowels (see Section 2 for details of this language-specific vowel distinction) and restricts the co-occurrence of those two vowel-harmony types within a morpheme (Cho, 1994; J.-S. Lee, 1992; H.-M. Sohn, 1999, among others). Within the ideophonic harmony system, legitimate violations of the harmony rule occur only in the presence of ‘neutral’ vowels: some dark vowels act as harmony-neutral in non-initial syllables and freely allow either dark or light vowels in the preceding syllable. Forms that otherwise combine dark and light vowels within a morpheme are considered apparent violations of the system.

Taking into account the harmony patterns that can occur in the ideophonic lexicon, this paper examines the connections between the following phenomena: a set of vowel patterns classified (phonologically) as harmonic, neutral, and disharmonic; a set of ideophones classified (semantically) as onomatopoeic vs. cross-modal;2 and a set of form-meaning mappings classified (semiotically) as higher vs. lower in iconicity. Specifically, using a written corpus of Korean ideophonic stems, this paper quantitatively tests the hypothesis that onomatopoeic ideophones show greater diversity in harmony patterns. Because they are bound to actual sounds, they would take whatever phonological and phonotactic liberties they need. This fits cross-linguistic observations that, among ideophones, those with onomatopoeic meanings tend to show the most diversity in phonology and phonotactics (Akita et al., 2013; Akita, 2013; Childs, 1994). In contrast, cross-modal ideophones, which are not directly tied to sound, would conform to stricter vowel harmony, which is considered unmarked within the ideophone inventory.

Regarding the formulation of the hypotheses, it is not necessarily the case that all cross-modal ideophones are less iconic than all onomatopoeic ideophones. The present study resolves this issue by empirically establishing degrees of iconicity through native speakers’ rating judgments on a randomly selected subset of the ideophones (see Section 3 for further details).

The paper is organized as follows. Section 2 describes the vowel harmony system in Korean ideophones, and Section 3 provides a brief introduction to the semantic sub-categories of Korean ideophones and examines their associated iconicity levels on an empirical basis. Section 4 describes the corpus of Korean ideophonic stems used in the current paper, and Section 5 reports the relative proportions of onomatopoeic and cross-modal ideophones in their associations with neutral forms containing neutral /i, ɨ/ and partially neutral /u/ in non-initial syllables. The phonosemantic analysis expands to disharmonic forms containing non-neutral /a/ and harmonic forms. Section 6 discusses the results and Section 7 summarizes the paper.

2 Overview of ideophonic vowel harmony in Korean

In Middle Korean (15th–16th century), vowel harmony was active and regular throughout the entire vocabulary. The co-occurrence of the class of dark vowels (including /ɨ, u, ə/) with that of light vowels (including /o, a, ɔ/)3 was strictly prohibited both stem-internally and -externally. The regular harmonic system, however, underwent disruption due to both a number of borrowings from Chinese, which had no harmonic system, and a historic vowel shift (Kim-Renaud, 1976, p. 397; Larsen & Heinz, 2012). Since that change, strict vowel harmony has largely disappeared from Modern Korean, leaving its trace in only a few limited cases, namely, verbal suffix harmony and ideophonic harmony (Larsen & Heinz, 2012).4 Of those, the current paper focuses only on ideophonic harmony pertaining to monophthongs, which regularly make up the dark-light harmony classes (Larsen & Heinz, 2012); verbal suffix harmony is not discussed, since it does not predict any potential connection with iconicity.

The ideophonic harmony system in Modern Korean comprises light vowels, consisting of /ɛ, (ø),5 a, o/, and dark vowels, consisting of /i, e, (y), ɨ, ə, u/. Several previous studies have tried to account for the harmonic groupings in Korean ideophones using various features, such as [±low] (K.-O. Kim, 1977; McCarthy, 1983; H.-S. Sohn, 1986), and [±Advanced Tongue Root] or [±Retracted Tongue Root] (Cho, 1994; J.-K. Kim, 2000; J.-S. Lee, 1992; M. Lee, 2001; Y. Lee, 1993). However, none of the proposed distinguishing features is supported by the formant frequencies of Korean monophthongs (see the detailed formant chart for F1 and F2 values of Standard Korean monophthongs spoken by female speakers in Larsen & Heinz, 2012, p. 437; see also Kwon, to appear). This indicates that the dark and light vowel sets in Modern Korean are not natural classes that can be distinguished by any widely accepted universal distinctive feature. Reflecting this fact, this paper adopts the traditional semantic terms dark and light to refer to the two harmony classes of vowels.

These semantic terms are in fact useful for describing the connotation that each vowel class carries in the ideophonic lexicon. Korean ideophones display systematic vowel alternations by associating the light vowels with a diminutive connotation (such as lightness, smallness, and fastness) and the dark vowels with an augmentative connotation (such as heaviness, largeness, and slowness) (Cho, 1994; Finley, 2006; J.-K. Kim, 2000; Y.S. Kim, 1984; Kim-Renaud, 1976; M. Lee, 2001; McCarthy, 1983; H.-M. Sohn, 1999). Alternations occur vertically, involving a change in the high/low feature, and also diagonally, involving a change in the frontness/backness feature (K.-O. Kim, 1977; Y. Lee, 1993). This results in seven possible alternating patterns, as in (1).6

    (1) Vowel alternating patterns

Tellingly, the vowel alternations, which create a series of semantic minimal pairs, occur not only in initial syllables but also in non-initial syllables, as exemplified in (2). This is because Korean ideophones are governed by a harmony rule—that the vowels within a stem should agree with the semantic feature (Cho, 1994; H.-M. Sohn, 1999; Larsen & Heinz, 2012).

    (2) Vowel alternations in Korean ideophones
        Dark forms : Light forms
        k’əŋ.cʰuŋ : k’aŋ.cʰoŋ    ‘skipping with longer: shorter legs’
        pʰuŋ.təŋ : pʰoŋ.taŋ      ‘plopping sound of a bigger and heavier: smaller and lighter object’
        cʰi.ləŋ : cʰa.laŋ        ‘dropping of a longer: shorter object’

There are also cases where the harmony rule does not hold, namely when the dark vowels /i, ɨ/ occur in non-initial syllables, as exemplified in (3).

    (3) The neutral /i/ and /ɨ/
        Dark + Neutral : Light + Neutral
        tuŋ.kɨl : toŋ.kɨl      ‘round involving a large: small circle’
        nɨl.s’in : nal.s’in    ‘thinness of taller: tall thing’

The position-sensitive harmony-neutral status of /i, ɨ/ has diachronic and synchronic grounds. In diachronic terms, the neutrality of /i/ in non-initial syllables is attributed to the newly appeared light /ɛ/ in initial syllables (which formed a harmonic pair with the position-insensitive neutral /i/) in the late 18th century. The neutrality of /ɨ/ is attributed to a historical merger between the light /ɔ/ and its dark counterpart /ɨ/ in non-initial syllables around the middle of the 15th century (K.-M. Lee, 1961, 1972). Synchronic evidence is found in Larsen and Heinz’s (2012) corpus-based study showing that the neutral vowels /i, ɨ/ have a more or less equal distribution of dark and light vowels in the preceding syllables.

Larsen and Heinz’s study also shows that dark /u/ in non-initial syllables frequently follows light vowels: the ratio of preceding dark vowels to preceding light vowels is around 2:1 (464/266). Although /u/ occurs with light vowels proportionately less often than the traditional neutral vowels /i, ɨ/, Larsen and Heinz claimed that it patterns closely with the neutral vowels /i/ and /ɨ/ (in terms of transparency in vowel harmony) and that it is therefore at least partially neutral. In a diachronic sense, the partial neutrality of /u/ can be traced back to the raising of /o/ ~ /u/ in the late 19th century (Cho, 1994; Ko, 2012; J.-S. Lee, 1992). The varying behaviors of /u/ as a neutral, dark, or optionally neutral vowel in non-initial syllables in (4) may have been influenced by the ongoing mid-vowel raising process (e.g., hoto > hotu ‘walnut’; cato > catu ‘plum,’ Larsen & Heinz, 2012, p. 454).

    (4) Partial neutrality of /u/ in non-initial syllables
        Neutral /u/
          sil.c’uk : sɛl.c’uk             ‘a more: less sulky face’
          p’i.cuk : p’ɛ.cuk               ‘a more: less jagged shape’
        Dark /u/
          kəŋ.tuŋ : kaŋ.toŋ               ‘hopping with longer: shorter legs’
          cil.luk : cal.lok               ‘shape which is more: less tightly narrow at some point’
        Neutral and dark /u/
          məl.t’uŋ : mal.t’uŋ : mal.t’oŋ  ‘widely opening eyes’
          pə.tuŋ : pa.tuŋ : pa.toŋ        ‘winding’

Larsen and Heinz’s corpus study further found that the light vowel /a/, somewhat like the partially neutral /u/, exhibits inconsistent neutrality in non-initial syllables, as shown in (5).

    (5) Somewhat neutral behavior of /a/
        Neutral /a/
          t’uk.t’ak : t’ok.t’ak    ‘sound of light: lighter hammering’
          p’i.t’ak : p’ɛ.t’ak      ‘a more: less tilted shape’
        Light /a/
          kɨlk.cək : kalk.cak      ‘rougher: smoother motion of scrawling text’
          k’ul.t’ək : k’ol.t’ak    ‘swallowing of a larger: smaller amount of food’

However, /a/ is harmonic to a greater extent (approximately 89% of the time) than the traditional neutral vowels /i, ɨ/ and the partially neutral /u/ (Larsen & Heinz, 2012). Also, its neutrality does not have a robust historical basis. Therefore, this paper restricts neutral vowels to /i/, /ɨ/, and /u/, that is, to those which enter relatively freely into disharmonic patterns in an ideophonic word. Vowel harmony as a pattern of alternation will not be discussed further, because the focus of this paper lies on harmony patterns within individual ideophonic forms only.7
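The three-way distinction described in this section (harmonic, neutral, disharmonic) can be illustrated with a minimal Python sketch. The function names and the treatment of stems longer than two syllables are assumptions made for illustration, not the procedure used in the study; the marginal vowels /ø/ and /y/ are left out.

```python
# Vowel classes as listed in Section 2 (marginal /ø/ and /y/ are omitted here).
LIGHT = set("ɛao")
DARK = set("ieɨəu")
# Vowels treated as (partially) neutral when they occur in non-initial syllables.
NEUTRAL_NON_INITIAL = set("iɨu")


def vowels_of(stem):
    """Return the monophthongs of a transcribed stem, in order."""
    return [ch for ch in stem if ch in LIGHT or ch in DARK]


def classify(stem):
    """Label a stem 'harmonic', 'neutral', or 'disharmonic' (hypothetical helper)."""
    vowels = vowels_of(stem)
    if len(vowels) < 2:
        raise ValueError("a one-vowel stem carries no harmony information")
    classes = {"light" if v in LIGHT else "dark" for v in vowels}
    if len(classes) == 1:
        return "harmonic"
    # Assumption: set aside non-initial /i, ɨ, u/; if only light vowels remain,
    # the class mismatch is due to a neutral vowel following light vowels.
    rest = [vowels[0]] + [v for v in vowels[1:] if v not in NEUTRAL_NON_INITIAL]
    if all(v in LIGHT for v in rest):
        return "neutral"
    return "disharmonic"


for stem in ["tuŋ.kɨl", "toŋ.kɨl", "sɛl.c’uk", "t’uk.t’ak"]:
    print(stem, classify(stem))
```

Under these assumptions, a dark vowel followed by non-initial /a/, as in t’uk.t’ak, is labeled disharmonic, in line with the decision above not to treat /a/ as neutral.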

3 Degrees of iconicity: An iconicity rating experiment

Traditionally, Korean ideophones have been classified into two semantic sub-categories: ɨjsəŋə ‘onomatopoeic ideophones’ (words imitative of auditory experiences, e.g., tʰəm.pəŋ ‘a plopping sound’) and ɨjtɛə ‘cross-modal ideophones’ (words imitative of non-auditory experiences, such as visual/tactile sensations or mental states, e.g., te.k’ək ‘a state of working easily and unfalteringly’) (J.-S. Lee, 1992, p. 98; H.-M. Sohn, 1999).8 However, the practicality of this traditional semantic distinction is in question, because a number of Korean ideophones express multi-sensory experiences whose meanings straddle the semantic boundaries of the two categories. For example, tekul-tekul conveys both onomatopoeic (‘sound of an object rolling’) and cross-modal (‘the manner of an object rolling’) meanings (Garrigues, 1995, p. 362).

Leaving aside the difficulty of creating a binary semantic classification of ideophones (which will be discussed in empirical terms in Section 3.2), intuitively speaking, ɨjsəŋə ‘depiction of sound’ seems to be more (transparently) iconic than ɨjtɛə ‘depiction of visual/tactile information or of mental states,’ because its form-meaning associations occur within the same modality. Empirical support for such intuitive claims is found in Dingemanse et al.’s (2016) behavioral experiment, in which Dutch listeners showed better rates of correct guessing of meaning for onomatopoeic ideophones than for cross-modal ideophones, when given words from five languages (including Korean) that they did not speak.

To further verify the iconicity hierarchy linked to the onomatopoeic/cross-modal distinction, the current research directly measured iconicity levels in Korean ideophones. Specifically, it used ratings from native Korean speakers on a paper-based questionnaire. The rating task partly replicates work by Perry et al. (2015), who tested the iconicity of English and Spanish words in different lexical categories, such as onomatopoeia, nouns, and adjectives (see also Vinson et al., 2008 for iconicity rating in sign language).

3.1 Participants

Thirty native Korean speakers were recruited in Seoul through the author’s personal contacts. Their participation was on a voluntary basis without pay.

3.2 Materials and procedure

Participants subjectively rated the iconicity of 170 randomly selected ideophones (onomatopoeic only: 17; cross-modal only: 120; both onomatopoeic and cross-modal: 33), which make up approximately 10% of the main data for analysis (onomatopoeic: 178; cross-modal: 1,262; both: 335). The dataset from which this sample was taken contains North Korean (29.97%; 532/1,775) as well as South Korean dialect forms (70.03%; 1,243/1,775). Since all of the participants were speakers of the South Korean dialect, the ideophones for the questionnaire were randomly selected from the South Korean dialect only.

Three randomized versions of the questionnaire, in the form of a Microsoft Excel spreadsheet, were randomly distributed to the participants via email. In the rating task, participants were asked to look at the words in written form and to say them aloud before making their rating (on a scale from 1 to 7, where 1 indicates that a word is not at all iconic and 7 indicates that a word is highly iconic). The instructions to the participants included a careful definition of iconicity (see Appendix A for the instructions). The estimated time for completing the questionnaire was 30 minutes or less.

3.3 Results

For the analysis of the rating results, a linear mixed-effects model was run, with the lme4 (Bates et al., 2015) and lmerTest (Kuznetsova et al., 2016) packages in R (R Core Team, 2017). The dependent variable was the iconicity rating. The independent variable, implemented as a fixed effect, was the Semantic Type (three levels: Onomatopoeic, both onomatopoeic and cross-modal, and cross-modal). Participant and item served as random effects; the by-participant effects included the random intercept and the random slope for the semantic type. The by-item effects included the random intercept. The estimated means for the three semantic types of ideophones (with examples of stimulus words), which can generalize over participants and items, are shown in Table 1 below.
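One way to write out the model described above is given here as a sketch. The symbols and the choice of the onomatopoeic type as the reference level are assumptions introduced for illustration; the source specifies only the fixed effect and the random-effects structure. Here y_{pi} is the rating given by participant p to item i, B_i and C_i are indicator variables for the ‘both’ and cross-modal types, u_{0p}, u_{1p}, u_{2p} are the by-participant random intercept and slopes, w_{0i} is the by-item random intercept, and ε_{pi} is the residual.

```latex
y_{pi} = \beta_0 + \beta_1 B_i + \beta_2 C_i
       + u_{0p} + u_{1p} B_i + u_{2p} C_i
       + w_{0i} + \varepsilon_{pi}
```

In lme4 formula notation this structure would correspond to something like `rating ~ type + (1 + type | participant) + (1 | item)`, where the variable names are again hypothetical.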

Table 1

Estimated means on a scale of 1 (‘not iconic at all’) to 7 (‘highly iconic’) for three types of ideophones, with example words.

Word            Meaning                                        Semantic type   Estimate   SE
p’iak-p’iak     ‘peep’                                         Onomatopoeic    5.57       0.31
pʰətək-pʰətək   ‘flapping of the wings or the sound thereof’   Both            4.77       0.27
muŋge-muŋge     ‘in thick clouds’                              Cross-modal     4.31       0.26

After the model fitting, a pair-wise comparison of different semantic types of ideophones was performed. The results indicate that the participants rated onomatopoeic ideophones as more iconic than ideophones of both onomatopoeic and cross-modal meanings (t(169.1) = 2.67, p = .008**), and cross-modal ideophones (t(90.1) = 3.49, p < .001***), at a statistically significant level. There was no significant difference in the iconicity ratings between ideophones of both onomatopoeic and cross-modal meaning, and cross-modal ideophones (t(59.5) = 1.36, p = .179, n.s.).

These results lend some support to the intuitive claim that onomatopoeic ideophones have stronger iconicity than cross-modal ideophones, by suggesting the following iconicity rank order: Onomatopoeic > both onomatopoeic and cross-modal = cross-modal. Given the inferred iconicity rank order, I combined those ideophones of both onomatopoeic and cross-modal meanings, and cross-modal ideophones into a single cross-modal type, against onomatopoeic ideophones. Then, I tested the following hypotheses:

  1. That vowel harmony governs the ideophonic lexicon. Therefore, both onomatopoeic and cross-modal ideophones would be mostly harmonic or neutral.
  2. Yet, since onomatopoeic ideophones are bound to actual sounds, they would be skewed toward a larger proportion of disharmonic forms.
  3. Conversely, cross-modal ideophones would conform to stricter vowel harmony patterns and would disfavor disharmonic forms.

4 Methodology

4.1 The written corpus containing a list of Korean ideophonic stems

This study used the same corpus as Larsen and Heinz’s (2012) study. Although a detailed description of the corpus is found in Larsen and Heinz’s paper (pp. 441–443), I address the main characteristics of the corpus that are relevant to the current study.

The corpus, which is reported to contain 29,015 Korean ideophones, was developed during the compilation of pʰjocunkukətɛsacən ‘the Great Dictionary of Standard Korean’ (The National Institute of the Korean Language, 2001).9 A few concerns can be raised about the accuracy of the corpus, for the following reasons (as the distributor has admitted): (a) Some words in the corpus may not possess sound-symbolic meanings (i.e., no perceptual sensory meanings), although they exhibit a pattern of sound alternations as do ideophones; and (b) some may have no entries in the Great Dictionary of Standard Korean. Despite these concerns, I do not question the accuracy of the corpus, since I found that words falling under (a) and (b) occupy only 1.55% (32/2,062) of the underlying data for the current study (955 harmonic forms, 749 neutral forms containing /i/ or /ɨ/, 260 partially neutral forms containing /u/, and 98 disharmonic forms containing /a/; the details of the list are found in Section 5).10

Another issue arises when considering the fact that many Korean ideophonic stems are combined with a verb, hata ‘do, be,’ or verbal suffix, kəlita or tɛta ‘keep doing’ (e.g., t’ak’ɨm-hata ‘be painful’ and t’ak’ɨm-kəlita ‘keep being painful’) (H.-M. Sohn, 1999, p. 101). Relating to this, the corpus contains multiple variants of a single underlying ideophonic stem. For example, it lists four variants—katuŋ-katuŋ, katuŋ-katuŋ-hata, katuŋ-kəlita, katuŋ-tɛta—built on a single stem, katuŋ ‘swaying one’s hips.’ Among the variants, I extracted only the reduplicated forms to minimize confusion about whether or not the selected items were ideophonic.

Reduplication, which is used to express different degrees of iteration, distribution, or intensity of the events depicted, is cross-linguistically common in ideophones (Dingemanse, 2015; Kwon, 2017). In fact, most Korean ideophones appear in fully reduplicated forms, such as pʰuŋtəŋ-pʰuŋtəŋ ‘with splash after splash,’ piŋ-piŋ ‘round and round (repeated spinning, whirling, turning),’ and pʰəllək-pʰəllək ‘fluttering continuously.’11 I therefore considered reduplicated forms as representative variants of each ideophonic stem.12 The commonly occurring reduplicated types in the Korean ideophonic lexicon are exemplified in (6) below (most of the examples in [6] are adapted from Larsen & Heinz, 2012, p. 438).

    (6) Reduplication in Korean ideophones
            Stem syllable length   Ideophone            Meaning
        a.  1                      sal-sal              ‘gently, softly, slowly’
        b.  2                      culəŋ-culəŋ          ‘in clusters’ (e.g., grapes)
        c.  2                      als’oŋ-tals’oŋ       ‘jumbled, obscure’
        d.  2                      p’it’ul-p’ɛt’ul      ‘staggeringly’
        e.  3                      ucik’ɨn-ucik’ɨn      ‘with a snap, crackling’
        f.  3                      allak’uŋ-tallak’uŋ   ‘messy’
        g.  4                      cʰikcʰikpʰokpʰok     ‘chugga chugga’ (e.g., train)

Among those, I extracted reduplicated forms, including forms of echo reduplication, based on two- and three-syllable ideophonic stems (3,041 di-syllabic and 983 tri-syllabic types), since they are the two most frequent syllable lengths for ideophonic stems.13 One-syllable-based reduplicated forms, such as (6a), were not extracted from the corpus (299 types) because they provide no information about vowel harmony. Four-syllable-based reduplicated forms, such as (6g), were also excluded (10 types), because they appear to be compounds consisting of two reduplicated forms of two one-syllable-based stems (e.g., cʰik- and pʰok-). In addition, two-syllable- and three-syllable-based forms of echo reduplication that show differing vowel patterns between base and reduplicant were excluded (110 out of 316 types). Examples include kalpʰaŋ-cilpʰaŋ ‘cluelessly’ (for a syllable change) and nɨnsil-nansil ‘behaving lasciviously’ (for a vowel change). As for homophonous reduplicated forms that were listed twice in the corpus (e.g., katak-katak (01) ‘into strips’ and katak-katak (02) ‘dry and stiff state of an object that was once watery’),14 I kept only one token of each homophonous form (but attended to all of their meanings for semantic coding in the next sub-section).

As a result, the total number of reduplicated ideophonic forms extracted from the corpus amounted to 4,024. This included 2,875 harmonic and 1,149 neutral/disharmonic stems. The number deviates from that in Larsen and Heinz’s (2012) study, which extracted reduplicatives of di- and tri-syllabic ideophonic stems that displayed harmonic (e.g., katak-katak) and neutral/disharmonic (e.g., kaku-kakul ‘winding’) sequences with monophthongs (3,972 forms in total). This difference may have resulted from the exclusions of /y/ and /ø/ (cf. note 5) and the inclusions of some forms of echo reduplication in the present study. However, given that the number of stems containing /y/ and /ø/ is 40 in Larsen and Heinz’s study while the number of stems of echo reduplication is 206 in the present study, there is still a slight difference in the number of stems between them.15 Reasons for this remaining difference are unknown, and one can only speculate that it is due to counting errors in either study.

Larsen and Heinz (2012, p. 442) listed the following limitations of the corpus. First, some reduplicatives found in the corpus are not familiar to Korean speakers. To minimize this issue, I did not consider reduplicatives that neither appeared in the dictionary nor were attested in the Kaist Concordance Program (http://semanticweb.kaist.ac.kr/research/kcp/), a web-based program for searching expressions containing a target word in the Kaist Raw Corpus (1997), which contains 70 million Korean phrases.16 Second, not all of the ideophonic reduplicatives in Korean are found in the corpus. I did not make any additions to the corpus, so that the data remain comparable with Larsen and Heinz’s for future replication. Third, the reduplicatives in the corpus contain North Korean as well as South Korean dialects. In fact, out of 4,024 reduplicatives, 1,062 forms (26.39%) were labeled as North Korean in the dictionary.17 However, vowel harmony patterns in North Korean ideophones are not different from those in South Korean ideophones (T.-I. Sohn, 2012), so I did not differentiate between them in the main analysis. One point that still needs to be addressed is the synchronic merger between /ə/ and /o/, caused by the rounding of /ə/, in standard North Korean (Kwak, 2003). In South Korean, by contrast, /e/ has merged with /ɛ/ in all areas, due to the raising of /ɛ/ (Ingram & Park, 1997; Tsukada et al., 2005; Yang, 1996, among others). These different synchronic mergers in the two dialects create potential confounds in the data analysis, as they appear to transform the dark /ə/ into the light /o/ in North Korean and the light /ɛ/ into the dark /e/ in South Korean. In order not to make any changes to the corpus, I retained the reduplicatives that contained the potential confounds, but considered the effects of the synchronic mergers in each dialect separately in the analysis in Section 5.

4.2 Semantic coding

According to the compilation guidelines18 for the Great Dictionary of Standard Korean (http://stdweb2.korean.go.kr/main.jsp), the glosses of onomatopoeic ideophones contain the phrase ‘the sound of …’ or ‘the sound made when conducting the action of …’ in word-for-word translations. The glosses of cross-modal ideophones, on the other hand, contain the phrase ‘the shape/way of …’, ‘the state of …’ or ‘the feeling of …’ (The National Institute of the Korean Language, 2000).

Following the guidelines, I checked the meanings of the reduplicatives using the dictionary and, based on those meanings, I assigned the semantic codes ‘O’ for onomatopoeic and ‘C’ for cross-modal meanings. When a reduplicative had both onomatopoeic and cross-modal meanings (e.g., tekul ‘the sound or action of an object rolling’), it was assigned C, as the results in Section 3.3 showed that its iconicity level is not significantly different from that of cross-modal ideophones. When one reduplicative received multiple identical semantic codes, they were merged into one. For example, t’ɨk’ɨm has three related cross-modal meanings: (1) burning sensation when one suddenly touches a fire (C); (2) enthusiasm when one is under the inspiration of someone/something (C); and (3) pain when one is being beaten or pricked (C). The three Cs were merged into one C.
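The coding rule just described can be sketched in a few lines of Python. The cue phrases below are hypothetical English stand-ins for the dictionary's Korean gloss wording, and the function names are invented for illustration; the actual coding in this study was done by checking dictionary entries, not by string matching.

```python
# Hypothetical English stand-ins for the dictionary's Korean gloss cues.
SOUND_CUES = ("the sound of", "the sound made when")
CROSS_MODAL_CUES = ("the shape of", "the way of", "the state of", "the feeling of")


def code_sense(gloss):
    """Return the set of codes ('O', 'C') that a single gloss supports."""
    text = gloss.lower()
    codes = set()
    if any(cue in text for cue in SOUND_CUES):
        codes.add("O")
    if any(cue in text for cue in CROSS_MODAL_CUES):
        codes.add("C")
    return codes


def code_reduplicative(glosses):
    """Merge sense-level codes into one code per reduplicative."""
    codes = set()
    for gloss in glosses:
        codes |= code_sense(gloss)
    if "C" in codes:
        return "C"   # cross-modal only, or both meanings collapsed into C
    if "O" in codes:
        return "O"   # onomatopoeic meanings only
    return None      # no sensory cue found: the form is left uncoded


# Invented glosses, for illustration only.
print(code_reduplicative(["the sound of an object rolling",
                          "the way of an object rolling"]))  # C
```

Using a set for the sense-level codes means that repeated identical codes, such as the three C senses of t’ɨk’ɨm, collapse into a single code automatically.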

A limitation of semantic coding is that it could not be applied to all of the reduplicatives, for the following reasons. First, there were some reduplicatives whose definitions did not appear in the dictionary. In this case, their actual use was searched in the Kaist Concordance Program, and only those that were found in naturally occurring data received a coding (based on their meanings in the exemplified expressions). Second, there were some reduplicatives whose meanings were not sound-symbolic, such as mitʰa-mitʰa ‘suspicious’; they were excluded from the main analysis. Third, some reduplicatives were defined as mistaken forms of other reduplicatives in the corpus. For example, the neutral k’asil-k’asil ‘rough skin or hard-grained character’ was defined as a mistaken form of the neutral k’asɨl-k’asɨl. Similarly, the neutral patɨŋ-patɨŋ ‘struggle in agony or wriggle’ was defined as a mistaken form of the harmonic patoŋ-patoŋ. In this case, only the meanings of the correct reduplicatives were counted in the analysis. The correct forms were not newly added to the corpus, though, to avoid unnecessary duplication (the correct forms were already in the corpus). The specific number of each semantically non-classifiable case is reported in Appendices C–F.

5 Results

5.1 The distribution of non-initial /i, ɨ/ and /u/ in the data

Before a discussion of the distribution of neutral forms (i.e., forms that contain neutral /i, ɨ/ or partially neutral /u/) in onomatopoeic vs. cross-modal ideophones, I examine their neutrality by considering the proportions of the harmonic and neutral forms they produce in the current data (which include a total of 4,024 ideophonic reduplicatives in Korean). If their neutrality is strong, they should produce harmonic (i.e., the forms where the neutral vowels are preceded by dark vowels) and neutral forms (i.e., the forms where the neutral vowels are preceded by light vowels) at approximately the same ratio. Since vowel patterns in the base and reduplicant are identical in all of the forms in the dataset, I chose to consider vowel patterns in base stems only.

The result is largely in line with Larsen and Heinz’s (2012) study: the counts of non-initial /i/ and /ɨ/ occurring with dark vowels (296 harmonic stems for /i/ and 565 harmonic stems for /ɨ/) were not much different from the counts of their occurring with light vowels (244 neutral stems for /i/ and 505 neutral stems for /ɨ/).19 The vowel /u/ occurs more frequently with dark vowels (461 harmonic stems) than with light vowels (260 neutral stems). However, it is still distinct from non-initial dark vowels at a statistically significant level (p < .001***) in terms of the frequencies of light vowels in the preceding syllables (Larsen & Heinz, 2012, p. 449): non-initial dark /e/ and /ə/ followed dark vowels rather than light vowels at a ratio of approximately 35:1.

Disharmonic forms that did not contain /i/, /ɨ/, or /u/ in non-initial syllables numbered 133, or 3.31% of the entire data (133/4,024), and a large number of these involved a non-initial light /a/ (98 out of 133 forms). This raises the question of whether /a/ should also be classified as (partially) neutral. However, non-initial /a/ was preceded by light vowels (598 harmonic stems) far more often than by dark vowels (98 disharmonic stems), at a ratio of approximately 6:1, and thus showed a stronger harmonic preference than the neutral vowels.

To statistically test whether /i/, /ɨ/, /u/, and /a/ were different from each other in their co-occurrences with dark and light vowels, I conducted Fisher’s exact tests on pairs of vowels (Figure 1), using the R statistical software package (R Core Team, 2017). The results with Odds Ratios (OR) are shown in Table 2.

Figure 1 

Stem frequencies where the co-occurrence of a non-initial /i/, /ɨ/, /u/, and /a/ with dark and light vowels can be seen.

Table 2

Fisher’s exact test on six possible pairs of vowels in Figure 1.

       …a                          …u                        …ɨ
…i     OR = 5.023, p < .001***     OR = 1.461, p = .006**    OR = 0.922, p = 1.00 (n.s.)
…ɨ     OR = 5.449, p < .001***     OR = 1.610, p < .001***
…u     OR = 87.341, p < .001***

Note: ***p < .001, **p < .01, *p < .05. Cells are shaded where no significant difference was found (p > 0.05). P-values were adjusted with Bonferroni’s method.

The results of the test in Table 2 show that: (a) Non-initial /i/ (45.19%; 244/540) is not statistically different from /ɨ/ (47.20%; 505/1,070) in its neutrality strength, while /u/ (36.06%; 260/721) is different from both of them at a statistically significant level; and (b) /a/ (14.08%; 98/696) is significantly different from all of the target neutral vowels, /i/, /ɨ/, and /u/. Based on this synchronic statistical analysis, I grouped /i/ and /ɨ/ together as neutral, separated from the partially neutral /u/ and the non-neutral /a/.
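The percentages cited above, and the logic of a pairwise Fisher's exact test, can be traced with a short stdlib-only Python sketch based on the stem counts reported in Section 5.1. This is an illustration under stated assumptions rather than the analysis script used in the study: the helper names are hypothetical, the two-sided p-value sums all tables no more probable than the observed one, and the odds ratio is the simple cross-product ratio, which need not coincide with the conditional estimates that R's fisher.test reports.

```python
from math import comb

# Stem counts from Section 5.1: (harmonic, non-harmonic) per non-initial vowel.
# For /i, ɨ, u/ the non-harmonic stems are the neutral forms (preceded by light
# vowels); for /a/ they are the disharmonic forms (preceded by dark vowels).
COUNTS = {"i": (296, 244), "ɨ": (565, 505), "u": (461, 260), "a": (598, 98)}


def nonharmonic_share(vowel):
    """Proportion of stems in which the vowel follows the opposite class."""
    harmonic, nonharmonic = COUNTS[vowel]
    return nonharmonic / (harmonic + nonharmonic)


def sample_odds_ratio(v1, v2):
    """Cross-product odds ratio of non-harmonic vs. harmonic stems, v1 over v2."""
    h1, n1 = COUNTS[v1]
    h2, n2 = COUNTS[v2]
    return (n1 * h2) / (h1 * n2)


def fisher_two_sided(v1, v2):
    """Two-sided Fisher's exact p-value for the 2x2 table of two vowels."""
    h1, n1 = COUNTS[v1]
    h2, n2 = COUNTS[v2]
    row1, row2 = h1 + n1, h2 + n2
    col = n1 + n2                      # all non-harmonic stems in the table
    denom = comb(row1 + row2, col)

    def prob(k):
        # Hypergeometric probability that v1 accounts for k non-harmonic stems.
        return comb(row1, k) * comb(row2, col - k) / denom

    p_obs = prob(n1)
    lo, hi = max(0, col - row2), min(row1, col)
    total = sum(p for p in (prob(k) for k in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def bonferroni(p, n_tests=6):
    """Bonferroni adjustment for the six pairwise comparisons."""
    return min(1.0, p * n_tests)


for v in COUNTS:
    print(v, round(100 * nonharmonic_share(v), 2))
print(round(sample_odds_ratio("i", "ɨ"), 3), bonferroni(fisher_two_sided("i", "ɨ")))
```

The first loop prints the non-harmonic share of each vowel as a percentage, which can be compared directly with the figures given in the paragraph above.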

Given this, the following sub-sections report connections between the two semantic types of ideophones associated with different iconicity levels and stems containing non-initial /i, ɨ/, /u/, and /a/, in that order. The iconicity correlated with apparently harmonic stems that do not contain any of the vowels of interest (i.e., /i, ɨ/, /u/, /a/) in non-initial syllables is measured next. Sub-sections 5.2 to 5.5 contain only the major results of each comparison. Readers who wish to see the detailed justification for the selection of the relevant datasets are referred to Appendices C–F.

5.2 Iconicity correlated with stems containing the neutral /i, ɨ/

From 749 neutral forms containing non-initial /i/ or /ɨ/ (i.e., forms where /i/ or /ɨ/ follows light vowels), 131 forms were excluded for semantic coding (see their details in Appendix C). In brief, 14 forms were eliminated as they were listed as mistakes of forms already found in the dataset; 85 forms were eliminated as they could have been affected by the /ɛ/~/e/ merger in South Korean dialects; seven forms were eliminated as their meanings were not found in the dictionary; and 25 forms were eliminated because they instantiated partial reduplication.

Overall, the elimination process left a total of 618 forms for semantic classification into cross-modal (C) and onomatopoeic meanings (O). Out of those 618 forms, 578 (or 93.53%) were classified as having cross-modal meanings (e.g., kacʰil-kacʰil ‘roughness of skin’; salɨk-salɨk ‘a soft sound or manner of an object being swept’) while 40 (or 6.47%) were classified as having onomatopoeic meanings only (e.g., posilak-posilak (NK)20 ‘a rustling sound’).

5.3 Iconicity correlated with neutral stems containing the partially neutral /u/

There were 260 neutral forms containing a non-initial /u/. Of those, a total of 50 forms (in Appendix D) were eliminated before the examination of iconicity correlated with the neutral /u/ in Korean ideophones. Consequently, 210 forms remained for the semantic analysis, and of those, 206 forms (or 98.10%) represented cross-modal meanings, while four forms (or 1.90%) represented onomatopoeic meanings only (e.g., t’alk’uk-t’alk’uk ‘hiccup’).

To sum up Sections 5.2–5.3, the proportions of neutral forms with /i, ɨ/ (strong neutrality) and with /u/ (weak neutrality) differ between onomatopoeic and cross-modal ideophones in the same direction. Specifically, the baseline proportions of the neutral and the partially neutral forms in the observed subset of ideophones are 34.82% (618/1,775) and 11.83% (210/1,775), respectively. In cross-modal ideophones, the corresponding proportions are 36.19% (578/1,597) and 12.90% (206/1,597), so they remain close to the baseline. However, in onomatopoeic ideophones, the proportions of neutral (22.47%; 40/178) and partially neutral forms (2.25%; 4/178) are significantly lower than in cross-modal ideophones (p < .001***, two-tailed proportion test). This implies that onomatopoeic ideophones must show a corresponding increase in one or both of the remaining harmony patterns (i.e., harmonic and disharmonic). The proportions of the disharmonic stems containing /a/ (Section 5.4) and of the harmonic stems (Section 5.5) in onomatopoeic vs. cross-modal ideophones are measured next.
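As a hedged illustration of the two-tailed proportion test mentioned above, the sketch below applies a pooled two-proportion z-test to the reported counts. The exact variant used in the study (for example, whether a continuity correction was applied) is not stated in this section, so the choice made here is an assumption, and the function name `two_proportion_z_test` is illustrative.

```python
from math import erfc, sqrt

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed z-test for equal proportions (pooled variance, no continuity correction)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

N_ONO, N_CROSS = 178, 1597  # onomatopoeic and cross-modal stems analyzed

# Neutral /i, ɨ/ forms: 40 onomatopoeic vs. 578 cross-modal.
z_neutral, p_neutral = two_proportion_z_test(40, N_ONO, 578, N_CROSS)
# Partially neutral /u/ forms: 4 onomatopoeic vs. 206 cross-modal.
z_partial, p_partial = two_proportion_z_test(4, N_ONO, 206, N_CROSS)

print(f"neutral /i, ɨ/:        z = {z_neutral:.2f}, p = {p_neutral:.2g}")
print(f"partially neutral /u/: z = {z_partial:.2f}, p = {p_partial:.2g}")
```

A negative z here means that the proportion is lower in onomatopoeic than in cross-modal ideophones; with this variant of the test, both comparisons fall below the .001 threshold.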

5.4 Lexical iconicity correlated with disharmonic forms containing /a/

There were 98 disharmonic forms containing non-initial /a/. Of those, a total of 35 forms (in Appendix E) were eliminated, but 11 forms were newly added from the list of disharmonic forms containing /i, ɨ/. These 11 forms, exemplified in (7), were moved here because they appeared to be disharmonic forms containing /a/ rather than /i, ɨ/, when the /ɛ/~/e/ merger is taken into account.

    (7)
      k’ɛcilak-k’ɛcilak ‘pecking at something’
      pɛcʰicak-pɛcʰicak ‘staggering’
      tɛkɨlak-tɛkɨlak ‘clattering’
      t’ɛkɨlak-t’ɛkɨlak ‘clattering’
      tɛŋkɨlaŋ-tɛŋkɨlaŋ ‘cling-cling’
      t’ɛŋkɨlaŋ-t’ɛŋkɨlaŋ ‘cling-cling’
      mɛk’ɨtaŋ-mɛk’ɨtaŋ ‘slippery’
      pɛtʰɨcak-pɛtʰɨcak ‘staggering’
      p’ɛtʰɨcak-p’ɛtʰɨcak ‘staggering’
      cɛŋkɨlaŋ-cɛŋkɨlaŋ ‘chink’
      c’ɛŋkɨlaŋ-c’ɛŋkɨlaŋ ‘chink’

Consequently, 74 forms remained, and of those, 53 forms (or 71.62%) were classified as having cross-modal meanings, while 21 forms (or 28.38%) were classified as having onomatopoeic meanings only (e.g., p’iak-p’iak ‘peep’).

5.5 Iconicity correlated with harmonic forms that do not contain non-initial /i, ɨ/, /u/, or /a/

Harmonic forms that do not contain any of the aforementioned vowels /i, ɨ/, /u/, or /a/ in non-initial syllables amounted to 955. Of those, 82 forms (in Appendix F) were eliminated. After this elimination, 873 forms remained, and of those, 760 forms (or 87.06%) were classified as having cross-modal meanings (e.g., nətəl-nətəl ‘in tatters’; k’ɛlk’ɛk-k’ɛlk’ɛk ‘a sound or state of choking’) while 113 forms (or 12.94%) were classified as having onomatopoeic meanings only (e.g., tekək-tekək ‘a rattling sound’).
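The exclusion steps reported in Sections 5.2–5.5 can be tallied in a short sketch, which makes it easy to check that the four remaining sets add up to the size of the analyzed subset. The dictionary layout and variable names are illustrative; only the counts come from the text.

```python
# (initial forms, excluded forms, forms moved in) per harmony category,
# as reported in Sections 5.2-5.5.
sets = {
    "neutral /i, ɨ/":        (749, 14 + 85 + 7 + 25, 0),
    "partially neutral /u/": (260, 50, 0),
    "disharmonic /a/":       (98, 35, 11),  # 11 forms moved in from the /i, ɨ/ list
    "harmonic":              (955, 82, 0),
}

remaining = {
    name: start - excluded + added
    for name, (start, excluded, added) in sets.items()
}
total = sum(remaining.values())

for name, n in remaining.items():
    print(f"{name:<24}{n:>6}")
print(f"{'total':<24}{total:>6}")
```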

Figure 2 shows the number of onomatopoeic or cross-modal stems by reference to the four categories—harmonic forms, neutral forms containing /i, ɨ/, partially neutral forms containing /u/, and disharmonic forms containing /a/.

Figure 2. The distribution of vowel harmony patterns in onomatopoeic vs. cross-modal ideophones.

The baseline proportions of the harmony patterns in the observed subset of ideophones are 49.18% (873/1,775) for harmonic, 46.65% (828/1,775) for neutral (in which the neutral and the partially neutral patterns in Figure 2 are lumped together), and 4.17% (74/1,775) for disharmonic. The proportions remain similar in cross-modal ideophones: 47.59% (760/1,597) for harmonic, 49.09% (784/1,597) for neutral, and 3.32% (53/1,597) for disharmonic. In contrast, in onomatopoeic ideophones, the proportions are 63.48% (113/178) for harmonic, 24.72% (44/178) for neutral, and 11.80% (21/178) for disharmonic. This reveals that, compared to cross-modal ideophones, onomatopoeic ideophones show a significant increase in both disharmonic (p < .001***, two-tailed proportion test) and harmonic forms (p < .001***), and a significant decrease in neutral forms (Section 5.3).
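The percentages in this paragraph follow from a three-by-two table of counts. The sketch below rebuilds that table from the figures reported in Sections 5.2–5.5 and derives the baseline and per-type shares; the names `table`, `pct`, and `shares` are illustrative.

```python
# Rows: harmony pattern; columns: (onomatopoeic, cross-modal) counts
# taken from Sections 5.2-5.5.
table = {
    "harmonic":    (113, 760),
    "neutral":     (40 + 4, 578 + 206),  # /i, ɨ/ and /u/ forms lumped together
    "disharmonic": (21, 53),
}

n_ono = sum(ono for ono, _ in table.values())
n_cross = sum(cross for _, cross in table.values())
n_all = n_ono + n_cross

def pct(count, total):
    """Percentage rounded to two decimals, as in the running text."""
    return round(100 * count / total, 2)

shares = {
    name: {
        "baseline": pct(ono + cross, n_all),
        "onomatopoeic": pct(ono, n_ono),
        "cross-modal": pct(cross, n_cross),
    }
    for name, (ono, cross) in table.items()
}

for name, row in shares.items():
    print(name, row)
```

Laying the counts out this way shows at a glance that the column totals are 178 and 1,597 and that each column of shares sums to roughly 100%.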

In sum, in the subset of ideophones analyzed, cross-modal (96.68%; 1,544/1,597) and onomatopoeic ideophones (88.20%; 157/178) are mostly linked to harmonic or neutral forms. This indicates that they generally conform to the conventional phonotactics of the ideophonic lexicon. However, the two types differ markedly in one respect: onomatopoeic ideophones are skewed toward a larger proportion of disharmonic forms.

6 Discussion

The current corpus-based study reveals that the vowel-harmony system in Korean ideophones is associated with iconicity in a complicated manner. In general, ideophones obeyed vowel harmony. However, a closer look at the distribution of harmony patterns (i.e., harmonic, neutral, and disharmonic) in onomatopoeic vs. cross-modal ideophones reveals that the former show greater diversity in harmony patterns than the latter. This supports the hypothesis that highly iconic ideophones take the phonotactic liberties they need, which skews the distribution of vowel harmony patterns (i.e., the conventional phonotactics of Korean ideophones) in onomatopoeic vs. cross-modal ideophones. Indeed, onomatopoeic ideophones were associated relatively frequently with disharmonic forms containing /a/, which does not possess a legitimate neutral status. In contrast, cross-modal ideophones that represent abstract iconic mappings conformed more strictly to vowel harmony patterns, which are phonologically motivated regularities.

The finding that highly iconic ideophones are relatively free from the conventional phonotactics of the ideophonic system rests on quantitative corpus data. To find further empirical evidence, it would be useful to conduct production experiments with native Korean speakers in a future study. For example, one could ask Korean speakers to produce novel ideophones for onomatopoeic and cross-modal meanings for which real ideophones do not exist in the language. One could then examine the distribution of harmonic, neutral, and disharmonic forms in the novel ideophones. If the skewed distribution of harmony patterns in onomatopoeic vs. cross-modal ideophones were also observed in the production data, the current investigation would gain strong psycholinguistic validity.

7 Summary

The primary aim of this study was to examine whether there is any correlation between the levels of iconicity and the degree to which ideophones conform to harmony constraints in Korean. Using a written corpus of Korean ideophonic stems, the study examined the meanings of 873 harmonic forms that do not contain any of the (potential) neutral vowels (/i/, /ɨ/, /u/, and /a/) in non-initial syllables, and of 828 neutral and 74 disharmonic forms (1,775 ideophonic stems in total). The results showed that both onomatopoeic and cross-modal ideophones were mostly harmonic or neutral. However, onomatopoeic ideophones (i.e., a highly iconic type of ideophone) were skewed toward a larger proportion of disharmonic forms. This quantitatively confirms the hypothesis that high iconicity is correlated with phonotactic diversity.

Additional Files

The additional files for this article can be found as follows:

Appendix A

Participant instructions for the iconicity rating task in Section 3 (in English translation). DOI: https://doi.org/10.5334/labphon.53.s1

Appendix B

Number of ideophonic stems containing /i/, /ɨ/, or /u/ in non-initial syllables. DOI: https://doi.org/10.5334/labphon.53.s2

Appendix C

Description of the excluded neutral forms containing non-initial /i/ or /ɨ/ in Section 5.2. DOI: https://doi.org/10.5334/labphon.53.s3

Appendix D

Description of the excluded neutral forms containing non-initial /u/ in Section 5.3. DOI: https://doi.org/10.5334/labphon.53.s4

Appendix E

Description of the excluded disharmonic forms containing non-initial /a/ in Section 5.4. DOI: https://doi.org/10.5334/labphon.53.s5

Appendix F

Description of the excluded forms in the harmonic set in Section 5.5. DOI: https://doi.org/10.5334/labphon.53.s6