The finding that production and perception systems rapidly adapt to implicit contextual cues is important because it speaks directly to the crucial role played by both variation and experience in the cognitive representations involved. A number of studies have shown that phonetic boundaries between two neighboring phoneme categories can shift depending on perceived social characteristics of the speaker, such as gender (Johnson, Strand, & D’Imperio, 1999), social class (Hay, Warren, & Drager, 2006), or age (Drager, 2011). Similar studies have found that the perceived phonetic value of a given vowel token can shift depending on contextual cues related to region (Niedzielski, 1999; Hay, Nolan, & Drager, 2006; Hay & Drager, 2010). Crucially, in Hay and Drager (2010) these cues were implicit, in that they were not interpretable as characteristics of the speaker who produced the tokens, suggesting that the cues were acting on the perceptual system in a way that is both low-level and indirect, perhaps through the differential activation of specific linguistic experiences associated with each of the two social identities (regions) involved. Closely related implicit effects have also been observed in production (Mendoza-Denton, Hay, & Jannedy, 2003; Sanchez, Hay, & Nilson, 2015). These studies together paint a clear picture of a production and perception system that is highly dynamic and contextually sensitive, at least at the level of (phonemic) segments and the mostly continuous phonetic space that they occupy. Such effects are readily accounted for by theories that assume that perception and production processes depend on aggregates of stored experiences of speech (exemplars), which are enriched with details about the social context in which they occurred (Goldinger, 1998; Johnson, 1997a, 2006; Pierrehumbert, 2001, 2002).
Related accounts suggest that learning and behavior for other levels of representation depend on linguistic experience in a similar way, including the syntactic parsing of word strings (Bod, 2006, 2009), alignments between abstract phonological categories in production (German, Carlson, & Pierrehumbert, 2013), morphophonological productivity (Pierrehumbert, 2002), morphological productivity (Rácz et al., 2017), and the semantic interpretation of wordforms (Bybee, 2006). Warren (2017) provides the first evidence that the interpretation of intonational forms depends on social cues. In that study, the relevant social cues (related to speaker age) concerned the specific quality of vowels within the same utterance as the intonational feature being investigated and were therefore attributable not only to the speaker’s social identity, but also to the likelihood that the speaker used an early rising pattern as a question versus a late rising pattern as an uptalk statement.
Like Warren (2017), the present study also seeks evidence for the influence of socio-indexical cues on the interpretation of intonational forms. Specifically, it uses a perception study to test whether a particular sentence-final intonation contour in French is interpreted differently depending on which dialect region (Corsica or Continental France) is activated during processing. The present study differs from Warren (2017), however, in that the social cue manipulation is implicit. As with Hay and Drager (2010), in other words, the relevant cues are not attributable by the listeners to the social identity of the speaker. Furthermore, a large number of studies on socially driven adaptation in perception concern either (i) changes in how a continuous phonetic space is categorized into phonemes (e.g., Johnson et al., 1999; Drager, 2011), or (ii) changes in the perceived phonetic quality of a given vowel (e.g., Niedzielski, 1999; Hay & Drager, 2010). By contrast, the present study concerns differences in how a single intonational form is categorically mapped onto one abstract meaning versus another (either ‘question’ or ‘statement’). In that sense, it is more closely analogous to homophone disambiguation, whose relationship to socially-driven adaptation remains understudied. Finally, to our knowledge, virtually all existing studies that relate socio-indexical cues to perceptual adaptation involve English, with two notable exceptions. One is Chang (2015), which explores the perception of the contrast between alveolar and retroflex consonants of Beijing versus Taiwan varieties of Mandarin. The second exception we are aware of is Dufour et al. (2014), which tests the influence of the explicit social status of the speaker on the perception of the French /s/-/ʃ/ contrast in Mauritian Creole-French bilinguals. One important aspect of the present study, therefore, is that it involves two varieties of French.
The influence of socio-indexical cues on perception is now well documented, at least for different varieties of English. One type of evidence concerns listeners’ beliefs about the speaker’s identity. Strand and Johnson (1996), for example, found that listeners adjusted their category boundary for the /s/-/ʃ/ continuum according to whether the speaker appeared to be male versus female in a video that was synchronized to the speech signal. Johnson et al. (1999) replicated this effect for a vowel continuum, and further found that the effect could be induced simply by asking listeners to imagine the speaker’s gender. Related studies have found that photographs suggesting the age or social class of the speaker can influence either listeners’ categorization of vowels (Drager, 2011), or their ability to distinguish between word pairs whose vowels are involved in an ongoing merger within the community (Hay, Warren, & Drager, 2006). Finally, Drager (2005) found that the perceived age of the speaker’s voice had an influence on the category boundary between neighboring vowels in New Zealand English.
A second type of evidence concerns socio-indexical cues which are clearly not related to the listeners’ beliefs about the speaker. Hay and Drager (2010), for example, found that New Zealand English listeners perceived vowels as phonetically more New Zealand-like or Australian-like depending on whether a stuffed toy kiwi or a stuffed toy kangaroo (or koala), respectively, was visually present during the experiment. Hay, Drager, and Warren (2010) report at least three types of non-speaker-related social cues that influence how individuals produce, perceive, or otherwise make a distinction between matched word pairs involved in a merger. These include the dialect region of the experimenter (US versus NZ), the dialect used for a set of prerecorded spoken instructions (RP versus NZ), and the regional concept reinforced by a pre-task vocabulary questionnaire (US versus NZ).
Finally, two studies involve social cues which may or may not be linked to the speaker’s identity by listeners. Both Niedzielski (1999) and Hay et al. (2006a) found that textual labels for different dialect regions (e.g., “Michigan” versus “Canadian”) appearing on the experimental materials influenced the perceived phonetic quality of a given phoneme in the direction expected by the regional dialect differences (or, in the case of Niedzielski, the stereotyped differences).
At least two studies have failed to find an influence of regional cues on perception. In the first case, Lawrence (2015) replicated the design of Niedzielski (1999) and Hay et al. (2006a, 2006b) in a British context, where the listeners were speakers of Standard Southern British English and the phonetic variation concerned vowel differences between that variety and Northern British English. In the second case, Walker et al. (2019) replicated the specific design of Hay and Drager (2010), but in an Australian context with Australian listeners, and found no effect of the regionally indexed stuffed toys. As Walker et al. (2019) point out, such evidence suggests that the extent to which regional cues influence perception depends on the specific details of the sociolinguistic context involved. Crucially, for both replication studies, there was a potential asymmetry in terms of the linguistic and cultural dominance of the regions involved, and therefore also in terms of the degree of each group’s exposure to the ‘other’ regional variety. In the case of Walker et al.’s (2019) study, for example, it is likely that New Zealand individuals are exposed to Australian English to a greater degree than Australians are to New Zealand English as a result of, for example, greater output and distribution of Australian media in New Zealand as compared to the other way around. Walker et al. (2019) in fact measured listeners’ degree of exposure to and familiarity with New Zealand English, though this factor did not significantly interact with the regional cue manipulation. The authors offer various interpretations of the lack of both a main effect and an effect of exposure, including the possibility that the Australian listeners’ average exposure to New Zealand English is simply too low to result in a detectable effect on behavior. 
In other words, mere familiarity with another variety may not be sufficient for regional cues to influence perception, and instead such effects may depend on the quantity of input from another variety.
Findings like those of Hay and Drager (2010) run contrary to the claim that only conscious factors impact a hearer’s social judgements (Labov, 1972; McGowan, 2016). They therefore speak directly to the need for a model of linguistic (phonological) representation that can capture rapid and automatic adaptation to socio-contextual variation. Exemplar-based models (Johnson, 1997a; Goldinger, 1998; Pierrehumbert, 2001, 2002) have received considerable attention in connection with the types of socially adaptive effects reviewed above, and they appear well-suited to capturing those effects. The starting hypothesis of this class of models is that each experience of speech is stored as a memory trace (or exemplar), which is indexed, first of all, to the label of an abstract linguistic category (e.g., a wordform or phoneme), and secondly, to an array of linguistic (e.g., syntactic, lexical, or prosodic context) and non-linguistic (e.g., gender, region, socioeconomic status of the speaker) features associated with the specific context in which the speech was encountered. Crucially, in those models both production and perception processes rely on the activation of specific sets of such exemplars. In perception, a phonetic input activates existing exemplars to varying degrees based on their overall phonetic similarity to it. Activated exemplars feed activation of the abstract category to which they are linked, and lexical selection proceeds through an activation-competition mechanism over abstract category labels.
Rapid adaptation can be driven by activation of various kinds of contextual features, including social ones. Activation of one of these features leads to increased activation of the exemplars which index it (Hay et al., 2006a; Johnson, 2007) as compared to those which do not. The result is a perceptual bias, in the sense that the pattern of behavior more closely reflects a system consisting only of exemplars indexed to the social category. The specific nature of this activation bias, and how it relates to salience and cue strength, remains somewhat unexplored, though some studies suggest that exemplars differ in terms of the strength of their association with social indices (Hay et al., 2006a; Drager, 2011; Sumner, Kim, King, & McGowan, 2014).
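The mechanism described in the two preceding paragraphs can be illustrated with a minimal sketch. It is not an implementation of any of the cited models. The one-dimensional phonetic scale, the exponential similarity function, the multiplicative `boost` applied to exemplars that index the activated social category, and the toy exemplar values are all assumptions introduced here for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Exemplar:
    value: float    # position on a 1-D phonetic dimension (arbitrary units, assumed)
    category: str   # abstract category label the trace is indexed to
    social: str     # social index stored with the trace (e.g., a region)


def category_activations(stimulus, exemplars, primed=None, boost=3.0, sensitivity=1.0):
    """Sum similarity-weighted exemplar activation for each category label.

    Exemplars indexed to the primed social category receive extra weight
    (`boost` is a hypothetical parameter, not taken from the literature).
    """
    totals = {}
    for ex in exemplars:
        # Activation falls off with phonetic distance from the input.
        similarity = math.exp(-sensitivity * abs(stimulus - ex.value))
        weight = boost if ex.social == primed else 1.0
        totals[ex.category] = totals.get(ex.category, 0.0) + weight * similarity
    return totals


def classify(stimulus, exemplars, primed=None, **kwargs):
    """Pick the category with the highest summed activation (simple competition)."""
    activations = category_activations(stimulus, exemplars, primed, **kwargs)
    return max(activations, key=activations.get)


# Toy memory: two social groups whose categories sit at different phonetic values.
EXEMPLARS = [
    Exemplar(1.0, "x", "A"), Exemplar(3.0, "y", "A"),
    Exemplar(2.0, "x", "B"), Exemplar(4.0, "y", "B"),
]

for prime in (None, "A", "B"):
    print(prime, classify(2.4, EXEMPLARS, primed=prime))
```

With these toy values, the same stimulus (2.4) is labeled "y" when group A is primed and "x" when group B is primed, which is the kind of boundary shift that the perceptual bias described above would be expected to produce.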
Early exemplar-based models were primarily concerned with the relationship between gradient phonetic variation and abstract representations at the level of vowels (Miller, 1989; Johnson, 1997a) or words (Johnson, 1997b). Most studies on social adaptation in perception similarly emphasize the relationship between gradient phonetic detail and phonemes. However, exemplar-based models are also useful for characterizing relationships between levels of representation that involve only discrete units. Bod (2009) and Walsh, Möbius, Wade, and Schütze (2010), for example, have proposed exemplar-based models of syntax, which associate word strings with constituent structures. Based on evidence from dialect learning, German et al. (2013) propose that the mapping between phonemes and allophones involves exemplar-based representations, while Rácz et al. (2017) found that morphophonological productivity in artificial language learning is sensitive to social indexing, suggesting that selection from among discrete variants also involves an exemplar-based mechanism. In short, since exemplar-based mechanisms potentially underlie a wide range of associations between layers of representation, socially driven adaptation is predicted to be possible for any association which is learned through experience and includes variation that is socially differentiated.
The extension of exemplar-based models to capture alignments between categories is straightforward, since it involves a low-dimensional generalization of phonetically oriented models. In other words, the lower, or ‘concrete,’ level of representation involves tokens distributed over discrete unit types (categories) as opposed to tokens distributed in a continuous phonetic space. In Section 5, we discuss how such a model applies to the present study, which involves only a single lower level category (an intonation contour) and its relationship to two higher level categories (question and statement meaning).
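As a rough illustration of this discrete case, the sketch below treats each stored token as a triple of form, meaning, and region, and computes the relative support for each meaning of a single form when one region is activated. The token counts are invented for the example and do not come from any corpus or from the present study; the `boost` parameter and the proportional decision rule are likewise assumptions, and the model discussed in Section 5 may differ.

```python
from collections import Counter


def meaning_probabilities(tokens, form, primed_region=None, boost=3.0):
    """Return the relative support for each meaning of a discrete form.

    tokens: iterable of (form, meaning, region) triples, one per stored token.
    Tokens indexed to the primed region are weighted by `boost` (hypothetical).
    """
    support = Counter()
    for token_form, meaning, region in tokens:
        if token_form != form:
            continue
        support[meaning] += boost if region == primed_region else 1.0
    total = sum(support.values())
    if total == 0:
        return {}
    return {meaning: weight / total for meaning, weight in support.items()}


# Invented token counts for a hypothetical listener exposed to both varieties.
MEMORY = (
    [("PRF", "question", "Corsica")] * 8
    + [("PRF", "statement", "Corsica")] * 2
    + [("PRF", "question", "Continental")] * 2
    + [("PRF", "statement", "Continental")] * 8
)

for region in (None, "Corsica", "Continental"):
    print(region, meaning_probabilities(MEMORY, "PRF", primed_region=region))
```

Under these invented counts, support for the question meaning is 0.50 with no region activated, 0.65 when Corsica is activated, and 0.35 when Continental France is activated. The direction of the shift, rather than its size, is the point of the example.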
Most studies on social priming in perception concern either (i) how gradient phonetic variation relates to two neighboring phonemes (e.g., Johnson et al., 1999; Drager, 2011), or (ii) the perceived phonetic quality of a given phoneme (e.g., Niedzielski, 1999; Hay & Drager, 2010). Warren (2017), however, recently extended the program to the relationship between intonation and illocutionary meaning. In that study, New Zealand English listeners were presented with two types of final rising contour and were then asked to choose between a ‘question’ or a ‘statement’ meaning for the utterance. In a between-subjects manipulation, the target rise was preceded by one of two variants of the SQUARE diphthong ([iɘ] versus [eɘ]) that suggested either a younger or an older age of the speaker respectively. Younger speakers in New Zealand are both more likely to use final rises to express statements, and more likely to distinguish phonetically between question and statement uses of final rises through an earlier alignment of the rise onset for questions than for statements. The study found no evidence that the phonetic cue to speaker age influenced how the listeners interpreted the two contours, though it did influence the processing of the contours. For example, participants had longer reaction times in giving ‘statement’ responses following the [eɘ] vowel, suggesting that this cue to older speakers contributes a slight bias against a statement interpretation. Additionally, participants’ mouse movements when responding with ‘question’ for an earlier rise onset were more direct following [iɘ] than [eɘ], suggesting that a cue to a younger speaker biased listeners to use a coding system that involves a clear phonetic contrast between rising statements and questions.
German (2017) also explores the influence of social cues on the pragmatic use of intonational patterns by Singapore English (SgE) listeners. SgE differs from mainstream varieties in that pronoun reference is not influenced by pitch accent placement. The study found, however, that SgE listeners made greater use of pitch accent placement for deciding pronoun reference in the presence of an ‘American’ visual prime versus a ‘Singaporean’ prime. In contrast to Warren’s study, though parallel to Hay and Drager’s (2010), these primes were non-linguistic, external to the task, and not attributable to the speaker’s social identity.
The hypothesis underlying the present study is motivated by the observation that there is a form-level similarity between intonation contours in two varieties of French, namely Corsican French and Continental French. Specifically, both varieties include in their inventory a final rising-falling contour whose peak is aligned to the penultimate syllable of the utterance. The two varieties differ, however, in how they associate meaning with these tunes: The Corsican French penultimate rise-fall is primarily used to convey polar questions while the corresponding Continental French tune is primarily used to convey statements. In what follows, we present our observations along with a review of accounts of French intonation that treat these contours. Together these facts support the view that the contours are not only phonetically similar, but that they can be interpreted differently depending on which linguistic coding system (i.e., regional variety) is used by the listener.
Few studies have investigated Corsican French and still fewer its intonation. Boula de Mareüil et al. (2012a, 2012b), however, report specifically on the contour which is of primary interest to the present study. Through a production study, they compared the realization of declarative polar questions in Corsican (see Section 1.3.3 for details), Corsican French, and Parisian French. The results showed that for Corsican and Corsican French, polar questions are realized with a final rising-falling contour with the f0 peak aligned to the penultimate syllable, while for Parisian French they are realized with a final rising contour. The pattern for Corsican and Corsican French is also associated with an additional rise-fall on the first accentual phrase which surfaces only when the syntactic subject is a full NP. The Corsican French pattern including both the early and final rise-fall is illustrated in Figure 1.
A pattern similar to the above Corsican French tune has been described for Continental French. While the literature on Continental French intonation is more comprehensive than for Corsican French, authors differ widely in the specific tonal inventory and labeling system used to characterize this tune. In an autosegmental-metrical account, Post (2000) describes an “IP-final falling movement to low from a penultimate unaccented peak” (Post, 2000, p. 137) which contrasts with a fall from a final peak. The author accounts for this phonologically through the introduction of a new bitonal pitch accent, H+H*, where the leading H tone aligns with the penultimate syllable while the nuclear H* tone is scaled lower by virtue of a phonological downstepping rule between adjacent H tones. While Post does not specifically use the term ‘rise’ in this description, three of the four tokens presented (Post, 2000, p. 137, Figure 23; p. 138, Figure 24) show a clear rise to the penultimate peak. This apparent discrepancy can be explained by the fact that phrase-final accents in French are usually preceded by an L tone. In most accounts of French, including Post’s, this L tone is not a leading tone linked to the accent, but rather a feature of the local phrasing unit (either the Accentual Phrase, the Phonological Phrase, or the Intonational Phrase, depending on the specific model). Thus, regardless of the pitch accent category that appears phrase-finally, L tends to precede it except in certain cases of tonal crowding where a phrase-initial rise precedes the accent within the same word (see Jun & Fougeron, 2000, 2002, i.a.). Thus, in the full phonological model proposed by Post, the inventory of final tunes specifically includes L H+H* L%, which is described as having a penultimate peak with a fall to low (Post, 2000, p. 160).
In more recent accounts, which build on Post’s treatment of H+H*, this contour has been labeled for model-internal reasons as L H+!H* !H% by Delais-Roussarie et al. (2015) (who unlike Post, 2000, do not assume an automatic downstep rule), or as L H+L* L% by Portes and Beyssade (2015) and by Sichel-Bazin (2015). Note that Delais-Roussarie et al. provide no explicit motivation for labeling the boundary tone as !H% rather than L%, which descriptively corresponds to a final low f0 (see, e.g., Delais-Roussarie et al., 2015, Figure 3.12). Finally, Martin (2009) describes a rising-falling contour for Continental French en forme de cloche “with a bell-shaped curve” which he characterizes with the feature [+cloche]. While this description bears some resemblance to those above, not enough detail is provided by Martin to establish that he is discussing the same contour as Post and others.
In spite of differences in phonological labeling, the tune described in these various accounts corresponds closely to the one produced by the Continental French speaker in the present study (see Section 2.2 below). Here we adopt the L H+!H* L% coding of this penultimate rise-fall (henceforth PRF) on the grounds that it is phonetically the most transparent. Figure 2 provides an example of the Continental French PRF as it occurs in both spontaneous speech (Figure 2a) and as produced by the speaker in this study (Figure 2b).
For Continental French, the PRF is most commonly associated with statement meanings, which typically include various additional attitudinal connotations. Post’s analysis of the L H+H* L% contour, for example, corroborates the view expressed by Leach (1988) and Mertens (1992) that “[this] movement is used to convey that the speaker thinks that what he says is evident, or that he does not want to commit himself” (Post, 2000, p. 137). For Delais-Roussarie et al. (2015), the L H+!H* !H% contour is related to contradiction statements, while in Portes and Beyssade (2015) and Sichel-Bazin (2015), the L H+L* L% contour is associated with adamant statements. According to Sichel-Bazin, “the speaker urges the addressee to add the content of the proposition to the common ground and forget about any alternative that the addressee could have in mind” (Sichel-Bazin, 2015, p. 209). Finally, for Martin (2009), the [+cloche] contour is said to convey a statement of the obvious.
At least one study reports that the PRF may sometimes be associated with question-like meanings in specific contexts. A production study by Michelas, Portes, and Champagne-Lavau (2016) elicited negatively biased declarative questions (i.e., where the speaker believes the proposition to be false) from Continental French speakers. While most of the elicited productions (66%) were realized with a high boundary tone (L H+!H* H%), approximately one third (29%) were realized with the PRF (L H+!H* L%). These results show that even though the PRF is sometimes used by Continental French speakers to convey negatively biased declarative questions, final rising contours are nevertheless generally preferred for those meanings.
The meaning of the PRF in Corsican French was not discussed at length in Boula de Mareüil et al.’s (2012a, 2012b) production study, since the authors assumed that they elicited declarative polar questions. More recently, however, Boula de Mareüil, Rilliard, and Maynard (2016) explored the perception of the Corsican French PRF by both Corsican French and Continental French (Parisian) listeners. Specifically, participants in that study listened to Corsican French productions of both polar questions (involving the PRF) and statements (involving a falling contour) and were asked to judge whether they perceived a question or a statement. While statement utterances were consistently perceived as statements by both Corsican French listeners (100%) and Continental French listeners (98%), polar question utterances (with a penultimate peak) were perceived much more often as questions by Corsican French listeners (83%) than by Continental French listeners (35%). Against the backdrop of the above accounts for Continental French, these results are readily explained: The Corsican French PRF is phonetically similar to the Continental French PRF, and as such, the Continental French listeners tend to assimilate it to their native phonological category, which they generally associate with a statement meaning. This provides clear evidence, in other words, that the PRF of a given variety may be interpreted differently depending on the coding system used by the listener. The present study specifically explores the extent to which contextual factors influence which coding system is used by Corsican French listeners in their interpretation of the Continental French PRF.
The variety of French spoken by Corsicans has been shown to have characteristic phonetic and phonological traits which distinguish it from other French varieties, including Continental French. Many of these features likely arise from contact with Corsican, which is a Romance language belonging to the Italo-Roman group and closely related to Tuscan and other varieties from Central Italy, and which was formerly the dominant language in Corsica. Filippi (1992) reports a number of segmental characteristics of Corsican French which likely have their origins in Corsican. For example, there is a tendency for certain phonemically voiceless consonants to be realized as voiced allophones intervocalically (e.g., [ãdwan] instead of [ãtwan] for the name Antoine). Vowel differences include a tendency for a more posterior articulation of low vowels, such as [sɑ̃] for sang ‘blood’ ([sã] in Continental French) or [ʒɑk] for the name Jacques ([ʒak] in Continental French), as well as the widespread maintenance of a contrast between /e/ and /ɛ/, for which there are fewer minimal pairs in many continental varieties. At the level of word prosody, there is realization of syllabic /ə/ in contexts where it is elided in Continental French (e.g., [mɛ̃.tə.nã] for maintenant ‘now’ as opposed to [mɛ̃t.nã]), as well as hiatus in words like fouet ‘whip’ [fu.we], which is produced as a single syllable [fwe] in Continental French. Together, these features of Corsican French render it readily identifiable against the backdrop of other French varieties to such an extent that it is often stereotyped (Filippi, 1992).
While Corsican French is the dominant variety in Corsica, Corsicans are constantly exposed to Continental French, firstly through the national French media (television, radio, internet, etc.), and secondly through the regular presence of continental French tourists throughout the year. Hence, regardless of how Corsican individuals speak, they have substantial experience with Continental French in perception and should therefore be able to switch unconsciously from one variety to another depending on the context and on the regional identity projected by their interlocutors.
Compared to previous studies on the effects of regional cues on perception, the sociolinguistic situation of the listeners in the present study is closely parallel to that of New Zealand listeners in studies such as Hay and Drager (2010). Corsican French listeners are not only familiar with Continental French but are in a context where that variety in fact dominates a number of important sources of linguistic input. Unlike the Australian listeners in Walker et al. (2019) then, the listeners in the present study are unlikely to fall below any minimum threshold for exposure that would prevent them from being behaviorally sensitive to regional cues or linguistic differences. Conversely, while Continental French listeners may be familiar with Corsican French, they generally have limited exposure to it and are therefore predicted to be less likely to show an effect of regional cue, though we do not test that specific hypothesis in the present study.
A large number of studies have documented that the perceptual system rapidly adapts to socio-contextual information. As already discussed, however, most existing studies concern the relationship between gradient phonetic detail and phonemes. By comparison, social effects related either to suprasegmental representations or to categorical alignments between layers of abstract representation remain underexplored. Existing studies differ widely in terms of the nature of the social cues that are employed. For most studies, these cues are directly relatable to the social identity of the speaker (e.g., Johnson et al., 1999; Warren, 2017), while only a few studies (Hay & Drager, 2010; Hay et al., 2010; German, 2017) have explored the role of social cues that are completely unrelated to the speaker or the task. Finally, all such studies that we are aware of, with the exceptions of Chang (2015) and Dufour, Kriegel, Alleesaib, and Nguyen (2014), involve varieties of English.
In this study, we explore whether social cues influence how Corsican French listeners interpret one type of intonational contour. We are specifically interested in cues that are fully implicit in the sense of being external to the speech signal, non-linguistic, separate from the task, and not related to the identity of the speaker. For the primary manipulation, we therefore made use of visual cues in the form of regional newspapers which are present during the task but not related to it (following Hay & Drager, 2010). Unlike most previous studies, we are not concerned with how gradient differences in phonetic detail relate to phonological categories. Instead, we are interested in how socially relevant contextual cues affect the alignment between a single lower level form (an intonation contour) and two categories of illocutionary meaning (either statement or question). There was, therefore, only one type of target utterance in the experimental materials.
To address these issues, Corsican participants were asked to interpret syntactically declarative sentences bearing the Continental French PRF by means of a forced-choice selection of a continuation of a dialogue (following Portes, Beyssade, Michelas, Marandin, & Champagne-Lavau, 2014). These continuations were consistent with an interpretation of the target utterance as either a question or a statement and allowed us to assess listeners’ interpretations without directing attention to the linguistic phenomenon being studied. In a between-subjects manipulation, we varied whether the visual cue (or prime) was evocative of ‘Corsica’ or ‘Continental France.’ Our main hypothesis is that participants in the Corsican prime condition should be more likely to interpret targets as questions as compared to participants in the Continental prime condition.
In order to verify that participants’ responses were related to their interpretation of the target utterance, we also varied the context in a within-subjects manipulation, such that the wording of the context either (i) was neutral with respect to the illocutionary meaning of the target, or (ii) reinforced the interpretation of the target as a question (see the QUESTION context and the NEUTRAL context in examples (1) and (2) below). A secondary hypothesis, therefore, was that participants would be more likely to interpret the targets as questions when the context reinforced that interpretation. Such a result would serve to validate that the response choices in our materials correspond to the relevant differences in the interpretation of the contour.
Since the experiment took place in Corsica, we assume that the Corsican French variety was highly activated for our participants. However, the target utterances were all produced by a speaker of Continental French, and as we outline in Section 1.3.3, the segmental and word-level prosodic differences between these two varieties are substantial and salient. Listeners could therefore readily determine that the speaker was from the continent, or that if he was from Corsica, he was actively using Continental French in the context of the experiment. Moreover, the experimenters were known by participants to be visiting from Continental France and were themselves using Continental French, thereby contributing an additional bias towards the assumption that the speaker was using Continental French. Overall then, there is good reason to believe that, to the extent that the participants were able to adapt to Continental French, they were strongly biased towards doing so when listening to the target utterances. Any effect of the experimental manipulation was therefore over and above what the participants believed about the variety of French being used.
Forty-eight individuals participated in the study as unpaid volunteers. Participants consisted of undergraduate students, graduate students, administrators, and instructors at the Université de Corse Pascal Paoli in Corte, France. All participants were native speakers of French and long-time residents of Corsica. However, two participants were excluded from the study because French was not their dominant language, and three were excluded because they had moved to Corsica after early childhood (age 5). The remaining 43 participants (15 males, 28 females) had a mean age of 27.2 years.
The experimental targets consisted of 32 spoken utterances involving the L H+!H* L% contour (i.e., the Continental French version of the PRF) described in Section 1.3. All target sentences had declarative syntax. In French, if the utterance-final word has fewer than three syllables, then the penultimate rise-fall is potentially interpretable as a word-initial or emphatic accent. In the materials, therefore, the last word in each target sentence consisted of either three or four syllables. The utterances were produced by a male speaker of Continental French who is a native of the Île-de-France region. To elicit these utterances, one of the authors produced each sentence with the appropriate contour, and the speaker was asked to imitate it with the same intonation pattern. The recorded productions were later verified for the intended contour through auditory and visual inspection.
In a within-subjects manipulation, each of the 32 target utterances was paired with two different types of short textual contexts which preceded it in presentation. These were designed either to introduce a bias towards a question interpretation of the target utterance, or to be neutral in that regard. This manipulation, and the inclusion of the question-biased contexts specifically, allowed us to establish within the context of the experiment whether participants’ responses were reflective of their interpretation of the target utterances as either questions or statements. Each context describes a daily scenario in the life of a single character, ‘Antoine,’ and sets the stage for the target utterance, of which Antoine is understood to be the speaker. The presence or absence of a question-bias was achieved by varying the verb of reported speech (either demander à ‘to ask’ or s’adresser à ‘to speak to’) appearing in the last sentence of the context, which semantically constrains the type of speech act that is possible for the target utterance that follows. Crucially, these contexts were designed to be incompatible with the negatively biased question interpretation for L H+!H* L% reported by Michelas et al. (2016). Example contexts are given in (1) and (2) for the target sentence in (3).
|(1) QUESTION context:||Antoine cuisine avec sa mère. Il lui demande…|
|Antoine is cooking with his mother. He asks her…|
|(2) NEUTRAL context:||Antoine cuisine avec sa mère. Il s’adresse à elle…|
|Antoine is cooking with his mother. He speaks to her…|
|(3) Target:||Il faut mélanger|
|Should I stir? (question) or You have to stir (statement)|
Each of the 32 target utterances was associated with two response options, which represented possible continuations of the dialogue between Antoine and his interlocutor. One of these was consistent with a statement interpretation of the target, and one with a question interpretation. The response pair for the target sentence in (3) is given in (4).
|Je crois que oui.||I think so.|
Sixty filler items were included in the study as distractors. Like the experimental items, these involved declarative syntax. However, half of the fillers were produced with a final rising contour, which in this type of context is consistent only with a polar question interpretation, while the other half were produced with a fall on the utterance-final accentual phrase, which is generally consistent with a statement interpretation regardless of variety. Each filler utterance was paired with only one type of context. These contexts included a reported speech verb that corresponded to the interpretation of the associated intonation pattern. Given that the target utterances are expected to be consistent with both question and statement interpretations (though one or the other may be preferred), it was important that filler items not be saliently different from target items by including responses that were strictly either correct or incorrect. For this reason, both response options for the filler items were consistent with whichever interpretation was suggested by the intonation and context. Examples of question and statement fillers are given in (5) and (6), respectively.
|(5) Context:||Alice et Antoine attendent le bus. Antoine s’interroge…|
|Alice and Antoine are waiting for the bus. Antoine asks…|
|Target:||Il y a du retard?|
|There’s a delay?|
|Response 1:||Oui, c’est possible.|
|Yes, it’s possible.|
|Response 2:||C’est ce qu’on m’a dit.|
|That’s what I was told.|
|(6) Context:||Antoine et ses amis jouent au volley sur la plage. Antoine déclare…|
|Antoine and his friends are playing volleyball at the beach. Antoine declares…|
|Target:||Il va pleuvoir.|
|It’s going to rain.|
|Response 1:||Tu te trompes.|
|You are mistaken.|
|Response 2:||Je ne crois pas.|
|I don’t believe it.|
Two visual primes were used in a between-subjects manipulation. The CORSICAN prime consisted of a paper copy of the daily newspaper Corse-Matin, while the CONTINENTAL prime consisted of a copy of the daily newspaper Le Parisien. As Figure 3 shows, the two newspapers are similar in size and layout. Each newspaper was chosen for the content of its cover material, specifically such that it included a maximum of headlines and images reflecting events and issues specific to each region.
Four lists were created based on the 32 target utterances, the two context types, and the 60 filler items. For this study, we were primarily interested in the influence of PRIME on the interpretation of the target utterances. Since the NEUTRAL context items introduce no particular bias, we expected these to be more sensitive to this influence (if present), while we hypothesized that the bias introduced by the QUESTION contexts could lead to ceiling effects that might obscure any effects of PRIME. Nevertheless, we were also interested in verifying that the response options reflected real differences in interpretation in the expected way. For these reasons, three quarters of the target utterances (24) in each list were paired with NEUTRAL contexts, and one quarter (8) was paired with QUESTION contexts. The four lists were counterbalanced in this respect, such that each list differed in terms of which target utterances were paired with QUESTION versus NEUTRAL contexts. Specifically, each target utterance occurred in a NEUTRAL context in three out of the four lists, and in a QUESTION context in one out of the four lists. Table 1 illustrates the breakdown of item types for the four lists. The items were also grouped into eight blocks, each of which included one QUESTION context target item, three NEUTRAL context target items, and either seven or eight filler items. Four blocks contained four QUESTION and four STATEMENT fillers, two blocks contained three QUESTION and four STATEMENT fillers, and two blocks contained four QUESTION and three STATEMENT fillers. Additionally, each list occurred in two versions based on the left-right orientation with which the response options were presented.
|Utterance type||Context type|
|32 Target utterances (PRF)||24 NEUTRAL, 8 QUESTION|
|30 Fillers (rising)||30 QUESTION|
|30 Fillers (falling)||30 STATEMENT|
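The counterbalancing of context types across the four lists can be sketched as follows. This is a minimal illustration, not the authors' actual list construction: the rotation scheme (assigning QUESTION contexts by target index modulo 4) and all names are assumptions. It only reproduces the stated constraints, namely 8 QUESTION and 24 NEUTRAL targets per list, with each target in a QUESTION context in exactly one list.

```python
from collections import Counter

N_TARGETS = 32
N_LISTS = 4


def build_lists(n_targets=N_TARGETS, n_lists=N_LISTS):
    """Return one {target_id: context} dict per list.

    Assumed scheme: in list k, targets whose index is congruent to k
    (modulo n_lists) get the QUESTION context; all others get NEUTRAL.
    """
    lists = []
    for k in range(n_lists):
        assignment = {
            t: "QUESTION" if t % n_lists == k else "NEUTRAL"
            for t in range(n_targets)
        }
        lists.append(assignment)
    return lists


lists = build_lists()
for k, assignment in enumerate(lists, start=1):
    # Count how many targets fall in each context type for this list.
    print(k, dict(Counter(assignment.values())))
```

Any other partition of the 32 targets into four groups of eight would satisfy the same constraints; the modulo rule is simply the shortest way to write one.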
The experiment took place in a quiet, empty classroom at the Università di Corsica Pasquale Paoli in Corte, France. Participants were seated at a desk in front of a laptop computer and wore sound-insulating headphones. Before each participant entered the room, one of the two primes (CORSICAN or CONTINENTAL) was placed on the desk next to the laptop. Once the participant was seated, the experimenter drew attention to the prime by asking the participant if it belonged to them. Since the participants all responded, “No,” the experimenter subsequently expressed that the newspaper must have belonged to one of the previous participants. The participants were then directed to read the instructions on the screen (see Appendix A), and the experimenter stepped out of the room for several minutes. This step was taken to give ample time for the participants to observe the prime before beginning the task. Approximately half of the participants were assigned to the CORSICAN prime condition (23) and half to the CONTINENTAL prime condition (20).
After reading the instructions, participants completed a set of ten training items resembling the filler items. Text and audio presentation of all items was self-paced and controlled using the E-Prime software (Psychology Software Tools, Pittsburgh, PA). At the beginning of each item, the text of the context appeared on the screen. After reading the context, the participant pressed the spacebar to begin audio playback of the target utterance. After an obligatory second playback, the two responses appeared on the screen below the context. The response options were color-coded: A small, colored square next to each response option matched one of two colored keys on the keyboard. Participants were instructed to choose the response that was the most appropriate reaction to the speaker (Antoine) by his interlocutor (see Appendix A). No time limit on responses was enforced. Block order and the order of items within blocks were randomized automatically by E-Prime.
A total of 1,376 responses for experimental trials was collected. Response times (RTs), as measured from the point at which response options appeared, ranged from 90 ms to over 41 seconds, with a mean of 3.76 seconds. Since very short or very long RTs likely reflect a lack of engagement by participants, trials for which log RT fell outside of two standard deviations of the mean were excluded (4% of the total), resulting in 1,321 trials used for analysis.
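The exclusion criterion just described can be sketched as below. This is an illustration under stated assumptions: the text does not say whether the standard deviation was computed over all trials or per participant, nor whether a sample or population estimate was used, so the sketch pools all trials and uses the sample standard deviation. The response times are made up and are not data from the study.

```python
import math
import statistics


def trim_by_log_rt(rts_ms, n_sd=2.0):
    """Keep RTs whose log lies within n_sd sample SDs of the mean log RT."""
    logs = [math.log(rt) for rt in rts_ms]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)  # sample SD; an assumption, see lead-in
    lower, upper = mean - n_sd * sd, mean + n_sd * sd
    return [rt for rt, lg in zip(rts_ms, logs) if lower <= lg <= upper]


# Hypothetical RTs in milliseconds: 40 ordinary trials plus two extremes.
example_rts = [90] + [3000 + 50 * i for i in range(40)] + [41000]
kept = trim_by_log_rt(example_rts)
print(len(example_rts), "trials before trimming,", len(kept), "after")
```

Working on the log scale makes the criterion roughly symmetric for a right-skewed RT distribution, which is presumably why the cutoffs are defined on log RT and not on raw RT.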
The outcome variable of interest was the proportion of trials in which participants appeared to interpret target sentences as questions. The overall mean proportion of question-consistent responses was 0.482. Figure 4 (left) shows the by-participant means of question-consistent responses plotted by rank. All but five participants had means between 0.25 and 0.75. No participant chose the same response type for all items. Figure 4 (right) shows the by-item means plotted by rank. All but three items had means between 0.25 and 0.75. Together, these results support the hypothesis that Corsican French listeners have substantial flexibility in their interpretation of the PRF, and further show that this flexibility is not strongly dependent on the lexical content of the sentence.
Figure 5 shows the proportion of question-consistent responses by CONTEXT and PRIME. As expected, there were more question-consistent responses in the QUESTION context than in the NEUTRAL context. Consistent with our hypothesis regarding the influence of the social cues, there were more question-consistent responses for the CORSICAN prime than for the CONTINENTAL prime in both types of context.
To assess these differences statistically, a generalized linear mixed model with a logit link (i.e., a mixed-effects logistic regression) was fit to the data using the glmer function (lme4 package, Bates et al., 2014) in R (R Core Team, 2017), with response type (question-consistent, statement-consistent) as the binary dependent variable. PRIME, CONTEXT, and their interaction were included as contrast-coded fixed factors. The maximal random-effects structure that converged included random intercepts for items and participants and by-item random slopes for CONTEXT. Model comparisons by likelihood ratio tests were then used to assess which factors contributed significantly to the fit of the model.
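For readers who prefer a formula, the model structure described above can be written out as follows. The notation is ours and is only a sketch: p indexes participants, i indexes items, and the exact numerical coding of the two contrast-coded predictors is not given in the text, so it is left unspecified.

```latex
\operatorname{logit} P(y_{pi} = \text{question})
  = \beta_0
  + \beta_1\,\mathrm{PRIME}_{p}
  + (\beta_2 + v_i)\,\mathrm{CONTEXT}_{pi}
  + \beta_3\,\mathrm{PRIME}_{p}\,\mathrm{CONTEXT}_{pi}
  + u_p + w_i
```

Here u_p and w_i are the random intercepts for participants and items, and v_i is the by-item random slope for CONTEXT. PRIME carries only a participant index because it was manipulated between subjects.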
The results of these comparisons reveal a significant main effect of PRIME (β = 0.30, SE β = 0.067, χ²(1) = 5.89, p < 0.05), in that participants were more likely to give a question-consistent response in the presence of the CORSICAN prime as compared to the CONTINENTAL prime. The effect of CONTEXT was also significant (β = 0.49, SE β = 0.069, χ²(1) = 12.92, p < 0.001), in that participants were more likely to give a question-consistent response in QUESTION contexts than in NEUTRAL contexts. The interaction of PRIME and CONTEXT, however, was not significant (χ² = 0.059, p = 0.808).
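The p-values attached to these likelihood ratio statistics can be checked with a few lines of code. The sketch below relies on the identity that a chi-square variable with one degree of freedom is the square of a standard normal, so its upper-tail probability is erfc(sqrt(x / 2)). Treating the interaction test as having one degree of freedom is our assumption, since the text does not state it.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Likelihood ratio statistics as reported in the text.
reported = [("PRIME", 5.89), ("CONTEXT", 12.92), ("PRIME x CONTEXT", 0.059)]
for label, stat in reported:
    print(f"{label}: chi2(1) = {stat}, p = {chi2_sf_df1(stat):.4f}")
```

Under that assumption the interaction statistic reproduces the reported p = 0.808, and the two main effects fall below the reported thresholds of 0.05 and 0.001.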
The main result of the present study is that Corsican French listeners interpreted the Continental French PRF differently depending on the value of the social prime. The direction of this effect was consistent with expectation, in that participants exposed to the CORSICAN prime were more likely to interpret targets as a question as compared to those exposed to the CONTINENTAL prime. The results therefore support the hypothesis that listeners adjust their patterns of alignment between intonational form and meaning in response to socio-indexical cues that are non-linguistic, unrelated to the task, and not attributable to the speaker’s social identity.
The fact that participants chose both interpretations for the PRF contour at similar overall frequencies confirms our initial assumption that Corsican French listeners have access to at least two systems for interpreting this contour. While we cannot be certain that the availability of the two systems is due specifically to contact with Continental French, the fact that the choice of interpretation was sensitive to regional cues associated with different dialects suggests that contact is at least partly explanatory.
The Corsican listeners in Boula de Mareüil et al.’s (2016) study perceived the Corsican French PRF as a question at a rate of 83%. The listeners in the present study, by comparison, perceived the Continental French PRF as a question at a much lower rate (Neutral: 45.3%, Question: 56.7%). This difference can be explained by the fact that the listeners in the present study were able to identify the continental identity of the speaker based on the segmental and word-level prosodic characteristics of his speech, and that they therefore understood his utterances as belonging to the Continental French variety. As suggested in Section 1.4, the identity of the experimenters may have also played a role. Since these biases applied equally to all participants in the study, the effect of the prime manipulation influenced the interpretation patterns of the listeners over and above what they believed about the variety being used by the speaker. In that sense, our finding is highly analogous to those of Hay and Drager (2010) and Hay et al. (2010), in which the contextual prime had no plausible relationship to the identity of the speaker or the linguistic variety being used.
The main finding of the present study corroborates recent findings (Warren, 2017; German, 2017) that the perception and interpretation of intonational contours, and not just phonemic/segmental categories, are sensitive to socio-contextual factors. In Warren (2017), the social cues were provided by the spectral characteristics of a diphthong within the same utterance as the relevant contour. Even if not explicitly so, these cues were directly relatable to the social identity of the speaker. By contrast, in the present study the prime was visual and not related to the speaker or task in any way. The present findings therefore corroborate those of Hay and Drager (2010), which established that the mere activation of social concepts can interact with linguistic representations in a way that leads to subconscious changes in perceptual behavior.
A second difference between the present study and Warren (2017) concerns the nature of the instructions and how they relate to the task. In Warren (2017), participants were explicitly asked to decide whether the target utterance was a question or a statement, and their attention was directed to the role of the utterance-final intonation contour in connection with this difference. In the present study, participants were not asked to consider the illocutionary meaning of the target utterance or to pay attention to its intonational characteristics. Instead they were required only to consider which type of follow-on response was more natural given the context and the global meaning of the target utterance. The results are therefore more clearly interpretable in terms of adaptive processes that are implicit and unconscious. This distinction is not trivial given the view among many variationists that social adaptation is driven by the conscious and deliberate desire to achieve social and communicative goals (Labov, 1972 cited in Drager & Kirtley, 2016; McGowan, 2016). Drager and Kirtley (2016), however, argue that exemplar-based models of representation and processing both allow for and predict such unconscious influences, and the present findings lend further support for that view.
The materials in the present study involved a single pattern from a single variety, and all participants were presented with identical target utterances. Therefore, the findings do not concern changes in how continuously varying phonetic detail is assigned to neighboring phonological categories within the same system (cf. Johnson et al., 1999), or changes in the perceived phonetic quality of tokens from a single category (cf. Hay & Drager, 2010). Instead, they reflect differences in how a single intonational form relates to two categories of illocutionary meaning. In that sense, a more appropriate analog would involve a case of word-level homophony, where the tendency to map a given phonological parse onto one lexical entry versus another is conditioned by socio-indexical cues. However, we are not aware of any study that explores such cases. The finding of implicit social adaptation at this level shows that it is not limited to the processing of phonetic detail, but also extends to categorical alignments between layers of abstract representation, i.e., the phonological representation of the PRF and the two different meanings associated with it. In that sense, the present results echo findings in production (German et al., 2013; Rácz et al., 2017) suggesting that (hybrid) exemplar-based models are necessary for capturing both learning and contextual adaptation across various levels of the grammar.
For English, it is recognized that most dialects differ relatively little in their inventory of intonational categories, with most differences consisting of phonetic differences for a given category or differences in how the categories relate to meaning (Ladd, 2008; Fletcher, Grabe, & Warren, 2004; Grabe, 2004; Grice, German, & Warren, to appear). If this is also the case for French, then it is likely that the Continental French PRF in the present study was consistently parsed as a single phonological category across conditions. In that case, the study reflects adaptation in terms of how a single phonological category is associated to meaning. It is possible, however, that the two dialectal varieties have distinct phonological inventories, such that a given phonetic contour can be parsed as a different phonological representation depending on which dialectal system is being used to process it. In that case, different interpretations of the contour do not reflect differences in meaning association per se, but instead follow deterministically from the phonological parse. Crucially, in each case, the influence of activated social indices enters into the grammar in different ways. Figure 6 illustrates these two possibilities schematically.
Existing descriptions and observation suggest that the PRFs in the two varieties are phonetically very similar. Additionally, Boula de Mareüil et al. (2016) demonstrate that the contours are sufficiently similar that one variety’s (Corsican French) contour may be assimilated to that of the other (Continental French). Nevertheless, there may be subtle differences in how the contour in each variety is distributed phonetically. If that is the case, then based on results like those of Hay and Drager (2010), the social prime may influence the subjectively perceived phonetic quality of the contour, such that a given token is perceived as lying closer to or further from the center of the distribution for one variety than it actually does (i.e., acoustically). In other words, the Continental French PRF may have ‘sounded’ more like a typical Corsican French PRF for participants in the CORSICAN prime condition, and vice versa. This has consequences for the scenario outlined in Figure 6b, since it would mean that the prime potentially influences phonological encoding in two different ways. At a higher level, activation of a social concept would raise the activation of the phonological category of the corresponding variety, thus raising the likelihood that any token will be parsed as that variety’s category. This is the principal effect we assume under the scenario in Figure 6b. At a lower level, however, changes in the perceived phonetic quality of a given token would render it (perceptually) closer to the center of the distribution for that variety’s category. It is possible that this could in turn bias whether the token is encoded as one variety’s category versus another. This effect is expected to be minimal, however, if the degree of overlap in the two phonetic distributions is high. To the extent that there is significant overlap between Continental and Corsican French PRFs, it is likely that the higher level effect would dominate.
The design of the present study does not allow us to distinguish between these different possibilities, and further research is needed to address precisely how activation of the social concept influences the relationship between the PRF and its interpretation. Regardless of which situation more accurately characterizes Corsican French listeners, however, the present results support the relevance of exemplar-based models for explaining adaptation in the relationship between a single form and different possible meanings. In the following section, we set aside the issue of how many phonological categories are involved and consider how an exemplar model can capture the present findings on the assumption that there is a single form-level (i.e., phonological) representation for the PRF, as in Figure 6a above.
As discussed in Section 1.1, hybrid exemplar models can be readily extended to cases of categorical alignments between layers of abstract representations. In models addressing gradient phonetic detail, an exemplar can be represented as a tuple combining (i) a value (or set of values) in the continuous phonetic parameter space, (ii) an index to an abstract linguistic category (e.g., a phoneme or wordform), and (iii) any associated social or contextual labels. In the consonant voicing domain, for example, an exemplar might be represented as: ⟨VOT: 75 ms, phoneme: /t/, gender: ‘female’, etc.⟩. In discrete models, the only difference is the number of possible values for the first member of the tuple, since these are drawn from a closed set of categorical values as opposed to the set of real numbers. Thus, an exemplar in a discrete model is a tuple involving an index to a ‘lower-level’ (i.e., more concrete) category, an index to a ‘higher level’ (i.e., more abstract) category, as well as indices to any relevant social or contextual features (e.g., ⟨L1, H1, ‘female’, etc.⟩).
In gradient phonetic models, an input activates exemplars based on phonetic distance, or similarity. In discrete models, an input either matches the concrete label of an exemplar (e.g., ‘L2’) or it does not—thus similarity is reduced to identity. A given input therefore activates all exemplars whose concrete labels match it (i.e., all exemplars of the form ⟨L2, x, y, …⟩), and these in turn activate abstract labels to which they are indexed. Lexical selection is then a function of the activation of exemplars feeding each abstract category.
In the present study, the input to perception was a single type of intonational contour, which was associated with one of two meanings. Two different socio-indexical cues led to different biases in interpretation. Thus, the exemplars relevant to listeners’ behavior can be represented as triples consisting of an index to the PRF, an index to one of two illocutionary meanings, and an index to one of two socio-indexical labels. The following is the set of all possible combinations: ⟨PRF, QUESTION, CORSICAN⟩, ⟨PRF, QUESTION, CONTINENTAL⟩, ⟨PRF, STATEMENT, CORSICAN⟩, and ⟨PRF, STATEMENT, CONTINENTAL⟩. Since all four exemplar types share the same first element, this will be dropped for the remainder of the analysis.
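The discrete exemplar representation and the identity-based activation step can be sketched as follows, using full triples before the shared PRF index is dropped. This is a toy illustration, not an implementation from the literature: the field names, the extra FINAL_RISE form label, and the exemplar counts are all hypothetical.

```python
from collections import Counter, namedtuple

# Hypothetical field names; the text specifies only the three kinds of index.
Exemplar = namedtuple("Exemplar", ["form", "meaning", "social"])


def activate(input_form, store, activation=1.0):
    """Sum activation per meaning over exemplars whose form label matches."""
    totals = Counter()
    for ex in store:
        if ex.form == input_form:  # in a discrete model, similarity is identity
            totals[ex.meaning] += activation
    return totals


# Toy store with made-up counts, for illustration only.
store = (
    [Exemplar("PRF", "QUESTION", "CORSICAN")] * 3
    + [Exemplar("PRF", "STATEMENT", "CONTINENTAL")] * 2
    + [Exemplar("FINAL_RISE", "QUESTION", "CONTINENTAL")] * 4
)
totals = activate("PRF", store)
print(dict(totals))
```

Exemplars with a different form label contribute nothing, which is the sense in which an input either matches a concrete label or does not.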
In what follows, we describe an exemplar-based model of a single hypothetical Corsican French listener in order to explore how well such a model can capture the behavior of listeners in the present study. To do this, it is important to first consider the linguistic experiences that a hypothetical listener has encountered. We begin by assuming as previously that a typical Corsican French individual has significant experience with Continental French. As the literature review in Section 1.3 suggests, the PRF is only rarely used to express questions in Continental French, and perhaps not ever for the discourse contexts involved in the present study. To be conservative, let us assume that question uses occur in 10% of PRFs produced by Continental speakers, with statement uses comprising the remaining 90% of PRFs. Observationally, Corsican French speakers use the PRF primarily to express questions, though given the high degree of contact with Continental French, it is reasonable to assume that they also use it to express statements with some regularity. For the sake of illustration, let us suppose that in our listener’s experience, 70% of PRFs produced by Corsican French speakers are questions, and 30% are statements.
Since lexical selection in an exemplar model depends on the number of exemplars of each type, we also need to consider how often our hypothetical listener encounters PRFs produced by Continental speakers as compared to Corsican French speakers. This is perhaps the most variable aspect of experience among Corsican French individuals. As such, we will first build a model based on one set of fixed values, and then later explore systematically how different degrees of contact with Continental French are predicted to influence listener behavior. For the first step, let us suppose that 80% of PRFs encountered by our hypothetical listener were produced by other Corsicans, and 20% were produced by Continental French speakers. To obtain the overall relative frequency estimates for all PRFs in our listener’s experience set, it suffices to multiply the regional dialect proportions by the usage (i.e., meanings) proportions. The resulting relative frequencies based on the above assumption are shown in Table 2.
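The multiplication of dialect proportions by usage proportions can be made concrete as below. The proportions are the illustrative values assumed in the text for the hypothetical listener (80% Corsican input, 70% question uses in Corsican French, 10% in Continental French). The variable and function names are ours.

```python
# Assumed proportions from the worked example in the text.
DIALECT_SHARE = {"CORSICAN": 0.80, "CONTINENTAL": 0.20}
USAGE = {
    "CORSICAN": {"QUESTION": 0.70, "STATEMENT": 0.30},
    "CONTINENTAL": {"QUESTION": 0.10, "STATEMENT": 0.90},
}


def joint_frequencies(dialect_share, usage):
    """Relative frequency of each (dialect, meaning) exemplar type."""
    return {
        (dialect, meaning): share * usage[dialect][meaning]
        for dialect, share in dialect_share.items()
        for meaning in usage[dialect]
    }


freqs = joint_frequencies(DIALECT_SHARE, USAGE)
for key, value in freqs.items():
    print(key, round(value, 2))
```

Because each set of usage proportions sums to 1 within a dialect, the four joint frequencies sum to 1 across the listener's whole experience set.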
To assess behavior in perception, we assume that any instance of the PRF as input activates the exemplars in each of the four cases. Since there is only one type of input, there is no need to posit differences in how strongly each exemplar is activated by that input (e.g., based on phonetic distance). Thus, we assume that each exemplar receives the same activation from the input. The selection of a meaning is then a matter of comparing the summed activation for all exemplars indexed to STATEMENT versus those indexed to QUESTION. In the absence of specific social cues, those values would be 0.42 for STATEMENT and 0.58 for QUESTION, indicating a preference for a question interpretation by this hypothetical listener.
Crucially, the listeners in the present study did not behave categorically, in the sense of always choosing a particular interpretation for all targets, and most listeners were somewhat balanced between the two interpretations (i.e., between 0.25 and 0.75). This suggests that meaning selection was probabilistic, and that the effect of the social prime manipulation resulted from differences in the probability distribution over the two meanings. Probabilistic behavior in category selection is well-known in production and perception. While probabilistic behavior is a cornerstone of much variationist research (e.g., Labov, 1970; Sankoff & Rousseau, 1989), the issue remains relatively unexplored in connection with implicit social adaptation. At least some exemplar-based approaches, however, explicitly assume that relative differences in exemplar activation determine the probability distribution over outcomes (Pierrehumbert, 2003; Wedel, 2004, 2006; German et al., 2013), and we make use of this feature to capture the within-listener variation found in the present study. On the assumption that the relation between activation and outcome probabilities is linear, then in the absence of social cues, our hypothetical individual is predicted to interpret any given instance of the PRF as a question with a probability of 0.58.
In most explicit exemplar models, salient social cues influence perception because activation of the social category label raises the resting activation level of exemplars which are indexed to it (Johnson, 2007; Drager, 2005; i.a.). This can be operationalized in the present model by adding to the total activation of each socially ‘primed’ exemplar on top of the activation contributed by the input. So, for example, when the CORSICAN label is activated, the total activation of each CORSICAN-indexed exemplar might be 1.2 as compared to 1.0 for each CONTINENTAL-indexed exemplar. The behavior of our hypothetical listener can then be estimated by multiplying the relative frequencies in the second row of Table 2 by 1.2, and then summing the activation of each column. Since the resulting activation scores do not sum to 1, it is necessary to divide each score by the sum of the scores to obtain probabilities. This gives 0.40 for STATEMENT and 0.60 for QUESTION. In the presence of the CORSICAN prime, in other words, our listener is predicted to interpret the PRF as a question with a probability of 0.60, two percentage points higher than in the absence of a social prime. If the CONTINENTAL prime provides the same activation advantage as the CORSICAN prime, then in the presence of a CONTINENTAL prime, our listener would interpret the PRF as a statement or question with probabilities of 0.44 and 0.56, respectively. Our hypothetical model therefore predicts that different social cues should lead to small but measurable differences in the probability with which a Corsican French listener interprets the PRF as either a statement or a question.
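The priming computation can be reproduced with the short sketch below. It assumes the relative frequencies implied by the worked example (80% Corsican input, question rates of 70% and 10%), a multiplicative activation advantage of 1.2 for primed exemplars, and a linear mapping from normalized activation to probability. The function and variable names are ours.

```python
# Relative exemplar frequencies implied by the worked example in the text.
FREQS = {
    ("CORSICAN", "QUESTION"): 0.56,
    ("CORSICAN", "STATEMENT"): 0.24,
    ("CONTINENTAL", "QUESTION"): 0.02,
    ("CONTINENTAL", "STATEMENT"): 0.18,
}


def meaning_probabilities(freqs, primed=None, boost=1.2):
    """Normalized summed activation per meaning, with an optional social prime."""
    totals = {"QUESTION": 0.0, "STATEMENT": 0.0}
    for (dialect, meaning), freq in freqs.items():
        activation = boost if dialect == primed else 1.0
        totals[meaning] += freq * activation
    norm = sum(totals.values())  # scores no longer sum to 1 once boosted
    return {meaning: total / norm for meaning, total in totals.items()}


for prime in (None, "CORSICAN", "CONTINENTAL"):
    probs = meaning_probabilities(FREQS, primed=prime)
    print(prime, {m: round(p, 2) for m, p in probs.items()})
```

Under these assumptions the question probability comes out at about 0.58 with no prime, 0.60 with the CORSICAN prime, and 0.56 with the CONTINENTAL prime.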
To be sure, the model we present here is little more than an exercise. While certain numerical choices are at least realistic (e.g., the usage rates by dialect group), others, such as the size of the activation advantage provided by the social primes, were chosen rather arbitrarily. It should be noted, however, that the qualitative predictions of our model do not depend on these choices. In fact, as long as the proportion of question uses of the PRF is higher for CORSICAN-indexed exemplars than for CONTINENTAL-indexed ones, then it is a general fact that a CORSICAN (or CONTINENTAL) prime is predicted to increase (or decrease) the probability of a question interpretation. This holds even if CORSICAN speakers generally use the PRF more often for statements than for questions. This property of the model is important, since it allows us to explore how behavior is predicted to differ as a function of factors that are certain to vary across individuals. These include, for example, an individual’s experience with Continental French, as well as the proportion of question versus statement uses by Corsican French speakers in the listener’s experience.
Figure 7 illustrates how the effect size of the prime manipulation (specifically the difference in the proportion of ‘question’ responses in the presence of a CORSICAN versus a CONTINENTAL prime) varies with these two factors. The x-axis represents the proportion of an individual’s total experience that is associated with Continental French speakers: a value of 0 corresponds to an individual who has never encountered Continental French, while 1 corresponds to an individual who has only encountered Continental French. The y-axis represents the proportion of PRF uses by Corsican French speakers that were questions. A value of 1 corresponds to an individual for whom Corsican French speakers in their experience always use the PRF to express questions, while 0 corresponds to an individual for whom Corsican French speakers in their experience always use the PRF to express statements. Darker shading represents individuals who have a stronger bias towards ‘question’ responses in the presence of a CORSICAN prime as compared to a CONTINENTAL prime. Note that the scale ranges from positive to slightly negative values; the very lightly shaded cells indicated by the dotted line correspond to negative values, i.e., a slight bias towards ‘statement’ responses in the presence of the CORSICAN prime as compared to the CONTINENTAL prime. Such values only arise in the (highly unrealistic) case where the PRF is used to express questions less often in Corsican French than in Continental French (our model assumes that the proportion of question uses of the PRF for Continental French is fixed at 0.1, so a value of 0 on the y-axis falls below 0.1).
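A coarse numerical version of this kind of effect-size surface can be generated as follows. The sketch is not the code behind Figure 7: it takes the 10% Continental question rate and the 1.2 activation advantage assumed for the hypothetical listener above, and the grid resolution and all names are our own choices.

```python
def question_probability(cont_share, cors_q, primed, cont_q=0.10, boost=1.2):
    """P(question) given experience proportions and the primed dialect."""
    weights = {"CORSICAN": 1.0 - cont_share, "CONTINENTAL": cont_share}
    q_rate = {"CORSICAN": cors_q, "CONTINENTAL": cont_q}
    question = total = 0.0
    for dialect, weight in weights.items():
        activation = weight * (boost if dialect == primed else 1.0)
        question += activation * q_rate[dialect]
        total += activation
    return question / total


def prime_effect(cont_share, cors_q, **kwargs):
    """Difference in P(question): CORSICAN prime minus CONTINENTAL prime."""
    return (question_probability(cont_share, cors_q, "CORSICAN", **kwargs)
            - question_probability(cont_share, cors_q, "CONTINENTAL", **kwargs))


steps = [i / 4 for i in range(5)]  # 0.0, 0.25, 0.5, 0.75, 1.0
for cors_q in reversed(steps):      # y-axis values, top to bottom
    row = [prime_effect(c, cors_q) for c in steps]  # x-axis, left to right
    print(cors_q, [round(e, 3) for e in row])
```

In this sketch the effect vanishes when experience comes from one variety only, and it is negative only when the Corsican question rate falls below the Continental one.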
This illustration shows that, all else being equal, individuals’ responses are predicted to vary more as a function of the social primes when (i) those individuals have balanced rather than uneven experience with the two varieties, and (ii) the Corsican French speakers in their experience use the PRF more often for questions than for statements. Since the x-axis ranges from no experience with Continental French to exclusive experience with Continental French, the model also makes predictions for individuals from Continental France who have varying degrees of exposure to Corsican French. Specifically, a typical Continental French speaker, whose input is strongly dominated by Continental French and who may be familiar with Corsican French but has had limited exposure to it, is expected to show an effect that is weak and therefore difficult to detect. This closely echoes the findings by Lawrence (2015) and Walker et al. (2019) that speakers of more dominant varieties did not show an effect of regional cue, even though speakers from regions with more balanced input had shown one in earlier studies.
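The two predictions above can be illustrated by evaluating a prime effect over a grid like the one in Figure 7. The sketch below is a simplified stand-in, not the model used to produce the figure. It assumes an activation-weighted choice rule with a hypothetical `boost` factor for exemplars of the primed variety, and it assumes equal PRF usage rates in the two varieties. The value 0.2 is the fixed Continental question proportion mentioned in the text; the function names and the grid resolution are illustrative.

```python
# Illustrative sketch only: prime effect over (Continental share, Corsican question share).
# Assumptions: hypothetical `boost` factor; equal PRF usage rates in both varieties.

Q_CONTINENTAL = 0.2  # from the text: question uses of the PRF in Continental French


def prime_effect(x_continental, q_corsican, boost=2.0):
    """P('question' | CORSICAN prime) minus P('question' | CONTINENTAL prime)."""
    def p_question(boost_corsican, boost_continental):
        w_corsican = (1.0 - x_continental) * boost_corsican
        w_continental = x_continental * boost_continental
        questions = w_corsican * q_corsican + w_continental * Q_CONTINENTAL
        return questions / (w_corsican + w_continental)

    return p_question(boost, 1.0) - p_question(1.0, boost)


def effect_grid(steps=11, boost=2.0):
    """Return the shared axis values and a grid indexed as grid[y_index][x_index]."""
    axis = [i / (steps - 1) for i in range(steps)]
    grid = [[prime_effect(x, y, boost) for x in axis] for y in axis]
    return axis, grid


axis, grid = effect_grid()

# (i) Balanced experience yields a larger effect than Continental-dominated experience.
print("balanced (x=0.5, y=0.8):", prime_effect(0.5, 0.8))
print("Continental-dominated (x=0.9, y=0.8):", prime_effect(0.9, 0.8))

# (ii) For a fixed experience balance, the effect grows with the Corsican question share.
print("y=0.4:", prime_effect(0.5, 0.4), " y=0.9:", prime_effect(0.5, 0.9))
```

In this simplified version the effect is zero at both ends of the x-axis, largest where experience is evenly split, and negative only where the Corsican question proportion falls below 0.2.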
This study showed that implicitly evoking different socio-indexical concepts through a visual cue that was unrelated to the speaker’s identity was sufficient to influence how Corsican French listeners associate illocutionary meaning with a single type of intonational contour. It thus provides evidence for the impact of implicit social factors on categorical alignments between levels of representation, specifically at the interface between intonational form and meaning. A theoretical analysis of these results shows that this type of adaptation can be readily captured by exemplar-based models. These results not only add to the body of sociophonetic evidence demonstrating the need for a dynamic modeling of linguistic representations, they also show that these dynamic approaches should be relevant for the treatment of polysemy and homonymy, since the penultimate rise-fall can be considered as semantically ambiguous for Corsican French listeners.
The additional files for this article can be found as follows:
Appendix A. Task instructions. This file contains the instructions both in the original French and the English translation. DOI: https://doi.org/10.5334/labphon.162.s1
Appendix B. Experimental material. This file contains the text of the materials along with audio file names. ‘Test items’ includes all 32 target utterances and provides an example of how these were assigned to contexts (24 NEUTRAL, 8 QUESTION) for one of the four lists. ‘Filler items’ shows the 60 filler items that were included in all four lists. DOI: https://doi.org/10.5334/labphon.162.s2
Appendix C. Sound files of the auditory stimuli. DOI: https://doi.org/10.5334/labphon.162.s3
This study was made possible through support by the A*MIDEX project (nº ANR-11-IDEX-0001-02) funded by the Investissements d’Avenir French Government program, managed by the French National Research Agency (ANR) and by the Institute for Language Communication and the Brain (ANR-16-CONV-0002). We are grateful to Susanne Gahl, Mirjam Ernestus, and Kip Wilson for their help in improving this paper, and to two anonymous reviewers for their careful and helpful comments. Thanks to Amandine Michelas and Oriana Reed-Collins for their help in carrying out the experiment, and to Leonardo Lancia for his advice on statistical design. A special thanks to the participants at the Université de Corse for their willingness to participate, and to Jean-Michel Géa and Stella Medori for their hospitality and logistical assistance in Corte.
The authors have no competing interests to declare.
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2014). Fitting linear mixed-effects models using lme4. arXiv preprint arXiv:1406.5823. DOI: https://doi.org/10.18637/jss.v067.i01
Bod, R. (2006). Exemplar-based syntax: How to get productivity from examples. The linguistic review, 23(3), 291–320. DOI: https://doi.org/10.1515/TLR.2006.012
Bod, R. (2009). From exemplar to grammar: A probabilistic analogy-based model of language learning. Cognitive Science, 33(5), 752–793. DOI: https://doi.org/10.1111/j.1551-6709.2009.01031.x
Boula de Mareüil, P. B., Rilliard, A., Mairano, P., & Lai, J. P. (2012a). Questions corses: Peut-on mettre en évidence un transfert prosodique du corse vers le français? (Corsican questions: is there a prosodic transfer from Corsican to French?) [in French]. Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 1 (pp. 609–616).
Boula de Mareüil, P. B., Rilliard, A., Mairano, P., & Lai, J. P. (2012b). Corsican French questions: Is there a prosodic transfer from Corsican to French and how to highlight it? Proceedings of Speech Prosody 2012 (pp. 418–421). https://www.isca-speech.org/archive/sp2012/papers/sp12_418.pdf.
Boula de Mareüil, P., Rilliard, A., & Maynard, H. (2016). Questions à intonation descendante: Un exemple d’isolat corse? In S. Retali-Medori (Ed.), Lingue delle isole, isole linguistiche (pp. 3–19). Alessandria: Edizioni dell’Orso.
Chang, Y. S. (2015). Use of social information in the perception of Mandarin alveolar-retroflex contrast. Proceedings of the 18th International Congress of Phonetic Sciences. Glasgow. https://www.internationalphoneticassociation.org/icphsproceedings/ICPhS2015/Papers/ICPHS0589.pdf.
Delais-Roussarie, E., Post, B., Avanzi, M., Buthke, C., Di Cristo, A., Feldhausen, I., & Sichel-Bazin, R. (2015). Intonational phonology of French: Developing a ToBI system for French. In S. Frota & P. Prieto (Eds.), Intonation in Romance (pp. 63–100). Oxford University Press. DOI: https://doi.org/10.1093/acprof:oso/9780199685332.003.0003
Drager, K. (2011). Speaker age and vowel perception. Language and Speech, 54(1), 99–121. DOI: https://doi.org/10.1177/0023830910388017
Drager, K., & Kirtley, J. (2016). Awareness, salience, and stereotypes in exemplar-based models of speech production and perception. In A. Babel (Ed.), Awareness and control in sociolinguistic research (pp. 1–24). Cambridge: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9781139680448.003
Dufour, S., Kriegel, S., Alleesaib, M., & Nguyen, N. (2014). The perception of the French /s/-/ʃ/ contrast in early Creole-French bilinguals. Frontiers in Psychology, 5, 1200. DOI: https://doi.org/10.3389/fpsyg.2014.01200
Fletcher, J., Grabe, E., & Warren, P. (2004). Intonational variation in four dialects of English: The high rising tune. In S.-A. Jun (Ed.), Prosodic typology. The Phonology of Intonation and Phrasing (pp. 390–409). Oxford: OUP. DOI: https://doi.org/10.1093/acprof:oso/9780199249633.003.0014
German, J. S., Carlson, K., & Pierrehumbert, J. B. (2013). Reassignment of consonant allophones in rapid dialect acquisition. Journal of Phonetics, 41(3–4), 228–248. DOI: https://doi.org/10.1016/j.wocn.2013.03.001
Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological review, 105(2), 251. DOI: https://doi.org/10.1037/0033-295X.105.2.251
Grabe, E. (2004). Intonational variation in urban dialects of English spoken in the British Isles. In P. Gilles & J. Peters (Eds.), Regional Variation in Intonation (pp. 9–31). Linguistische Arbeiten, Tübingen, Niemeyer.
Grice, M., German, J. S., & Warren, P. (to appear). Intonation systems across varieties of English. In C. Gussenhoven & A. Chen (Eds.), The Oxford Handbook of Language Prosody. Oxford University Press.
Hay, J., & Drager, K. (2010). Stuffed toys and speech perception. Linguistics, 48(4), 865–892. DOI: https://doi.org/10.1111/j.1749-818X.2010.00210.x
Hay, J., Drager, K., & Warren, P. (2010). Short-term exposure to one dialect affects processing of another. Language and speech, 53(4), 447–471. DOI: https://doi.org/10.1177/0023830910372489
Hay, J., Nolan, A., & Drager, K. (2006a). From fush to feesh: Exemplar priming in speech perception. The Linguistic Review, 23(3), 351–379. DOI: https://doi.org/10.1515/TLR.2006.014
Hay, J., Warren, P., & Drager, K. (2006b). Factors influencing speech perception in the context of a merger-in-progress. Journal of Phonetics, 34(4), 458–484. DOI: https://doi.org/10.1016/j.wocn.2005.10.001
Johnson, K. (2006). Resonance in an exemplar-based lexicon: The emergence of social identity and phonology. Journal of Phonetics, 34, 485–499. DOI: https://doi.org/10.1016/j.wocn.2005.08.004
Johnson, K., Strand, E. A., & D’Imperio, M. (1999). Auditory–visual integration of talker gender in vowel perception. Journal of phonetics, 27(4), 359–384. DOI: https://doi.org/10.1006/jpho.1999.0100
Jun, S.-A., & Fougeron, C. (2000). A phonological model of French Intonation. In A. Botinis (Ed.), Intonation. Analysis, Modelling and Technology (pp. 209–242). Dordrecht: Kluwer Academic Publishers. DOI: https://doi.org/10.1007/978-94-011-4317-2_10
Jun, S.-A., & Fougeron, C. (2002). Realizations of accentual phrase in French intonation. Probus, 14(1), 147–172. DOI: https://doi.org/10.1515/prbs.2002.002
Labov, W. (1970). The logic of nonstandard English. In Language and poverty (pp. 153–189). Academic Press. DOI: https://doi.org/10.1016/B978-0-12-754850-0.50014-3
Ladd, D. R. (2008). Intonational phonology. Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511808814
Lawrence, D. (2015). Limited evidence for social priming in the perception of the bath and strut vowels. Proceedings of the 18th International Congress of Phonetic Sciences. Glasgow, UK: The University of Glasgow.
Leach, P. (1988). French intonation: Tone or tune? Journal of the International Phonetic Association, 18(2), 125–139. DOI: https://doi.org/10.1017/S002510030000373X
McGowan, K. B. (2016). Sounding Chinese and listening Chinese: Awareness and knowledge in the laboratory. In A. Babel (Ed.), Awareness and Control in Sociolinguistic Research (pp. 25–61). Cambridge: Cambridge University Press. DOI: https://doi.org/10.1017/CBO9781139680448.004
Mertens, P. (1992). L’accentuation de syllabes contiguës. I.T.L., 95/96, 145–163. DOI: https://doi.org/10.1075/itl.95-96.07mer
Michelas, A., Portes, C., & Champagne-Lavau, M. (2016). When pitch accents encode speaker commitment: Evidence from French intonation. Language and speech, 59(2), 266–293. DOI: https://doi.org/10.1177/0023830915587337
Miller, J. D. (1989). Auditory-perceptual interpretation of the vowel. Journal of the Acoustical Society of America, 85(5), 2114–2134. DOI: https://doi.org/10.1121/1.397862
Niedzielski, N. (1999). The effect of social information on the perception of sociolinguistic variables. Journal of language and social psychology, 18(1), 62–85. DOI: https://doi.org/10.1177/0261927X99018001005
Pierrehumbert, J. (2001). Exemplar dynamics: Word frequency, lenition and contrast. Frequency and the emergence of linguistic structure. Typological studies in language, 45, 137–158. DOI: https://doi.org/10.1075/tsl.45.08pie
Pierrehumbert, J. (2002). Word-specific phonetics. Laboratory phonology, 7. DOI: https://doi.org/10.1515/9783110197105.101
Portes, C. (2004). Prosodie et économie du discours: Spécificité phonétique, écologie discursive et portée pragmatique de l’intonation d’implication [unpublished doctoral thesis]. Aix-Marseille Université.
Portes, C., Beyssade, C., Michelas, A., Marandin, J. M., & Champagne-Lavau, M. (2014). The dialogical dimension of intonational meaning: Evidence from French. Journal of Pragmatics, 74, 15–29. DOI: https://doi.org/10.1016/j.pragma.2014.08.013
R Core Team (2017). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org/.
Rácz, P., Hay, J. B., & Pierrehumbert, J. B. (2017). Social salience discriminates learnability of contextual cues in an artificial language. Frontiers in psychology, 8. DOI: https://doi.org/10.3389/fpsyg.2017.00051
Sanchez, K., Hay, J., & Nilson, E. (2015). Contextual activation of Australia can affect New Zealanders’ vowel productions. Journal of Phonetics, 48, 76–95. DOI: https://doi.org/10.1016/j.wocn.2014.10.004
Sankoff, D., & Rousseau, P. (1989). Statistical evidence for rule ordering. Language Variation and Change, 1(1), 1–18. DOI: https://doi.org/10.1017/S0954394500000090
Strand, E. A., & Johnson, K. (1996). Gradient and visual speaker normalization in the perception of fricatives. In KONVENS (pp. 14–26). DOI: https://doi.org/10.1515/9783110821895-003
Walker, M., Szakay, A., & Cox, F. (2019). Can kiwis and koalas as cultural primes induce perceptual bias in Australian English speaking listeners? Laboratory Phonology: Journal of the Association for Laboratory Phonology, 10(1). DOI: https://doi.org/10.5334/labphon.90
Walsh, M., Möbius, B., Wade, T., & Schütze, H. (2010). Multilevel exemplar theory. Cognitive science, 34(4), 537–582. DOI: https://doi.org/10.1111/j.1551-6709.2010.01099.x
Warren, P. (2017). The interpretation of prosodic variability in the context of accompanying sociophonetic cues. Laboratory Phonology: Journal of the Association for Laboratory Phonology, 8(1). DOI: https://doi.org/10.5334/labphon.92
Wedel, A. B. (2006). Exemplar models, evolution and language change. The Linguistic Review, 23(3), 247–274. DOI: https://doi.org/10.1515/TLR.2006.010