The present study investigates the intonational marking of question words (qwords) in Tashlhiyt Berber. The first part of the study identifies a number of possible prosodic patterns on qwords as employed in conversational contexts. When they occur in a direct interrogative, qwords are marked with a rise in pitch towards a H(igh) target and a subsequent fall. By contrast, when the qword is embedded, no tonal targets occur on it. The second part consists of a detailed investigation of the alignment and scaling of qwords in utterance-initial position and in narrow focus. While the H target is consistently present somewhere on the qword, neither a local F0 maximum nor a high plateau region is characterized by stable alignment with any specific position in the segmental string. Scaling of the starting point (%L) and endpoint (H) of the rise characteristic of the qword exhibited a dependency on alignment: The rise is somewhat truncated if the peak is aligned early in the word. This study’s results shed more light on the intonation system of Tashlhiyt and support earlier findings suggesting that tonal placement in this language is prone to a typologically unusual degree of variability.
In most models of intonation, a distinction is made between tones found at the edges of prosodic constituents, and tones that are linked to specific strong elements within prosodic constituents. For instance, Trager and Smith (
The co-occurrence of a phonological tone with a certain segmental or prosodic landmark can be expressed on the one hand in terms of phonetic alignment and on the other hand in terms of phonological association. Henceforth, alignment is taken to be continuous, referring to the exact position of a tone in relation to a landmark in the segmental string, and association is taken to be discrete, referring to the (phonological) linking of a tone with a phonological constituent.
The last two decades of research investigating phonetic alignment have yielded the insight that certain tonal targets align in a highly systematic way with respect to single or multiple reference points in the segmental string (the latter referred to as ‘segmental anchoring’ by
The assumption that association can be defined phonetically in terms of temporal alignment is made explicit by for example Pierrehumbert and Beckman (
Given that pitch accents are usually understood to be phonologically linked to metrically stressed syllables, this category label by definition does not apply to languages that lack stress. It is therefore particularly interesting to investigate tonal behaviour in such languages. Yet, to the best of our knowledge, tonal and segmental alignment have not been systematically investigated in languages that do not exhibit word-level metrical structure, with the notable exceptions of stress-lacking Ambonese Malay (
It merits further investigation whether Tashlhiyt, as another example of a language lacking lexical stress, exhibits systematic phonetic alignment at all, and if so, what the relevant landmarks in the segmental string would be. This, in turn, raises the question of how tonal alignment in Tashlhiyt should be phonologically represented in the absence of word-level metrical structure.
Tashlhiyt is one of three Berber languages spoken in Morocco and has recently received much interest regarding aspects of prosodic structure and tonal placement, with instrumental studies on metrical structure (
Recent work on the intonation of Tashlhiyt has unveiled a great deal of variability in tonal placement in phrase-final position. Grice et al. (
Whether and how wh-words, or question words (henceforth qwords), are marked prosodically varies crosslinguistically. Ladd (
While the idea of qwords carrying inherent focus used to be problematic for early accounts of qword prosody (see
A final issue relevant to any investigation of question intonation has to do with question length. As Ladd (
In this paper we limit ourselves to a case study on qwords that are in narrow focus in default initial position, as detailed in the next section.
Although previous work on Tashlhiyt has looked at polar questions, the present paper is the first investigation into intonational characteristics of questions with a qword. Most morphologically simple qwords consist of one or two syllables starting with ma, e.g.,
The rest of this paper is structured as follows. Section 2 is exploratory in nature: First, section 2.1 provides the motivation for the pilot experiment reported on in section 2.2, in which we investigate qualitatively whether and in what contexts qwords in Tashlhiyt are marked prosodically. Following this, section 3 discusses the main experiment which concentrates on 11 different qwords (five simple and six complex) in initial position and in narrow focus. It addresses in detail the alignment and scaling characteristics of the F0 contour associated with these particular words. In the discussion in section 4, we take up the question as to how to interpret tonal placement in Tashlhiyt phonologically.
Very little is known about the intonation of questions with qwords in Berber, including Tashlhiyt, and this section serves to lay out the ground for the more detailed experiment in section 3. The aforementioned qword questions produced in semi-spontaneous speech (map tasks and gameplay) by 10 speakers recorded in Agadir in 2013 and 2015 (
It has been proposed that qwords are attributed focal status by virtue of contributing interrogative meaning to the phrase they occur in (
This section investigates precisely this issue in Tashlhiyt. It compares the same question word (
In order to elicit natural-sounding instances of direct questions as well as embedding contexts, we scripted a telephone dialogue between two speakers (see Appendix). The dialogue was presented to single participants a few lines at a time on a laptop screen. Participants were familiarized with the content of the conversation beforehand and were then instructed to read and act out both sides. Intonation patterns discussed here reflect data from eight participants: the seven who also participated in the main experiment (see section 3.1.3 for details), plus one who completed only this pilot.
As mentioned in the introduction, the default position for qwords in Tashlhiyt is phrase-initial. A typical intonation contour with initial qword is given in Figure
Representative F0 contours and waveforms for interrogative qwords (a: initial and c: final) and embedded qwords (b: medial and d: final); target word
In contrast to qwords in interrogatives, non-interrogative, embedded qwords are not marked by the same pitch event, as can be seen for both medial position (1b) and final position (1d).
On the other hand, qwords that are in medial (peninitial) position, preceded by a discourse marker, are characterized by the same rising-falling pitch movement as initial and final qwords. Figure
Two F0 contours and waveforms for questions with peninitial qword, with target word
The previous section has shown that qwords are consistently marked intonationally in interrogatives, but not in embedded contexts. Moreover, there is only a minimal effect of phrasal position on the choice of tune that marks the interrogative qword: Irrespective of the position in which it occurs (initial, peninitial, and final position), the qword is invariably marked by (part of) a rise towards a peak or plateau and a subsequent falling movement, with the peak being reached on the qword.
Based on this inventory we might hypothesize that the localized rise-fall consists of a sequence of turning points described in terms of a sequence of a low, a high, and another low target, or LHL. Taking this as our point of departure, we could explore these turning points as potential phonological targets and investigate how they are phonetically aligned with respect to the segmental level. The resulting phonological representation should reflect what remains an invariable part of the qword contour, irrespective of changes in phrasal position (which was discussed in this section), and irrespective of segmental structure (which will be addressed in the main experiment in the next section).
The working hypothesis we propose, however, does not take the initial L turning point into account as an integral part of the qword contour. Two pieces of evidence support this position. Firstly, in peninitial position, most speakers’ rising movement starts prior to the start of the qword (as exemplified in Figure
A better working hypothesis would thus be that the sequence of a high and a low tonal target, HL, forms the essence of the Tashlhiyt qword tune. Of these two targets, the H target seems to be overall more stably linked to the qword than the subsequent L turning point, as observation of qwords in semi-spontaneous data revealed that the location of a low turning point following the peak showed considerable variability there. Peaks or regions of high pitch, on the other hand, tend to occur on qwords much the same in semi-spontaneous data as in the present elicited data.
As mentioned in section 1.1, the notion of tonal alignment with specific segmental landmarks has guided much recent work on the description of phonological association patterns. Crucially, anchor points are more often than not defined with reference to stressed or accented syllables (or specific segments within or surrounding these), either in absolute or relative terms. As Tashlhiyt lacks specifications for metrical strength that would allow us to attribute a central role to a stressed syllable, including using it as a point of reference, the question arises whether alignment of postlexical tones is stable, and if so with respect to what segmental or prosodic anchors. Of particular relevance to this discussion is the general question as to where syllable boundaries are. Syllabification in Tashlhiyt has been the subject of much work (and debate) in the past few decades (
In addition to the alignment of F0 targets, scaling of these targets is relevant to a characterization of intonation contours. While pitch alignment and pitch scaling are theoretically independent (each can be separately manipulated in production), they often interact in predictable ways at least with reference to perception (cf.
In the experiment presented here we investigate the alignment and scaling of the H that we argued forms an essential part of the Tashlhiyt qword tune, and consider its scaling with respect to the phrase-initial %L. The main questions of interest in this experiment are:
It is important to note that recent work carried out on Tashlhiyt, including the present study, is based on field recordings. Thus, the present experiment, although controlled in terms of content, did not take place in a laboratory setting. Recordings were made in a university room in March 2015, in Agadir, Morocco. As Tashlhiyt is still strictly a spoken language (national reforms in 2011 have slowly started to change the situation), we were reliant on a specific target group, namely students in the Amazigh (Berber) department, who are competent readers of the language. Our participants had differing local origins within the Tashlhiyt-speaking area of southern Morocco (see section 3.1.3 for details), but form a homogeneous group in terms of age and socio-economic background.
The qwords used in this experiment are given in Table
Target qwords and their syllabification.
Simple | Complex | ||
---|---|---|---|
‘what’ | ‘which well’ | ||
‘what’ | ‘which pineapple’ | ||
‘where’ | ‘which sheep’ | ||
‘who’ | ‘what time’ | ||
‘when’ | ‘which ewe’ | ||
‘which shepherd’ |
Participants were first given an explanation in Tashlhiyt as to what the task entailed, and then seated in front of a laptop screen where they reread the instructions (also in Tashlhiyt). They were told to act out the role of a primary school teacher doing a picture-question exercise with their students. In the experiment, they were presented on each subsequent slide with a description of a picture scene, with the relevant picture shown immediately below, and the target question underneath the picture. The instructions were to read the picture description out loud and then produce the question underneath as if they were putting it to their students.
While we have independent reasons for assuming that the qword acts as the default focus of the question (see section 1.3), it was necessary to ensure that speakers consistently treated the qword as constituting the question’s single focus. Thus, lest speakers interpret other elements in the question as (additional) foci, we created stimuli in which the lexical items in the rest of the question were both textually and visually given. In addition, the context of the task (in which subsequent questions had different phrase-initial qwords) resulted in implicit focus on the qword.
An example gloss is given below for the target item
boy | 3 |
one | sheep | in | road | 2 |
students | ||
‘The boy there sees a sheep on the road.’ | ‘You ask your students:’ |
where | in | 3 |
sheep |
‘Where does he see the sheep?’ |
Thus, the following factors ensured that stimuli were treated as constituting a single ‘focus domain’ (cf.
Stimuli were presented in blocks with each target qword occurring once. As the recording session involved a number of other tasks, speakers completed a set of two blocks with the stimuli in each block having a different semi-randomized order, followed by another task, and finally the same set of two blocks again. Each set was preceded by five practice items. To minimize the total duration of the experiment, we did not include fillers. Of the 7 speakers, 2 completed only one set of two blocks, so that their number of repetitions per stimulus is two instead of four, as shown in Table
Speaker details.
Subject | Age | Born/raised | Repetitions |
---|---|---|---|
1f | 21 | *Tata | 4 |
2m | 23 | Essaouira | 2 |
3f | 22 | Ait Baamran | 4 |
4f | 22 | *Ida-ou-Tanane | 4 |
6f | 21 | *Essaouira | 4 |
8f | 26 | Taroudant | 2 |
9f | 24 | Sidi Ifni | 4 |
The original recordings for the task involved 9 speakers. Data from two speakers were excluded, in one case because of reading difficulties and in the other because the speaker was unable to finish the recording. Table
Location of speakers’ place of birth.
We based our scaling and alignment measurements on the contour produced by the standard pitch tracking algorithm in Praat (
We used two measurements for quantifying the properties of the H target: A single absolute F0 maximum, and a measure of a ‘high region.’ It has repeatedly been shown that small and gradual F0 displacement does not lead speakers to perceive pitch differences (
In order to quantify this high region, we used a heuristic measure inspired by earlier work on pitch contour plateaux (plateaux being defined as extended regions characterized by minimal pitch displacement). In specific cases, plateaux have been taken to be a phonological accent category in contrast to a clear single F0 peak (
Schematic representation of peak measure (absolute F0 maximum) and plateau measures (6% lower values in Hertz around maximum) used in the analysis of F0 contour alignment relative to the text.
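The plateau heuristic can be illustrated with a minimal sketch. It assumes that the high region is the contiguous stretch of F0 samples around the absolute maximum whose values lie no more than 6% (in Hertz) below that maximum; the function name `plateau_bounds`, the sampled-contour representation, and the example values are hypothetical and not taken from the analysis scripts used in this study.

```python
from typing import List, Tuple


def plateau_bounds(times: List[float], f0_hz: List[float],
                   drop: float = 0.06) -> Tuple[float, float, float]:
    """Return (onset_time, peak_time, offset_time) of the high region.

    Assumption: the plateau is the contiguous run of samples around the
    absolute F0 maximum that stay at or above (1 - drop) * maximum.
    """
    if not times or len(times) != len(f0_hz):
        raise ValueError("times and f0_hz must be non-empty and equally long")

    # Index of the (first) absolute F0 maximum.
    peak_idx = max(range(len(f0_hz)), key=lambda i: f0_hz[i])
    threshold = f0_hz[peak_idx] * (1.0 - drop)

    # Extend leftwards while the contour stays within the threshold.
    left = peak_idx
    while left > 0 and f0_hz[left - 1] >= threshold:
        left -= 1

    # Extend rightwards in the same way.
    right = peak_idx
    while right < len(f0_hz) - 1 and f0_hz[right + 1] >= threshold:
        right += 1

    return times[left], times[peak_idx], times[right]


# Made-up contour sampled every 10 ms (values in Hz).
times = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06]
f0 = [180.0, 200.0, 215.0, 220.0, 218.0, 210.0, 195.0]
print(plateau_bounds(times, f0))  # (0.02, 0.03, 0.05)
```

With a 220 Hz maximum the threshold is 206.8 Hz, so the plateau in this toy contour spans the samples from 215 Hz to 210 Hz.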
This section presents results of peak alignment in the five simple qwords
Alignment of absolute F0 maximum relative to normalized word duration for the five simple qwords
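As a hedged illustration of what alignment relative to normalized word duration amounts to, the sketch below expresses the time of the F0 maximum as a proportion of the word's duration. The function name and the example times are hypothetical; the normalization itself (peak time minus word onset, divided by word duration) is an assumption about how such a measure is typically computed.

```python
def normalized_peak_position(peak_time: float, word_start: float,
                             word_end: float) -> float:
    """Express the peak time as a proportion of word duration.

    0.0 corresponds to the left word boundary and 1.0 to the right one;
    values outside [0, 1] indicate a peak realized outside the word.
    """
    duration = word_end - word_start
    if duration <= 0:
        raise ValueError("word_end must be later than word_start")
    return (peak_time - word_start) / duration


# Made-up times in seconds: a peak 100 ms into a 400 ms word.
print(round(normalized_peak_position(0.30, 0.20, 0.60), 3))
```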
Figure
In order to investigate peak alignment more precisely, we looked at the F0 maxima relative to individual segments. Figure
Alignment of absolute maximum with respect to segments (separated by dashed lines) and syllables (solid lines) for all five simple qwords; increased colour saturation reflects overlap of points.
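Segment-relative alignment can be sketched along the same lines. The example below assumes that each word comes with a list of labelled segment intervals (for instance from a manual segmentation) and reports the segment in which the maximum falls, together with its relative position inside that segment. The segment labels and boundary times are invented for illustration.

```python
from typing import List, Optional, Tuple

# (label, start_time, end_time) for one segment; times in seconds.
Segment = Tuple[str, float, float]


def locate_peak(peak_time: float,
                segments: List[Segment]) -> Optional[Tuple[str, float]]:
    """Return (segment label, proportion into that segment) for a peak.

    Returns None if the peak falls outside all listed segments.
    """
    for label, start, end in segments:
        if start <= peak_time < end:
            return label, (peak_time - start) / (end - start)
    return None


# Hypothetical four-segment word with made-up boundaries.
word = [("m", 0.00, 0.08), ("a", 0.08, 0.20),
        ("n", 0.20, 0.28), ("i", 0.28, 0.40)]
print(locate_peak(0.14, word))
```

Normalizing within the segment makes peak positions comparable across repetitions that differ in segment duration.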
As expected, the distribution of peaks over a large part of the word translates into maxima that variably occur on different segments. In absolute terms, monosyllabic qwords
Peak distribution overall can be characterized as exhibiting a gradient spread rather than a categorical distribution, with most speakers producing a multitude of alignment patterns. The attested variability can be classified along a number of parameters:
peaks that align with different syllables (e.g., 3f’s peaks in
peaks that align with different segments within the same syllable (e.g., 1f’s and 9f’s peaks in
peaks that align with different syllables
With respect to the specific alignment patterns characterizing individual words, a great resemblance of peak locations is observed in
Compared to
In sum, then, the alignment results presented so far indicate that there is little systematicity both within and across speakers in alignment of F0 maxima in Tashlhiyt qwords. While peaks may occur on most segments in the word, and variably align within these segments, a consistent feature of all peaks is that they occur
To name one example of consistent alignment results, Atterer and Ladd (
Turning our focus away now from single F0 turning points to plateaux, we find that the variability seen for single maxima is seen also in the plateau measures. Figure
Plateau alignment for all individual repetitions of simple qwords by word and by speaker; plateau onset and offset indicated by black dots, location of absolute maximum within plateau indicated by orange dot, segments (duration normalized) separated by dashed lines and word boundaries indicated by solid lines.
A pattern observed across all qwords is one whereby the plateau starts in the second segment (the vowel /a/), and extends across a number of segments within the word, but usually does not cross the right qword boundary. Within this general pattern, plateaux nevertheless exhibit considerable within-word variability in alignment of both onset and offset.
Additionally, a number of idiosyncratic patterns for individual words can be identified.
Considering all qword plateaux together, every single plateau parameter seems to be characterized by gradient variation (with the exception of
In sum, the search for systematic alignment has proven unfruitful for qwords in Tashlhiyt. Neither different levels of phonological structure below the word (segment or syllable) nor different measures characterizing the contour (the single maximum and a more broadly defined plateau) have revealed a high degree of systematicity in tune-text alignment. In many languages, alignment of tonal targets is clearly defined with respect to phonological units below the lexical level, like the syllable or mora. This is not what we found for the H tonal target occurring on Tashlhiyt qwords. However, one aspect of the qword tune that should not be overlooked is that the H target systematically occurs
Much in line with the patterns attested for simple qwords, complex qwords are also marked by a rise-fall, with a peak typically occurring on the interrogative element
Alignment of absolute maximum with respect to segments (separated by dashed lines) of the interrogative element
In addition to the rise-fall early in the complex qword, there may also be a rise at the end of the qword constituent as a whole (followed by a fall on the next word). As an example, contours for the constituent
Contours for all individual repetitions of
Number of tokens produced per speaker and number of these produced with final rise.
1f N | 1f w/ rise | 2m N | 2m w/ rise | 3f N | 3f w/ rise | 4f N | 4f w/ rise | 6f N | 6f w/ rise | 8f N | 8f w/ rise | 9f N | 9f w/ rise |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4 | 4 | 2 | 0 | 4 | 0 | 4 | 2 | 2 | 1 | 2 | 2 | 2 | 1 |
4 | 4 | 2 | 0 | 2 | 0 | 1 | 1 | 3 | 3 | 1 | 1 | 4 | 4 |
4 | 4 | 2 | 0 | 4 | 0 | 4 | 0 | 3 | 3 | 2 | 0 | 3 | 2 |
4 | 4 | 1 | 0 | 3 | 0 | 4 | 1 | 2 | 2 | 2 | 2 | 3 | 2 |
4 | 4 | 2 | 0 | 3 | 0 | 3 | 0 | 2 | 2 | 2 | 1 | 4 | 4 |
3 | 3 | 2 | 0 | 3 | 0 | 3 | 1 | 4 | 3 | 1 | 0 | 3 | 3 |
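The per-speaker tendencies in the table above are easier to see when the counts are summed over the six complex qwords. The short sketch below does this; the counts are copied from the table (one pair of token count and final-rise count per row), while the function name and data layout are merely illustrative.

```python
from typing import Dict, List, Tuple

# (tokens produced, tokens with final rise) per complex qword, per speaker.
counts: Dict[str, List[Tuple[int, int]]] = {
    "1f": [(4, 4), (4, 4), (4, 4), (4, 4), (4, 4), (3, 3)],
    "2m": [(2, 0), (2, 0), (2, 0), (1, 0), (2, 0), (2, 0)],
    "3f": [(4, 0), (2, 0), (4, 0), (3, 0), (3, 0), (3, 0)],
    "4f": [(4, 2), (1, 1), (4, 0), (4, 1), (3, 0), (3, 1)],
    "6f": [(2, 1), (3, 3), (3, 3), (2, 2), (2, 2), (4, 3)],
    "8f": [(2, 2), (1, 1), (2, 0), (2, 2), (2, 1), (1, 0)],
    "9f": [(2, 1), (4, 4), (3, 2), (3, 2), (4, 4), (3, 3)],
}


def rise_summary(data: Dict[str, List[Tuple[int, int]]]
                 ) -> Dict[str, Tuple[int, int, float]]:
    """Sum tokens and final rises per speaker and add the proportion."""
    summary = {}
    for speaker, rows in data.items():
        n_tokens = sum(n for n, _ in rows)
        n_rises = sum(r for _, r in rows)
        summary[speaker] = (n_tokens, n_rises, n_rises / n_tokens)
    return summary


for speaker, (n, rises, prop) in rise_summary(counts).items():
    print(f"{speaker}: {rises}/{n} tokens with final rise ({prop:.0%})")
```

Summed this way, speaker 1f produces the final rise on all 23 tokens, speakers 2m and 3f on none of theirs, and the remaining speakers fall in between.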
Given the variable alignment of the H target on simple and complex qwords discussed in the previous sections, the question arises as to what the effect of variability in the alignment domain is on scaling, and what this tells us about the properties that are the most crucial aspects of the qword tune. As could already be seen in Figure
Returning to pitch scaling of the individual tones that make up the qword tune, Figure
Relatively stable scaling of H target on qword; all simple and complex qwords.
The pattern of decreased H target height with later alignment is most interesting in the case of the highest targets (speaker 3f), given that these also represent the earliest aligned peaks. In order to produce a rise between the initial %L and a H target shortly after, this speaker must produce very steep pitch rises, or undershoot the %L target (which is indeed what we find, see below). While the other speakers exhibit somewhat varying interactions between alignment and scaling of the H target, the data overall rule out an interpretation in terms of undershoot of the H target. If undershoot were the case, we would expect later peaks to be consistently higher. Rather, an analysis that covers the behaviour of all speakers could invoke the requirement that a specific H target level must be reached, where even in questions with early H alignment this required F0 target is reached. Additionally, the fact that some speakers produce higher peaks with
In the main experiment, qwords were phrase-initial, so that the left qword boundary coincided with the left IP boundary. In these cases, the %L is realized on the qword, unlike in the pilot experiment, where the low turning point was usually realized before the qword when it was non-initial (section 2.3). The temporal vicinity of these opposite targets requires a steep rise, and given that speakers do not seem to compromise on the height of the H target, this leaves us to explore the scaling of %L.
Figure
Lower scaling of low target (%L) with increasing peak (H) distance; all simple and complex qwords.
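The relation between the scaling of %L and the distance of the peak can be quantified in several ways. The sketch below is one hedged possibility and not the procedure used in this study: it converts the %L to H rise into semitones (an assumption, since scaling could equally be expressed in Hertz) and computes a Pearson correlation between peak distance and %L height. All data values are invented for illustration.

```python
import math
from typing import Sequence


def rise_excursion_st(l_hz: float, h_hz: float) -> float:
    """Size of the rise from the low to the high target in semitones."""
    if l_hz <= 0 or h_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(h_hz / l_hz)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Made-up tokens: distance of H from phrase onset (ms) and %L height (Hz).
peak_distance_ms = [60.0, 90.0, 120.0, 150.0, 180.0]
low_target_hz = [205.0, 198.0, 196.0, 190.0, 185.0]
print(round(pearson_r(peak_distance_ms, low_target_hz), 2))
print(round(rise_excursion_st(190.0, 240.0), 2))
```

A negative coefficient in such an analysis would correspond to the pattern in which %L is scaled lower the further away the peak is.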
It should be noted that although we did not quantify properties of the low turning point
Example of an utterance with post-peak fall to baseline in
Our point of departure for the investigation of the qword tune in Tashlhiyt was the expectation that there might be no systematic alignment of tonal targets on the qword. We first identified that qwords are realized with a HL narrow focus tune preceded by L at the left phrasal boundary. We then investigated the alignment of the maximum (corresponding to the H tone) on the qword in more detail. The alignment of this maximum was variable within the word, both at the syllable and segmental level, with considerable within- and across-speaker variability. We also calculated a plateau for which we measured onset and offset values in relation to syllables and segments, which showed a similar degree of variation.
While we did not find the degree of systematic alignment that characterizes pitch movements typical of pitch accents in a number of European languages, alignment of the H tonal target was also not completely unsystematic, as it consistently occurred
Phonological representation of qword tunes in Tashlhiyt; simple (left) and complex (right).
While the variable behaviour of H can be thus accounted for, results with respect to the right-edge marking of qwords by a fall, the L target following H, are more difficult to interpret. As indirectly shown by the plateau results, this L in some cases could be aligned much later (a number of syllables to the right). This matches qword intonation patterns found in semi-spontaneous data, confirming the common nature of late-aligned right-edge Ls. At the same time, given that the predominant pattern in the present data saw the L aligned at or near the right edge of the qword, we assume that in the case of narrow focus, this tone also associates to the qword.
It is possible to think of the H and L as being characterized by a strong-weak relation (cf.
Finally, an initial low turning point %L was invoked to explain the initial rise to the H target on the qword. The phrase-initial minimum which we took to represent it was scaled somewhat variably. Moreover, this minimum typically stayed at the left IP edge when another word intervened between the IP edge and the left edge of the qword (section 2). We took both observations as support for an analysis that treats it as a boundary tone associating to the IP, i.e. as %L.
Given the discussion above, the answer to the question in this section’s title would be: Yes, ‘high’ is good enough. There seem to be few if any constraints on the exact realization (in terms of temporal alignment and scaling) of the H tone occurring on the qword. The only reason why the alignment of the H is not completely free (it tends not to be realized at the very edges of the qword) has to do with realizational constraints on the execution of pitch movements in a limited timespan. The H is preceded by an %L and followed by another L, which is usually located near the right edge of the qword. As there are cases where the following L was in fact not realized near the right edge, the requirement for an L target to mark the right edge seems weaker than the requirement for an H target to be realized on the qword.
The variability in alignment of the H tone in this study makes for an interesting comparison with previous work on Tashlhiyt intonation. Grice et al. (
A final point of interest concerns the aforementioned characteristics of the population under investigation (section 1.1). Prior studies as well as the present one suggest that this particular Tashlhiyt population exhibits a high degree of intonational variability no matter how carefully participants are screened for age and socio-economic status (which are controlled for in these studies) and regional background (which is not controlled for, but any differences between speakers did not map consistently onto details of their place of birth). It remains to be tested whether Tashlhiyt speakers in geographically more homogeneous speaker communities will produce intonation patterns that are less variable, keeping in mind that controlled reading experiments are (currently) limited to university students.
If we find that a high pitch region aligns ‘somewhere on the word’, we should ask whether this counts as systematic enough for descriptive purposes. In the case of Tashlhiyt, where the interaction between underlying metrical structure and postlexical intonation is rather unlike that of most other languages, an explanation of this kind might well suffice. Similar accounts to the present one have been proposed for the few other (non-lexical-tonal) languages that are postulated not to have stress. This literature tends to describe the respective prosodic systems in terms of predetermined tonal strings associating sequentially within small phrasal domains like Accentual Phrases or Phonological Phrases (
A first aspect that has received little attention in the present paper, but is pivotal to our understanding of the marking of (qword) focus domains in Tashlhiyt, has to do with right-edge marking of focus. The presence and alignment of the post-H L target near the right edge of qwords, while exhibiting variability in the present data, shows an even greater variability in semi-spontaneous speech (that is, it can be aligned further to the right). Contextual and pragmatic factors undoubtedly play a role here. In the present data, in which the qword is in narrow focus, the L might be placed at the right edge of the qword as a marker of this narrow focus domain. It is likely that other interrogative contexts with qwords require prominence on constituents other than the qword, and one possible way to do this would be to mark the relevant constituent with the fall that would need to be realized somewhere following the compulsory H on the qword.
Another aspect that merits further investigation is the role that perception plays in a categorization of the contours under investigation. Based on informal perceptual checks with our participants, no different interpretations seemed to be linked to differing peak locations. A systematic investigation would prove insightful, however, especially with respect to work by Barnes et al. (
The additional files for this article can be found as follows:
Scripted telephone conversation. DOI:
We are explicitly not including the kind of pitch accent in this definition that is used to describe the
The position that Tashlhiyt lacks stress is in line with both impressionistic observations of linguists (in addition to ourselves, Kossmann p.c.,
This is not to say that there are no other factors that influence tonal association, but for a discussion of such factors, which include e.g., sonority and syllable weight, the reader is referred to Grice et al. (
For in-text use of Tashlhiyt we use phonemic transcription based on Ridouane (
Dell and Elmedlaoui (
We also had a test stimulus with an initial embedded qword:
Despite our methodological efforts, the relatively high number of exclusions is probably still an artefact of speakers not being used to reading Tashlhiyt aloud. The remaining productions, however, were judged to be natural sounding utterances expressing the intended communicative function by two additional Tashlhiyt speakers who did not participate in the experiments.
Given that some utterances were characterized by considerable microprosodic effects at the transition between vowels and nasals, we performed identical alignment analyses on a smoothed version of the F0 contour. For this smoothed contour we manipulated the raw contour (already handcorrected) in four additional steps with the customised Praat script
(i) manual correction or unvoicing of regions with obvious microprosody; (ii) smoothing with a 15 Hz bandwidth applied within Praat; (iii) application of standard Praat interpolation; (iv) additional smoothing with a 15 Hz bandwidth.
The smoothed contour analysis aimed to control for the predictable absence of maxima in certain regions or segments (such as word-medial nasals) characterized by dips in the F0 contours. However, results based on absolute and smoothed contours mostly converged, and we decided to only report results from the ‘raw,’ handcorrected-only F0 contours.
Niebuhr and Hoekstra (
The realization of
An apparent general restriction occurs on F0 maxima at the start of the second syllable (i.e. around /n/ in
Further research will have to show what exactly the location of the fall reflects, specifically whether this seemingly categorical difference between early and late falls is meaningful.
One of the reviewers pointed out that the scaling of H might also be dependent on the
Research that led up to this paper was partly funded by a DAAD doctoral grant to the first author. Fieldwork by the first two authors was funded by ToPIQQ, a project funded by the Volkswagen Foundation. We would like to thank the two anonymous reviewers for their comments, as well as Stefan Baumann and Rachid Ridouane for additional invaluable insights, Marijn van Putten for his help with glossing and András Bárány for all things LaTeX. We are grateful to all the people at the Institut d’études amazighes at the Ibn Zohr University in Agadir for their hospitality in November 2014 and March 2015. Finally, many thanks are due to Abderrahmane Charki and Sanae Oubraim for help with the preparation of the experimental material. Responsibility for any remaining errors lies with the authors.
The authors have no competing interests to declare.