Perceptual cues of stress

The acoustic properties of stress provide perceptual cues for recognizing stress. There are four important perceptual correlates:

  • duration,
  • intensity,
  • vowel quality and
  • fundamental frequency.
The most important perceptual cue for stress in Dutch is a change in fundamental frequency. The second-most influential cue is temporal organization, specifically the duration ratio between the stressed and the unstressed versions of the vowels (rather than of the consonants). Intensity appears to rank third, but only when the gain or loss of intensity is concentrated in frequency bands above 500 Hz, so that it changes the slope of the spectrum (the flatter the spectrum, the greater the perceived loudness). Overall intensity and vowel quality are the weakest cues.

Duration versus intensity

Figure 1 shows the main results of a perception study for Dutch modelled after Fry (1955). In the experiment, the durations of V1 and V2 in each of five minimal stress pairs (object, subject, digest, compact, import) were varied in seven steps between, and including, the values found in natural tokens with initial and with final stress, averaged over ten speakers. These seven duration steps were combined with seven intensity differences, created by amplifying V1 while attenuating V2, such that the V1–V2 difference varied between +3 and –3 dB. A single Dutch minimal stress pair, the reiterant non-word nana, was presented in the sentence frame wil je [target] ZEGgen 'will you [target] SAY', with the sentence stress on the final verb. These variations were therefore suggestive of word stress only; the range of intensity differences in the Dutch stimuli reflected actual speech production but was much smaller than in Fry's materials, which had sentence stress on the targets. Listeners indicated whether they perceived a noun (initial stress) or a verb (final stress). Figure 1 presents the percentage of perceived initial stress for the duration steps (averaged over words and intensity steps) and for the intensity steps (averaged over words and duration ratios).

(After Sluijter, Van Heuven & Pacilly 1997)

Figure 1
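
The 7 × 7 design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the seven intensity differences are taken to be equally spaced between +3 and –3 dB, each difference is split evenly between amplifying V1 and attenuating V2, and durations are interpolated linearly between the two endpoints. The endpoint durations are hypothetical placeholders, not values from the study.

```python
import math

def linspace(start, stop, steps):
    """Return `steps` equally spaced values from start to stop, inclusive."""
    return [start + (stop - start) * i / (steps - 1) for i in range(steps)]

def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

def gain_to_db(gain):
    """Convert a linear amplitude factor back to dB."""
    return 20 * math.log10(gain)

# Hypothetical endpoint durations in ms (placeholders, NOT data from the study).
V1_INITIAL, V1_FINAL = 200.0, 140.0  # V1 in tokens with initial vs. final stress
V2_INITIAL, V2_FINAL = 200.0, 260.0  # V2 in tokens with initial vs. final stress

def build_grid(n_steps=7, max_diff_db=3.0):
    """Cross every duration step with every V1-V2 intensity difference."""
    v1_durations = linspace(V1_INITIAL, V1_FINAL, n_steps)
    v2_durations = linspace(V2_INITIAL, V2_FINAL, n_steps)
    diffs_db = linspace(max_diff_db, -max_diff_db, n_steps)
    grid = []
    for v1_ms, v2_ms in zip(v1_durations, v2_durations):
        for diff_db in diffs_db:
            grid.append({
                "v1_ms": v1_ms,
                "v2_ms": v2_ms,
                "diff_db": diff_db,
                # Assumption: half of the difference is applied to each vowel.
                "v1_gain": db_to_gain(diff_db / 2),
                "v2_gain": db_to_gain(-diff_db / 2),
            })
    return grid

grid = build_grid()
print(len(grid), "stimuli per target word")
```

A +3 dB difference corresponds to an amplitude ratio of only about 1.41 between the two vowels, which gives a sense of how small the intensity range in these stimuli is.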

Figure 2 is a quasi-3D plot of the percentage of initial stress perceived as a function of the difference in vowel duration (X-axis) and of the difference in intensity (Y-axis). The boundary in the figure separates the white area, with a majority of initial-stress responses, from the dark area, with a majority of final-stress responses.

(After Van Heuven and Sluijter 1996)

Figure 2

In panel A the boundary runs at an angle that is much steeper than 45°, which indicates that the duration parameter outweighs the intensity parameter as a stress cue. It also shows that intensity variations are largely inconsequential: they cannot swing the majority decision from initial to final stress for six out of seven duration steps; only when V1 = 170 ms and V2 = 245 ms does intensity yield a (shallow) cross-over from 43 to 60% initial-stress responses.

Sluijter et al. (1997) also included a set of stimuli in which the same intensity differences were generated on V1 and V2, but with all changes concentrated at frequencies above 500 Hz and none made below 500 Hz, thereby creating a change in spectral slope. Panel B in figure 2 shows that these selective intensity differences (affecting spectral tilt) are as strong a stress cue as the duration differences: the boundary now runs at a 45° angle. In this experiment, the stimuli were presented over headphones with artificial reverberation added. The reverberation, which is realistic of room acoustics, obscures temporal details. When the same materials were presented over headphones without reverberation, the effects of selective intensity were smaller than those of duration but still larger than those of uniform intensity differences.
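
The difference between a uniform and a selective intensity change can be made concrete with a small Python sketch. It is not the filtering procedure used by Sluijter et al. (1997), which is not described here; it simply scales all spectral components above a cutoff with a naive discrete Fourier transform. The test signal (two sinusoids at 250 and 1000 Hz), the sample rate and the function names are assumptions chosen for illustration.

```python
import cmath
import math

def dft_bin(signal, k):
    """Return DFT coefficient k of a real signal (naive summation)."""
    n = len(signal)
    return sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))

def boost_above(signal, sample_rate, cutoff_hz, gain_db):
    """Scale every spectral component above cutoff_hz by gain_db; leave the rest."""
    n = len(signal)
    gain = 10 ** (gain_db / 20)
    spectrum = [dft_bin(signal, k) for k in range(n)]
    for k in range(n):
        # Bins above n/2 hold the negative frequencies; fold them to |f|.
        freq = min(k, n - k) * sample_rate / n
        if freq > cutoff_hz:
            spectrum[k] *= gain
    # Inverse DFT. Positive and negative frequencies were scaled alike,
    # so the result is real apart from rounding noise.
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)) / n).real
        for t in range(n)
    ]

def amplitude_at(signal, sample_rate, freq_hz):
    """Amplitude of a sinusoid that falls exactly on a DFT bin."""
    n = len(signal)
    k = round(freq_hz * n / sample_rate)
    return 2 * abs(dft_bin(signal, k)) / n

SAMPLE_RATE = 8000  # Hz, hypothetical
N = 64              # bin spacing is 125 Hz, so 250 and 1000 Hz fall on bins
tone = [
    math.sin(2 * math.pi * 250 * t / SAMPLE_RATE)
    + math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
    for t in range(N)
]

tilted = boost_above(tone, SAMPLE_RATE, cutoff_hz=500, gain_db=3.0)
low = amplitude_at(tilted, SAMPLE_RATE, 250)
high = amplitude_at(tilted, SAMPLE_RATE, 1000)
print(f"250 Hz: {low:.3f}  1000 Hz: {high:.3f}")
```

In this sketch the component below 500 Hz keeps its amplitude while the component above 500 Hz grows by 3 dB, so the spectrum becomes flatter without a uniform gain change.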

Consonant versus vowel duration

Duration generally outweighs other cues for word stress. What are the effects of the duration of subsyllabic units such as the onset consonant, the vocalic nucleus and the coda consonant? An experiment that addresses this issue was reported in Van Heuven (2014). In reiterant stimuli with B-class vowels (short vowels; /pAfpAf, tAstAs/) and with A-class vowels (long vowels; /pafpaf, tastas/), the durations of onset, nucleus and coda were varied separately in steps of 50, 75, 100, 125 and 150 percent of the original duration. The stimuli were synthesized from diphones that had been excerpted from stressed syllables produced in nonsense words with sentence stress, so that all original segments were equally suggestive of (strong) stress. The results are shown in figure 3.

Figure 3: Percent stress perceived on first syllable as a function of relative duration of onset, vocalic nucleus and coda in either first (left panels) or second (right panels) syllables with B-class vowels (short vowels, upper panels) or A-class vowels (long vowels, lower panels)
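
The manipulation behind figure 3, rescaling one subsyllabic constituent at a time, can be sketched as follows. The five scale steps come from the text; the base durations and all names are hypothetical, and the number of variants produced by this sketch says nothing about the actual number of stimuli in Van Heuven (2014).

```python
SCALE_STEPS = (0.50, 0.75, 1.00, 1.25, 1.50)  # from the text
CONSTITUENTS = ("onset", "nucleus", "coda")

# Hypothetical base durations in ms for one syllable (NOT data from the study).
BASE_SYLLABLE = {"onset": 90.0, "nucleus": 180.0, "coda": 110.0}

def scaled_variants(word, syllable_index):
    """Yield (constituent, scale, variant) with exactly one constituent
    of one syllable rescaled; the input word is left untouched."""
    for constituent in CONSTITUENTS:
        for scale in SCALE_STEPS:
            variant = [dict(syllable) for syllable in word]  # copy each syllable
            variant[syllable_index][constituent] *= scale
            yield constituent, scale, variant

# A reiterant two-syllable non-word: the same syllable twice.
word = [dict(BASE_SYLLABLE), dict(BASE_SYLLABLE)]

variants = [
    (syllable_index, constituent, scale, variant)
    for syllable_index in (0, 1)
    for constituent, scale, variant in scaled_variants(word, syllable_index)
]

for syllable_index, constituent, scale, variant in variants[:5]:
    print(syllable_index, constituent, scale, variant[syllable_index][constituent])
```

Because only one constituent changes per variant, any shift in stress judgments can be attributed to that constituent alone, which is the logic of the comparison between vowel and consonant duration.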

Figure 3 shows that, overall, the effects of changing the duration of the vocalic nucleus are large, but changes in consonant duration, whether in the onset or in the coda, have little or no effect on stress perception. A complete cross-over from stress perceived on the first syllable (S1) to stress perceived on the second syllable (S2) is found for changes in vowel duration, except when the vowel is short (B-class) and in the final syllable of the target non-word (top-right panel). Moreover, the effect of changing the vowel duration is weaker overall when the changes are implemented in S2 than in S1. Changing the duration of a consonant only affects stress perception if the change takes place in an S1 with a short (B-class) vowel (top-left panel), but even then the effect is still somewhat smaller for consonants than for the vowel. In this condition, it does not matter whether the consonant is in the onset or in the coda. So it seems safe to conclude that the older literature was right in assuming that vowel duration by itself, rather than syllable duration or rhyme duration, is the relevant duration cue.

Duration versus vowel quality

A direct comparison of vowel duration and vowel quality was made for Dutch by Van Heuven and De Jonge (2011). They varied the V1/V2 duration ratio and the quality of V1 in the Dutch stress pair canon ~ kanon (see above) in seven steps along each continuum. Targets were presented in postfocal position (no F0 movement on the target) in the carrier ik heb gisteren een canon/kanon gehoord [ɪk hɛp ˈxɪstərə(n) ən ˈkanɔn/kaˈnɔn xəˈhort], literally 'I have yesterday a canon/cannon heard', i.e. 'I heard a canon/cannon yesterday'. The results are shown in figure 4, in quasi-3D format. Convincing cross-overs are obtained for the duration steps. Just one, very incomplete, change from perceived initial stress to final stress is obtained by changing the vowel quality from clear to fully reduced (schwa); this change is obtained only when the duration cue is ambiguous (step 4). Fry's (1965) conclusion for English is confirmed here for Dutch: vowel reduction is a much weaker stress cue than vowel duration.

(From Van Heuven and De Jonge 2011)

Figure 4

Duration versus fundamental frequency

In natural human speech the F0 change has to exceed a certain threshold in order to function as a stress cue, and if it does, it typically imparts sentence stress on the word that carries the F0 change. Since sentence stress outranks word stress, this makes the F0 change the strongest stress cue of all. Fry (1958) was among the first to study the effect of F0 change on stress perception, comparing its strength with that of varying the duration ratio of V1 and V2 in the English noun-verb pair subject. The duration ratio was varied as in Fry (1955). In one experiment, Fry synthesized the syllable sub- on a flat 97 Hz followed by a stepwise F0 rise to -ject of 5, 10, 15, 20, 30, 40, 60 or 90 Hz. This set of eight rises was supplemented with a similar set of eight falls, with the higher level F0 on sub- and the low 97 Hz pitch on -ject. The total set comprised 5 (V1/V2 ratios) × 8 (step sizes) × 2 (directions) = 80 stimuli. The results bear out that a frequency step up generated perceived stress on the second syllable (between 61 and 75% for the various F0 changes, averaged over duration ratios), whilst a step down yielded stress on the first syllable (between 48 and 80%), i.e. the higher-pitched syllable is heard as stressed. The absolute size of the step, however, did not matter: a 5-Hz change was as influential as a 90-Hz change. On average, the effect of changing F0 turned out to be smaller than that of varying the duration ratio.
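
The factorial design of this experiment can be enumerated in a short Python sketch. The base frequency, step sizes and directions are those given above; the duration ratios are represented by step indices only, because their values are not given here. The conversion of each step to semitones is an addition for illustration and is not part of Fry's description.

```python
import math
from itertools import product

BASE_F0_HZ = 97
STEP_SIZES_HZ = (5, 10, 15, 20, 30, 40, 60, 90)
DIRECTIONS = ("rise", "fall")
DURATION_RATIO_STEPS = (1, 2, 3, 4, 5)  # indices only; ratio values not given here

def semitones(f_from_hz, f_to_hz):
    """Size of an F0 interval in semitones (12 per octave)."""
    return 12 * math.log2(f_to_hz / f_from_hz)

def build_stimuli():
    """Cross duration-ratio steps, F0 step sizes and step directions."""
    stimuli = []
    for ratio_step, step_hz, direction in product(
        DURATION_RATIO_STEPS, STEP_SIZES_HZ, DIRECTIONS
    ):
        low, high = BASE_F0_HZ, BASE_F0_HZ + step_hz
        # Rise: sub- low, -ject high. Fall: sub- high, -ject low.
        f0_sub, f0_ject = (low, high) if direction == "rise" else (high, low)
        stimuli.append({
            "ratio_step": ratio_step,
            "direction": direction,
            "f0_sub_hz": f0_sub,
            "f0_ject_hz": f0_ject,
            "step_st": semitones(f0_sub, f0_ject),
        })
    return stimuli

stimuli = build_stimuli()
print(len(stimuli), "stimuli")
rises = sorted(s["step_st"] for s in stimuli if s["direction"] == "rise")
print(f"smallest rise: {rises[0]:.2f} st, largest rise: {rises[-1]:.2f} st")
```

Expressed in semitones, the steps range from under one semitone to almost an octave, which makes the reported insensitivity to step size all the more striking.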

Van Katwijk (1974: 76-88) varied F0 movements in a Dutch reiterant nonsense item /s{s{s{s/ in a rather more realistic fashion. F0 changes were implemented relative to a fixed declination of 5 st/s. Keeping all other parameters constant, F0 rises and falls of 3 st over 100 ms were generated at eleven different time points. The table in figure 5 specifies the alignment of the onset of the F0 movement with respect to the duration of a segment. Here 'V1 00' means that the F0 movement begins at 0% of the duration of the first vowel, i.e. at the vowel onset. Van Katwijk also generated three stimuli with rise-fall contours, and two (one rise, one fall) with 6-st excursion sizes (over 200 ms).

(After Van Katwijk 1974: 81-83)

Figure 5
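
An F0 movement superimposed on a declination line can be expressed in a few lines of Python. The declination rate, excursion size and movement duration follow the description above; the starting frequency of 120 Hz, the onset time and the assumption that F0 stays at its new level relative to the declination line after the movement are illustrative choices, not details taken from Van Katwijk (1974).

```python
def f0_at(t, start_hz=120.0, declination_st_per_s=5.0,
          move_onset_s=0.2, excursion_st=3.0, move_dur_s=0.1):
    """F0 in Hz at time t (in seconds).

    The contour is a linear declination in semitones plus one linear F0
    movement of excursion_st semitones (negative for a fall) that starts
    at move_onset_s and takes move_dur_s to complete.
    """
    baseline_st = -declination_st_per_s * t
    # Fraction of the movement completed at time t, clipped to [0, 1].
    progress = min(max((t - move_onset_s) / move_dur_s, 0.0), 1.0)
    offset_st = baseline_st + excursion_st * progress
    # Convert the semitone offset back to Hz (12 semitones per octave).
    return start_hz * 2 ** (offset_st / 12)

# Sample a 3-st rise and a 3-st fall every 50 ms.
for step in range(9):
    t = step * 0.05
    rise = f0_at(t, excursion_st=3.0)
    fall = f0_at(t, excursion_st=-3.0)
    print(f"t={t:.2f} s  rise={rise:6.1f} Hz  fall={fall:6.1f} Hz")
```

Shifting `move_onset_s` in this sketch corresponds to the alignment manipulation in the experiment, where the same movement was placed at different time points in the item.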

The results show that the location of the F0 movement greatly influences the perception of stress. A simple rise or rise+fall at the beginning of a syllable suffices to attract a clear majority of stress responses to that syllable (indicated by yellow shading in the table in figure 5). Simple falls tend to attract fewer stress judgments than rises do, especially when they are associated with the medial or final syllable. For a simple F0 fall to impart stress on a syllable, it has to be aligned rather late in the syllable or even at the beginning of the next syllable. The complex rise-fall does not attract more stress judgments than a simple rise; long 6-st rises and falls do not attract more stress judgments than 3-st exemplars. There were also stimuli with differences in vowel duration and intensity, but never in combination with F0 or with each other, so that no direct comparison of cue strengths is possible.


Fry (for English) as well as Van Katwijk (for Dutch) insist that F0 change is a stronger stress cue than duration. This claim remains rather unsubstantiated, however, either because the experiment does not allow the conclusion to be drawn, or because the crucial data were not presented. Although Fry (1958) provides at least circumstantial evidence, it is not the case that an F0 change can never be overridden by temporal cues in his materials.

References
  • Fry, D. B. 1955. Duration and intensity as physical correlates of linguistic stress. Journal of the Acoustical Society of America 27, 765-768.
  • Fry, D. B. 1958. Experiments in the perception of stress. Language and Speech 1, 126-152.
  • Fry, D. B. 1965. The dependence of stress judgments on vowel formant structure. In Zwirner, E. & Bethge, W. (eds.), Proceedings of the 6th International Congress of Phonetic Sciences. Basel, 306-311.
  • Heuven, Vincent J. van. 2014. Stress and segment duration in Dutch. In Where the principles fail. A festschrift for Wim Zonneveld on the occasion of his 64th birthday. Utrecht: Utrecht Institute of Linguistics OTS, 217-228.
  • Heuven, Vincent J. van & Jonge, Mirjam de. 2011. Spectral and temporal reduction as stress cues in Dutch. Phonetica 68, 120-132.
  • Heuven, Vincent J. van & Sluijter, Agaath M. C. 1996. Notes on the phonetics of word prosody. In Stress patterns of the world, Part 1: Background (HIL Publications 2). The Hague: Holland Academic Graphics, 233-269.
  • Katwijk, Ab van. 1974. Accentuation in Dutch: An experimental linguistic study. Amsterdam: Van Gorcum.
  • Sluijter, Agaath M. C., Heuven, Vincent J. van & Pacilly, Jos J. A. 1997. Spectral balance as a cue in the perception of linguistic stress. Journal of the Acoustical Society of America 101, 503-513.