Segment frequency of consonants in Dutch

The following lists of segmental frequencies were extracted from the phonetically transcribed part of the Dutch Celex database (Baayen et al. 1995). The syllable boundaries provided in Celex were used. All syllables were classified as monosyllables (originating from monosyllabic words), stressed polysyllables or unstressed polysyllables (i.e. the stressed or unstressed syllables of polysyllabic words). Subsequently, each syllable was parsed into a positional syllable template differentiating onset, nucleus and coda positions. The numbers in the following tables are based on the number of entities per syllable position.
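The classification and parsing steps described above can be illustrated with a minimal Python sketch. It is not the procedure actually used to produce the counts: the vowel inventory, the input format (one list of segment strings per syllable) and the function names `is_vowel`, `parse_syllable` and `classify_syllable` are all assumptions made for illustration.

```python
# Hypothetical vowel symbols for illustration; not the Celex transcription set.
VOWELS = set("aeiouyøəɑɛɪɔʏ")


def is_vowel(segment):
    # A segment such as "eː" counts as a vowel if its first symbol is a vowel.
    return segment[0] in VOWELS


def parse_syllable(segments):
    """Split one syllable (a list of segments) into (onset, nucleus, coda)."""
    vowel_positions = [i for i, seg in enumerate(segments) if is_vowel(seg)]
    if not vowel_positions:
        raise ValueError("syllable without a vowel: %r" % (segments,))
    first, last = vowel_positions[0], vowel_positions[-1]
    # Everything before the first vowel is the onset, everything after the
    # last vowel is the coda, and the vowel span in between is the nucleus.
    return segments[:first], segments[first:last + 1], segments[last + 1:]


def classify_syllable(syllable_count, is_stressed):
    """Assign a syllable to one of the three classes used in the text."""
    if syllable_count == 1:
        return "monosyllable"
    return "stressed polysyllable" if is_stressed else "unstressed polysyllable"
```

Under these assumptions, `parse_syllable(["s", "t", "r", "ɑ", "n", "t"])` yields the onset `["s", "t", "r"]`, the nucleus `["ɑ"]` and the coda `["n", "t"]`.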

Please note that ambisyllabic consonants are not tagged as such in the Celex database. They are consistently classified as onset consonants, which means that B-class vowels in polysyllabic words appear in open syllables in the Celex transcriptions. As a result, the numbers presented for coda consonants in polysyllabic words and in all words combined may be skewed.

Furthermore, for 486 cases (out of 5380), the Celex (word) frequency count is specified as zero, even though these words are present in the Celex database. This frequency count of zero was carried over into the syllable counts.

A searchable xls-file with the raw Celex count data can be found here. Examples are provided for each syllable type. Moreover, the data set can be filtered with respect to word type (monosyllabic or polysyllabic word), stress type (stressed or unstressed syllable), each syllable position and all combinations of these elements. Celex token and type frequencies of the filtered data are given in the top left corner of the xls-file.
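The kind of filtering described above can be approximated in a few lines of Python. This is only a sketch: the record layout (`Row` and its field names) and the sample rows are hypothetical and do not reflect the actual columns or contents of the xls-file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    # Hypothetical record layout; the real xls-file may be organised differently.
    segment: str
    word_type: str    # "mono" or "poly"
    stressed: bool    # stressed or unstressed syllable
    position: str     # "onset", "nucleus" or "coda"
    token_freq: int   # word frequency carried over to the syllable


def filter_rows(rows, word_type=None, stressed=None, position=None):
    """Keep rows matching every criterion that is not None."""
    return [
        r for r in rows
        if (word_type is None or r.word_type == word_type)
        and (stressed is None or r.stressed == stressed)
        and (position is None or r.position == position)
    ]


def totals(rows):
    """Return (type count, token count) for a selection of rows."""
    return len(rows), sum(r.token_freq for r in rows)


# Made-up sample rows, for illustration only.
SAMPLE = [
    Row("t", "mono", True, "coda", 10),
    Row("t", "poly", False, "onset", 4),
    Row("n", "poly", False, "coda", 0),
]
```

Leaving a criterion at `None` means "do not filter on this property", so any combination of word type, stress type and syllable position can be selected.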

Table (1) lists the relative type and token frequencies of each consonantal segment given in the (phonetically transcribed part of the) Celex database irrespective of its position within the syllable or word.

Table 1
Segment Type frequency Segment Token frequency
[s] 12.4% [n] 17.8%
[r] 12.3% [t] 14.5%
[t] 12.1% [d] 9.4%
[l] 9.5% [r] 9.3%
[k] 8.7% [z] 5.6%
[n] 6.9% [l] 5.4%
[p] 6.3% [k] 5.0%
[x] 5.0% [m] 4.9%
[m] 4.7% [v] 4.5%
[f] 3.5% [s] 4.2%
[b] 3.1% [x] 4.2%
[ʋ] 3.0% [h] 3.7%
[d] 2.2% [ʋ] 3.3%
[v] 1.8% [p] 3.2%
[j] 1.8% [b] 1.4%
[h] 1.6% [f] 1.3%
[z] 1.6% [j] 1.0%
[ŋ] 1.6% [ŋ] 0.7%
[ʃ] 1.1% [χ] 0.3%
[χ] 0.4% [ʃ] <0.1%
[g] 0.3% [ʒ] <0.1%
[ʒ] 0.2% [g] <0.1%
[dʒ] 0.1% [dʒ] <0.1%
[c] <0.1% [c] <0.1%
[ɲ] <0.1% [ɲ] <0.1%
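
The difference between the type and token columns of Table (1) can be made concrete with a small sketch. It assumes, as an interpretation rather than a documented fact, that a segment's type count rises by one for every occurrence in a lexical entry, while its token count rises by that entry's word frequency; the function name `relative_frequencies` and the toy lexicon in the usage note are hypothetical.

```python
from collections import Counter


def relative_frequencies(lexicon):
    """Compute relative type and token frequencies (in percent) per segment.

    lexicon: iterable of (segments, word_frequency) pairs, where segments is
    the list of consonants in one lexical entry (assumed input format).
    """
    type_counts, token_counts = Counter(), Counter()
    for segments, word_frequency in lexicon:
        for seg in segments:
            type_counts[seg] += 1                 # one per occurrence in an entry
            token_counts[seg] += word_frequency   # weighted by word frequency

    def normalise(counts):
        total = sum(counts.values())
        return {seg: (100.0 * n / total if total else 0.0)
                for seg, n in counts.items()}

    return normalise(type_counts), normalise(token_counts)
```

With a toy lexicon such as `[(["t", "n"], 3), (["t"], 0), (["n"], 1)]`, the entry with a word frequency of zero still contributes to the type frequency of [t] but adds nothing to its token frequency, which mirrors the treatment of zero-frequency words noted earlier.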

Consonants in onsets

The following tables list the relative type and token frequencies of each consonantal segment in onset position given in the (phonetically transcribed part of the) Celex database. The relative frequencies are additionally split into onsets of monosyllabic words and onsets in stressed and unstressed syllables of polysyllabic words.

Figure 1

Figure 2

Consonants in codas

The following tables list the relative type and token frequencies of each consonantal segment in coda position given in the (phonetically transcribed part of the) Celex database. The relative frequencies are additionally split into codas of monosyllabic words and codas in stressed and unstressed syllables of polysyllabic words.

Figure 3

Figure 4

Segmental frequency data are also available for all Dutch segments combined, as well as for vowels only. Furthermore, frequency data for even more fine-grained positions within onsets and codas are given.

References
  • Baayen, R. Harald, Piepenbrock, Richard & Gulikers, L. 1995. The CELEX Lexical Database (CD-ROM), Release 2, Dutch Version 3.1.