Studies suggest that speakers of different languages employ language-specific heuristics when segmenting continuous speech into units (Cutler et al. 1986). For instance, syllabic segmentation has been observed for French (Cutler et al. 1986) and Spanish (Bradley et al. 1993), whereas English has been reported to elicit a stress-based strategy (Cutler et al. 1986) and Japanese a mora-based one (Otake et al. 1993). In these studies, the rhythmic structure of a language defines the way it is segmented: the basic unit of rhythm in French is the syllable, whereas the rhythm of English is stress-based. This has led to the idea that listeners' segmentation preferences are largely shaped by their L1 (Cutler & Clifton, 1999). The issue is that, although this hypothesis clearly concerns the perception of continuous, natural speech as encountered in everyday life, most of the studies were carried out at the segmental level (sounds, words) only.
In our study, we examined cross-linguistic differences in the perception of naturalistic speech. Specifically, we looked at the size of chunks, operationalised as the number of orthographic words. We chose Swedish, Russian and Finnish not only for geographical convenience but also because of the differences between these languages. Swedish and Russian are Indo-European fusional languages without fixed word stress, whereas Finnish is a Finno-Ugric agglutinative language with trochaic rhythm and fixed initial stress. Russian and Finnish have relatively free word order, whereas Swedish has a relatively fixed one. Swedish is a mildly analytic language, while Russian and Finnish are synthetic. Together, these three languages form a typological continuum that could illuminate potential differences in speech segmentation.
We hypothesised that the size of chunks in Swedish, Russian and Finnish is affected by structural differences between these languages. Specifically, we expected chunks to be shortest in Finnish, which is highly synthetic and agglutinative and thus allows the least grammatical ambiguity, and longest in Swedish, which is a mildly analytic language.
We chose 100 spontaneous speech extracts for each language, each of high sound quality, 20-31 seconds long and with a small number of hesitations. We then invited native speakers of each language (Swedish = 33; Russian = 56; Finnish = 51) to listen to the extracts, follow the transcript on an iPad, mark boundaries by tapping the screen and answer a comprehension question afterwards. This procedure was adopted from Vetchinnikova et al. (2017). The stretches of speech between perceived boundaries are referred to as 'chunks'.
We first calculated boundary frequency (the number of participants who marked each particular boundary) and then used a Monte Carlo permutation test to determine which boundary frequencies were non-random. The resulting chunk sizes are based on the mean number of orthographic words between non-random boundaries. We found that mean chunk size was largest for Swedish (6.81), followed by Russian (6.00), and smallest for Finnish (5.44).
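The analysis steps described above can be illustrated with a minimal sketch. It is not the study's actual implementation: the data layout (one 0/1 row per participant, one entry per between-word slot), the null model (each participant's marks shuffled across slots), the threshold `alpha = 0.05`, the number of permutations and all function names are assumptions made for illustration, and the toy data are invented.

```python
import random

def boundary_frequencies(marks):
    """Count, for each between-word slot, how many participants marked it."""
    return [sum(column) for column in zip(*marks)]

def significant_boundaries(marks, n_perm=1000, alpha=0.05, seed=0):
    """Monte Carlo permutation test on boundary frequencies.

    Assumed null model: every participant places the same number of
    marks, but at random slots. A slot counts as non-random if its
    observed frequency is rarely reached under this null.
    """
    rng = random.Random(seed)
    observed = boundary_frequencies(marks)
    exceed = [0] * len(observed)
    for _ in range(n_perm):
        # Shuffle each participant's row independently.
        shuffled = [rng.sample(row, len(row)) for row in marks]
        permuted = boundary_frequencies(shuffled)
        for i, freq in enumerate(permuted):
            if freq >= observed[i]:
                exceed[i] += 1
    # Add-one correction keeps p-values strictly above zero.
    p_values = [(count + 1) / (n_perm + 1) for count in exceed]
    return [i for i, p in enumerate(p_values) if p < alpha]

def mean_chunk_size(n_words, boundary_slots):
    """Mean number of words per chunk; slot i lies after word i + 1."""
    edges = [0] + [i + 1 for i in sorted(boundary_slots)] + [n_words]
    sizes = [end - start for start, end in zip(edges, edges[1:])]
    return sum(sizes) / len(sizes)

# Invented toy extract: 10 words (9 slots), 8 participants.
marks = [
    [0, 0, 1, 0, 1, 0, 1, 0, 0],  # one extra mark at slot 4
    [0, 0, 1, 0, 0, 1, 0, 0, 0],  # marks slot 5 instead of slot 6
] + [[0, 0, 1, 0, 0, 0, 1, 0, 0] for _ in range(6)]

boundaries = significant_boundaries(marks)
print(boundaries, mean_chunk_size(10, boundaries))
```

In this sketch, slots that only one participant marked are treated as noise, and chunk size is then computed from the remaining boundaries only, which mirrors the two-step logic of the procedure.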
On this basis, we might argue that differences in speech segmentation result from the morphosyntactic features of a language. However, one possible source of uncertainty is that chunks are measured in orthographic words, whose average length varies across these languages: the average word length in characters is 7.3 for Finnish and 5.4 for Swedish (Kamps, Monz, & De Rijke, 2002), and 5.3 for Russian (Sharov, 2011). Moreover, calculating the average word length in a language is not straightforward and can be done in different ways (for instance, on lemmatized or non-lemmatized data, with or without function words and low-frequency words). These facts leave a measure of uncertainty in the findings and call for further research, but it seems that not only the length of words but also their nature in different languages, such as the amount of information packed into them, is likely to affect how speakers of different languages chunk speech.
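To see why the unit of measurement matters, the reported figures can be combined in a rough back-of-the-envelope calculation. This is only an illustration, not a result of the study: it assumes that the corpus-based average word lengths cited above carry over to the spontaneous speech extracts, which may well not hold, and the variable names are hypothetical.

```python
# Mean chunk size in orthographic words, as reported above.
chunk_words = {"Swedish": 6.81, "Russian": 6.00, "Finnish": 5.44}

# Average word length in characters, from the cited corpus studies.
# Assumption: these written-corpus averages apply to the speech extracts.
word_chars = {"Swedish": 5.4, "Russian": 5.3, "Finnish": 7.3}

# Approximate chunk length in characters (spaces not counted).
chunk_chars = {
    lang: round(chunk_words[lang] * word_chars[lang], 1)
    for lang in chunk_words
}

for lang, chars in sorted(chunk_chars.items(), key=lambda item: item[1]):
    print(f"{lang}: about {chars} characters per chunk")
```

Under this assumption the ordering of the languages changes when chunks are expressed in characters rather than words, which is why the choice of unit is a genuine source of uncertainty.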
The project was supported by the Finnish Cultural Foundation.