Fluency, in terms of speed of speech and (lack of) hesitations such as silent and filled pauses ('uhm's), is part of oral proficiency. Language assessment rubrics therefore include fluency, but measuring fluency manually is highly time-consuming. We introduce revised and new PRAAT scripts to automatically measure aspects of L2 fluency, and assess their accuracy and use for language assessment. We conclude that the current script should not (yet) be used for the purpose of assessing fluency automatically in (high-stakes) oral proficiency assessment. However, the performance of the scripts for measuring aspects of fluency globally and quickly is promising.
Research on fluency in second language (L2) speaking is growing. However, rating and measuring aspects of fluency is highly time-consuming. To support more detailed research into specific aspects of fluency, and (potentially) the automatic assessment of fluency in language testing, this paper investigates to what extent aspects of fluency can be evaluated automatically. We take the script by De Jong and Wempe (2009), written in PRAAT (Boersma & Weenink, 2016), as a starting point. This script already measures some aspects of fluency: silent pauses (frequency and duration) and speed of speaking. However, information on filled pauses is, as yet, missing. The aim of the current paper is to create and evaluate a new PRAAT script that also automatically measures filled pauses (frequency and duration), for both Dutch and English as an L2. To add the measurement of filled pauses to the existing script, previous research on the acoustics of filled pauses is taken into account, such as duration (Hughes et al., 2016), variation of F0 (Verkhodanova & Shapranov, 2016), height of F0 (Clark & Fox Tree, 2001), and formant variability in F1 through F3 (Kaushik et al., 2010). The script is trained and tested on a Dutch (n = 90) and an English (n = 60) corpus, and its outcomes are compared to manual measurements of filled pauses, as well as to judgements of fluency on these speech data. Based on the outcomes of these comparisons, the applicability of the script for research on fluency, as well as for assessment purposes, is discussed. Finally, the script is also tested on a new corpus including Dutch and English (L2) speech. We conclude that, without further investigation, the current script should not (yet) be used for the purpose of assessing fluency automatically in (high-stakes) oral proficiency assessment. However, the performance of the scripts for measuring aspects of fluency globally and quickly is promising, especially given their stability in accuracy on the new corpora.
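To make the measures named above concrete, the following is a minimal Python sketch of how frequency and duration of silent and filled pauses, together with speed of speaking, might be summarized once a recording has been segmented. It is not the PRAAT script described in this paper. The `Interval` type, the label set (`"speech"`, `"silent"`, `"filled"`), the function name `fluency_measures`, and the choice to treat speaking time as total time minus silent pauses are all assumptions made for illustration; the syllable count is taken as given.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Interval:
    start: float  # seconds
    end: float    # seconds
    label: str    # hypothetical label set: "speech", "silent", or "filled"


def _mean(values: List[float]) -> float:
    """Arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def fluency_measures(intervals: List[Interval], n_syllables: int) -> Dict[str, float]:
    """Summarize pause and speed measures from a labeled segmentation.

    Assumption: speaking time is total time minus silent pauses, so
    filled pauses count toward speaking time in this sketch.
    """
    total = sum(iv.end - iv.start for iv in intervals)
    if total <= 0:
        raise ValueError("segmentation has no duration")

    silent = [iv.end - iv.start for iv in intervals if iv.label == "silent"]
    filled = [iv.end - iv.start for iv in intervals if iv.label == "filled"]
    speaking_time = total - sum(silent)

    return {
        # Syllables per second over the whole recording, pauses included.
        "speech_rate": n_syllables / total,
        # Syllables per second over speaking time only.
        "articulation_rate": n_syllables / speaking_time if speaking_time > 0 else 0.0,
        "silent_pause_count": float(len(silent)),
        "silent_pause_mean_dur": _mean(silent),
        "silent_pauses_per_min": len(silent) / total * 60.0,
        "filled_pause_count": float(len(filled)),
        "filled_pause_mean_dur": _mean(filled),
        "filled_pauses_per_min": len(filled) / total * 60.0,
    }


if __name__ == "__main__":
    # Toy segmentation: 5 seconds, one silent pause, one filled pause.
    example = [
        Interval(0.0, 2.0, "speech"),
        Interval(2.0, 3.0, "silent"),
        Interval(3.0, 3.5, "filled"),
        Interval(3.5, 5.0, "speech"),
    ]
    for name, value in fluency_measures(example, n_syllables=10).items():
        print(f"{name}: {value:.2f}")
```

Whether filled pauses should count toward speaking time is a design choice; a stricter definition would subtract them as well, which raises the articulation rate for speakers who use many filled pauses.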