This study investigates the relationship between lexical complexity and proficiency level in the Trinity Lancaster Corpus (TLC). The TLC is a 4.3-million-word corpus of spoken L2 English based on the Graded Examinations in Spoken English (GESE), developed and administered by Trinity College London, a large international examination board. Studies of learner language have shown that vocabulary knowledge is one of the best predictors of language use and overall proficiency (e.g. Nation, 2013). Several measures of vocabulary knowledge have been proposed in the field, and lexical complexity plays a key role among them (e.g. Kyle & Crossley, 2015; Lu, 2012; Tidball & Treffers-Daller, 2007). However, little is known about the different aspects of lexical complexity in spoken L2 production, and there is no general agreement about which of the many existing complexity measures to use. This corpus-based study compares automatic measures of lexical complexity with human ratings of holistic spoken proficiency at different CEFR levels (B1, B2 and C1/C2), and examines the effect of learners' L1 and age. Implications for language assessment and directions for further research will be discussed.
Kyle, K., & Crossley, S. A. (2015). Automatically assessing lexical sophistication: Indices, tools, findings, and application. TESOL Quarterly, 49(4), 757–786.
Lu, X. (2012). The relationship of lexical richness to the quality of ESL learners' oral narratives. The Modern Language Journal, 96(2), 190–208.
Nation, I. S. P. (2013). Learning vocabulary in another language (2nd ed.). Cambridge University Press.
Tidball, F., & Treffers-Daller, J. (2007). Exploring measures of vocabulary richness in semi-spontaneous French speech. A quest for the Holy Grail? In H. Daller, J. Milton & J. Treffers-Daller (Eds.), Modelling and assessing vocabulary knowledge (pp. 133–149). Cambridge University Press.