The impact of a rating scale on high-stakes writing assessment

This submission has open access
Abstract Summary
This project explores inter-rater reliability and how it informs the teaching of writing in second language acquisition (SLA) environments. Drawing on qualitative and quantitative data collected in a high-stakes writing assessment context, with an explicit focus on the rating scale currently in use, we distil findings that inform the rating, examination and teaching processes.
Submission ID: AILA400
Submission Type
Abstract:
Numerous studies have investigated inter-rater reliability in the context of high-stakes testing, and equally many have examined the teaching of writing skills in SLA contexts (see, e.g., Alameddine & Mirza, 2016; AlHassan & Wood, 2015; Banerjee, Yan, Chapman & Elliott, 2015; Janssen, Meier & Trace, 2015; Kim, Wanzek & Otaiba, 2017). While it is commonly agreed that the two fields can mutually benefit from one another, studies rarely examine their intersection to understand how they can directly inform one another on key issues of language proficiency and the validation of language tests. Our study investigated high-stakes writing assessment in Cyprus in order to gauge how raters interpret and use an existing rating scale, to identify problems they encounter during the operational rating process, and to make suggestions for improvement. Employing a mixed-methods approach, we collected data from a sample of 18 novice and experienced raters, who were asked to rate the same four essays simultaneously and to comment on the rating process via think-aloud protocols. The raters shared their experiences and concerns and suggested various solutions to mitigate the consequences of the rating tool currently in use. Their suggestions focused on improving the rating process, the examination process overall, and the teaching that takes place prior to the exam. The results showed that raters, and consequently test takers, suffer either directly or indirectly from the impact of the rating scale used for marking writing, on both a professional and a personal level. These results have implications for assessment policy makers, curriculum administrators and examination committees of high-stakes exams, both in the current context and in similar ones.
Pre-recorded video:
Professor, Oslo Metropolitan University
University of Cyprus
