Rationale Behind Essay Scores: Enhancing S-LLM's Multi-Trait Essay Scoring with Rationale Generated by LLMs

Abstract

Existing automated essay scoring (AES) has solely relied on essay text without using explanatory rationales for the scores, thereby forgoing an opportunity to capture the specific aspects evaluated by rubric indicators in a fine-grained manner. This paper introduces Rationale-based Multiple Trait Scoring (RMTS), a novel approach for multi-trait essay scoring that integrates prompt-engineering-based large language models (LLMs) with a fine-tuning-based essay scoring model using a smaller large language model (S-LLM). RMTS uses an LLM-based trait-wise rationale generation system where a separate LLM agent generates trait-specific rationales based on rubric guidelines, which the scoring model uses to accurately predict multi-trait scores. Extensive experiments on benchmark datasets, including ASAP, ASAP++, and Feedback Prize, show that RMTS significantly outperforms state-of-the-art models and vanilla S-LLMs in trait-specific scoring. By assisting quantitative assessment with fine-grained qualitative rationales, RMTS enhances the trait-wise reliability, providing partial explanations about essays. The code is available at https://github.com/BBeeChu/RMTS.git.
