CNIMA: A Universal Evaluation Framework and Automated Approach for Assessing Second Language Dialogues

Abstract

We develop CNIMA (Chinese Non-Native Interactivity Measurement and Automation), a Chinese-as-a-second-language labelled dataset with 10K dialogues. We annotate CNIMA using an evaluation framework -- originally introduced for English-as-a-second-language dialogues -- that assesses micro-level features (e.g. backchannels) and macro-level interactivity labels (e.g. topic management) and test the framework's transferability from English to Chinese. We found the framework robust across languages and revealed universal and language-specific relationships between micro-level and macro-level features. Next, we propose an approach to automate the evaluation and find strong performance, creating a new tool for automated second language assessment. Our system can be adapted to other languages easily as it uses large language models and as such does not require large-scale annotated training data.
