Experiments with Universal CEFR Classification

Abstract

The Common European Framework of Reference (CEFR) guidelines describe learners' language proficiency on a six-level scale. While the CEFR descriptions are generic across languages, the development of automated proficiency classification systems for different languages follows different approaches. In this paper, we explore universal CEFR classification using domain-specific and domain-agnostic, theory-guided as well as data-driven features. We report the results of our preliminary experiments in monolingual, cross-lingual, and multilingual classification with three languages: German, Czech, and Italian. Our results show that monolingual and multilingual models achieve similar performance, and that cross-lingual classification yields lower but comparable results relative to monolingual classification.
