Abstract
Natural Language Processing (NLP) is increasingly used to provide adaptivity in educational applications. However, recent research has highlighted a variety of biases in pre-trained language models. While existing studies investigate bias in different domains, they rarely provide fine-grained analysis of educational and multilingual corpora. In this work, we analyze bias across text and through multiple architectures on a corpus of 9,165 German peer-reviews collected from university students over five years. Notably, our corpus includes labels such as helpfulness, quality, and critical aspect ratings from the peer-review recipient as well as demographic attributes. We conduct a Word Embedding Association Test (WEAT) analysis on (1) our collected corpus in connection with the clustered labels, (2) the most common pre-trained German language models (T5, BERT, and GPT-2) and GloVe embeddings, and (3) the language models after fine-tuning on our collected dataset. In contrast to our initial expectations, we found that our collected corpus does not reveal many biases in the co-occurrence analysis or in the GloVe embeddings. However, the pre-trained German language models exhibit substantial conceptual, racial, and gender bias and show significant changes in bias across conceptual and racial axes during fine-tuning on the peer-review data. With our research, we aim to contribute to the fourth UN sustainability goal (quality education) with a novel dataset, an understanding of biases in natural language education data, and the potential harms of not counteracting biases in language models for educational tasks.
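
To make the WEAT analysis mentioned above concrete, the following is a minimal sketch of the WEAT effect size in its commonly used formulation: the difference in mean association of two target sets X and Y with two attribute sets A and B, divided by the standard deviation of the associations over all target words. The function names, the two-dimensional toy vectors, and the use of the sample standard deviation are assumptions made for illustration; they are not taken from the corpus, the embeddings, or the implementation used in this work.

```python
import math
import statistics


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """s(w, A, B): mean similarity of w to set A minus mean similarity to set B."""
    sim_a = [cosine(w, a) for a in A]
    sim_b = [cosine(w, b) for b in B]
    return statistics.mean(sim_a) - statistics.mean(sim_b)


def weat_effect_size(X, Y, A, B):
    """Effect size: difference of mean associations of X and Y, normalized by
    the standard deviation of associations over all target words.
    Assumption: sample standard deviation (n - 1 in the denominator)."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    return (statistics.mean(s_x) - statistics.mean(s_y)) / statistics.stdev(s_x + s_y)


# Hypothetical toy embeddings: X lies close to A, Y lies close to B.
X = [[1.0, 0.1], [0.9, 0.2]]   # target set 1
Y = [[0.1, 1.0], [0.2, 0.9]]   # target set 2
A = [[1.0, 0.0]]               # attribute set 1
B = [[0.0, 1.0]]               # attribute set 2

d = weat_effect_size(X, Y, A, B)
print(f"WEAT effect size: {d:.3f}")
```

A positive value indicates that the X targets are more strongly associated with the A attributes than the Y targets are; swapping X and Y flips the sign. In practice the vectors would come from the embedding space under test (for example GloVe vectors or representations extracted from a language model), and the effect size is usually reported together with a permutation-test p-value.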