Composer Style Classification of Piano Sheet Music Images Using Language Model Pretraining

Abstract

This paper studies composer style classification of piano sheet music images. Previous approaches to the composer classification task have been limited by a scarcity of data. We address this issue in two ways: (1) we recast the problem to be based on raw sheet music images rather than a symbolic music format, and (2) we propose an approach that can be trained on unlabeled data. Our approach first converts the sheet music image into a sequence of musical "words" based on the bootleg feature representation, and then feeds the sequence into a text classifier. We show that it is possible to significantly improve classifier performance by first training a language model on a set of unlabeled data, initializing the classifier with the pretrained language model weights, and then finetuning the classifier on a small amount of labeled data. We train AWD-LSTM, GPT-2, and RoBERTa language models on all piano sheet music images in IMSLP. We find that transformer-based architectures outperform CNN and LSTM models, and pretraining boosts classification accuracy for the GPT-2 model from 46% to 70% on a 9-way classification task. The trained model can also be used as a feature extractor that projects piano sheet music into a feature space that characterizes compositional style.
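To make the "musical words" step concrete, here is a minimal sketch of one way a binary feature matrix could be turned into a token sequence for a text classifier. The abstract does not specify the encoding, so everything here is an assumption for illustration: each column of the bootleg representation is treated as a binary vector of notehead positions, and the hypothetical helpers `column_to_word` and `columns_to_sentence` pack each column into an integer and use its decimal string as the word. The column height of 4 in the toy input is arbitrary.

```python
from typing import List, Sequence


def column_to_word(column: Sequence[int]) -> str:
    """Pack one binary column into an integer and return it as a token string.

    Assumption: row i of the column maps to bit i of the integer.
    """
    value = 0
    for row, bit in enumerate(column):
        if bit not in (0, 1):
            raise ValueError("columns must contain only 0 or 1")
        value |= bit << row
    return str(value)


def columns_to_sentence(columns: Sequence[Sequence[int]]) -> List[str]:
    """Convert a sequence of binary columns into a sequence of word tokens."""
    return [column_to_word(col) for col in columns]


# Toy input: three columns with four positions each (hypothetical sizes).
toy_columns = [
    [1, 0, 0, 0],  # one notehead at position 0
    [0, 1, 1, 0],  # noteheads at positions 1 and 2
    [0, 0, 0, 0],  # empty column
]
print(" ".join(columns_to_sentence(toy_columns)))  # prints "1 6 0"
```

The resulting token sequence could then be passed to any text pipeline (tokenizer, language model pretraining, classifier finetuning) in the same way as ordinary words; that downstream part is not shown here.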
