NorDiaChange: Diachronic Semantic Change Dataset for Norwegian

  • 2022-01-13 18:27:33
  • Andrey Kutuzov, Samia Touileb, Petter Mæhlum, Tita Ranveig Enstad, Alexandra Wittemann

Abstract

We describe NorDiaChange: the first diachronic semantic change dataset for Norwegian. NorDiaChange comprises two novel subsets, covering about 80 Norwegian nouns manually annotated with graded semantic change over time. Both datasets follow the same annotation procedure and can be used interchangeably as train and test splits for each other. NorDiaChange covers the time periods related to pre- and post-war events, oil and gas discovery in Norway, and technological developments. The annotation was done using the DURel framework and two large historical Norwegian corpora. NorDiaChange is published in full under a permissive license, complete with raw annotation data and inferred diachronic word usage graphs (DWUGs).

 
