RUSSE'2018: A Shared Task on Word Sense Induction for the Russian Language

  • 2018-05-23 14:49:44
  • Alexander Panchenko, Anastasiya Lopukhina, Dmitry Ustalov, Konstantin Lopukhin, Nikolay Arefyev, Alexey Leontyev, Natalia Loukachevitch

Abstract

The paper describes the results of the first shared task on word sense induction (WSI) for the Russian language. While similar shared tasks were conducted in the past for some Romance and Germanic languages, we explore the performance of sense induction and disambiguation methods for a Slavic language that shares many features with other Slavic languages, such as rich morphology and virtually free word order. The participants were asked to group contexts of a given word in accordance with its senses, which were not provided beforehand. For instance, given the word "bank" and a set of contexts for this word, e.g. "bank is a financial institution that accepts deposits" and "river bank is a slope beside a body of water", a participant was asked to cluster such contexts into a number of clusters that is not known in advance, corresponding in this case to the "company" and the "area" senses of the word "bank". For the purpose of this evaluation campaign, we developed three new evaluation datasets based on sense inventories that have different sense granularity. The contexts in these datasets were sampled from texts of Wikipedia, the academic corpus of Russian, and an explanatory dictionary of Russian. Overall, 18 teams participated in the competition, submitting 383 models. Multiple teams managed to substantially outperform competitive state-of-the-art baselines from the previous years based on sense embeddings.
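
To make the task format concrete, the following is a minimal sketch of grouping contexts of a target word into a number of clusters that is not fixed in advance. It is not the method of any participant or baseline: the greedy word-overlap clustering, the stopword list, the `threshold` value, and the names `tokens`, `jaccard`, and `induce_senses` are all assumptions made for illustration. The first two contexts come from the abstract; the last two are hypothetical additions.

```python
import re

# Tiny illustrative stopword list (an assumption, not part of the shared task).
STOPWORDS = {"a", "an", "the", "is", "of", "that", "and", "to", "in"}


def tokens(context, target):
    """Return the set of lowercased content words, excluding the target word."""
    words = re.findall(r"[a-z]+", context.lower())
    return {w for w in words if w not in STOPWORDS and w != target}


def jaccard(a, b):
    """Jaccard overlap between two word sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def induce_senses(contexts, target, threshold=0.1):
    """Greedily assign each context to the most similar existing cluster,
    or open a new cluster if no similarity reaches `threshold`.
    The number of clusters is therefore not specified in advance."""
    cluster_vocab = []  # one accumulated word set per induced cluster
    labels = []
    for context in contexts:
        words = tokens(context, target)
        best_id, best_sim = None, -1.0
        for cid, vocab in enumerate(cluster_vocab):
            sim = jaccard(words, vocab)
            if sim >= threshold and sim > best_sim:
                best_id, best_sim = cid, sim
        if best_id is None:
            cluster_vocab.append(set(words))
            best_id = len(cluster_vocab) - 1
        else:
            cluster_vocab[best_id] |= words
        labels.append(best_id)
    return labels


contexts = [
    "bank is a financial institution that accepts deposits",  # from the abstract
    "river bank is a slope beside a body of water",           # from the abstract
    "the bank accepts deposits and issues loans",             # hypothetical
    "the river bank was covered by water",                    # hypothetical
]
labels = induce_senses(contexts, "bank")
print(labels)
```

In this toy run the first and third contexts share one cluster label and the second and fourth share another, mirroring the "company" and "area" senses. Real submissions would need far richer representations than raw word overlap, particularly for a morphologically rich language such as Russian.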

 
