RUSSE'2018: A Shared Task on Word Sense Induction for the Russian Language

  • 2018-03-15 15:08:36
  • Alexander Panchenko, Anastasiya Lopukhina, Dmitry Ustalov, Konstantin Lopukhin, Nikolay Arefyev, Alexey Leontyev, Natalia Loukachevitch

Abstract

The paper describes the results of the first shared task on word sense induction (WSI) for the Russian language. While similar shared tasks were conducted in the past for some Romance and Germanic languages, we explore the performance of sense induction and disambiguation methods for a Slavic language that shares many features with other Slavic languages, such as rich morphology and free word order. The participants were asked to group contexts of a given word in accordance with its senses, which were not provided beforehand. For instance, given the word "bank" and a set of contexts with this word, e.g. "bank is a financial institution that accepts deposits" and "river bank is a slope beside a body of water", a participant was asked to cluster such contexts into a number of clusters that was not known in advance, corresponding in this case to the "company" and the "area" senses of the word "bank". For the purpose of this evaluation campaign, we developed three new evaluation datasets based on sense inventories that have different sense granularity. The contexts in these datasets were sampled from texts of Wikipedia, the academic corpus of Russian, and an explanatory dictionary of Russian. Overall, 18 teams participated in the competition, submitting 383 models. Multiple teams managed to substantially outperform competitive state-of-the-art baselines from the previous years based on sense embeddings.
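
The task setup can be illustrated with a toy sketch. This is not a method from the shared task or any of its baselines. It is a minimal greedy clustering by word overlap, written only to show what "grouping contexts into an unknown number of clusters" means. The stopword list, the 0.2 threshold, the function names, and the last two example contexts are assumptions made for illustration; only the first two contexts come from the abstract.

```python
import re

# Hypothetical stopword list, sufficient only for this toy example.
STOPWORDS = {"is", "a", "that", "the", "and", "with", "of", "after"}


def content_words(context, target):
    """Lowercase, tokenize, and drop stopwords and the target word itself."""
    tokens = re.findall(r"[a-z]+", context.lower())
    return {t for t in tokens if t not in STOPWORDS and t != target}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def induce_senses(contexts, target, threshold=0.2):
    """Greedy clustering; the number of clusters is not fixed in advance."""
    cluster_vocab = []  # one accumulated word set per induced cluster
    labels = []
    for context in contexts:
        words = content_words(context, target)
        sims = [jaccard(words, vocab) for vocab in cluster_vocab]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            # Close enough to an existing cluster: join it.
            cluster_vocab[best] |= words
            labels.append(best)
        else:
            # No sufficiently similar cluster: open a new one.
            cluster_vocab.append(set(words))
            labels.append(len(cluster_vocab) - 1)
    return labels


contexts = [
    "bank is a financial institution that accepts deposits",  # from the abstract
    "river bank is a slope beside a body of water",           # from the abstract
    "the bank accepts deposits and grants loans",             # invented example
    "the river bank is covered with water after rain",        # invented example
]
labels = induce_senses(contexts, "bank")
print(labels)  # prints [0, 1, 0, 1]
```

Here cluster 0 plays the role of the "company" sense and cluster 1 the "area" sense. A realistic system for Russian would additionally need lemmatization to cope with rich morphology, and would typically compare dense context representations rather than raw word overlap.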

 
