The Frankfurt Latin Lexicon: From Morphological Expansion and Word Embeddings to SemioGraphs

  • 2020-05-21 17:16:53
  • Alexander Mehler, Bernhard Jussen, Tim Geelhaar, Alexander Henlein, Giuseppe Abrami, Daniel Baumartz, Tolga Uslu, Wahed Hemati

Abstract

In this article we present the Frankfurt Latin Lexicon (FLL), a lexical resource for Medieval Latin that is used both for the lemmatization of Latin texts and for the post-editing of lemmatizations. We describe recent advances in the development of lemmatizers and test them against the Capitularies corpus (comprising Frankish royal edicts, mid-6th to mid-9th century), a corpus created as a reference for processing Medieval Latin. We also consider the post-correction of lemmatizations using a limited crowdsourcing process aimed at continuous review and updating of the FLL. Starting from the texts resulting from this lemmatization process, we describe the extension of the FLL by means of word embeddings, whose interactive traversing by means of SemioGraphs completes the digitally enhanced hermeneutic circle. In this way, the article argues for a more comprehensive understanding of lemmatization, encompassing classical machine learning as well as intellectual post-corrections and, in particular, human computation in the form of interpretation processes based on graph representations of the underlying lexical resources.

 
