FROST-EMA: Finnish and Russian Oral Speech Dataset of Electromagnetic Articulography Measurements with L1, L2 and Imitated L2 Accents

Abstract

We introduce FROST-EMA (Finnish and Russian Oral Speech Dataset of Electromagnetic Articulography), a new corpus of speech from 18 bilingual speakers who produced speech in their native language (L1), their second language (L2), and imitated L2 (a fake foreign accent). The corpus enables research into language variability from both phonetic and technological points of view. Accordingly, we include two preliminary case studies to demonstrate both perspectives. The first explores the impact of L2 and imitated L2 on the performance of an automatic speaker verification system, while the second illustrates the articulatory patterns of one speaker in L1, L2, and a fake accent.
