A surprisal--duration trade-off across and within the world's languages

  • 2021-09-30 10:56:30
  • Tiago Pimentel, Clara Meister, Elizabeth Salesky, Simone Teufel, Damián Blasi, Ryan Cotterell

Abstract

While there exist scores of natural languages, each with its unique features and idiosyncrasies, they all share a unifying theme: enabling human communication. We may thus reasonably predict that human cognition shapes how these languages evolve and are used. Assuming that the capacity to process information is roughly constant across human populations, we expect a surprisal--duration trade-off to arise both across and within languages. We analyse this trade-off using a corpus of 600 languages and, after controlling for several potential confounds, we find strong supporting evidence in both settings. Specifically, we find that, on average, phones are produced faster in languages where they are less surprising, and vice versa. Further, we confirm that more surprising phones are longer, on average, in 319 languages out of the 600. We thus conclude that there is strong evidence of a surprisal--duration trade-off in operation, both across and within the world's languages.

 
