Abstract
This paper argues that the relationship between lexical identity and prosody, one well-studied parameter of linguistic variation, can be characterized using information theory. We predict that languages that use prosody to make lexical distinctions should exhibit a higher mutual information between word identity and prosody, compared to languages that don't. We test this hypothesis in the domain of pitch, which is used to make lexical distinctions in tonal languages, like Cantonese. We use a dataset of speakers reading sentences aloud in ten languages across five language families to estimate the mutual information between the text and their pitch curves. We find that, across languages, pitch curves display similar amounts of entropy. However, these curves are easier to predict given their associated text in the tonal languages, compared to pitch- and stress-accent languages, and thus the mutual information is higher in these languages, supporting our hypothesis. Our results support perspectives that view linguistic typology as gradient, rather than categorical.
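
The step from "similar entropy, but easier to predict" to "higher mutual information" follows from the standard decomposition of mutual information. As a brief sketch, in notation assumed here for illustration rather than taken from the paper, let $\mathbf{P}$ denote a pitch curve and $\mathbf{W}$ its associated text:

\[
\mathrm{MI}(\mathbf{P}; \mathbf{W}) = \mathrm{H}(\mathbf{P}) - \mathrm{H}(\mathbf{P} \mid \mathbf{W})
\]

If $\mathrm{H}(\mathbf{P})$ is roughly the same across languages, then a lower conditional entropy $\mathrm{H}(\mathbf{P} \mid \mathbf{W})$, meaning pitch that is easier to predict from the text, implies a higher mutual information.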