Cultural diversity encoded within the languages of the world is at risk, as many languages have become endangered in recent decades in a context of growing globalization. To preserve this diversity, it is first necessary to understand what drives language extinction, and which mechanisms might enable coexistence. Here, we study language shift mechanisms using theoretical and data-driven perspectives. A large-scale empirical analysis of multilingual societies using Twitter and census data yields a wide diversity of spatial patterns of language coexistence. These patterns range from a mixing of language speakers to segregation with multilinguals on the boundaries of disjoint linguistic domains. To understand how these different states can emerge and, especially, become stable, we propose a model in which language coexistence is reached when learning the other language is facilitated and when bilinguals favor the use of the endangered language. Simulations carried out in a metapopulation framework highlight the importance of spatial interactions arising from people's mobility to explain the stability of a mixed state or the presence of a boundary between two linguistic regions. Further, we find that the history of languages is critical to understanding their present state.