Co-occurrence of the Benford-like and Zipf Laws Arising from the Texts Representing Human and Artificial Languages

  • 2018-03-06 12:24:42
  • Evgeny Shulzinger, Irina Legchenkova, Edward Bormashenko

Abstract

We demonstrate that large texts representing human (English, Russian, Ukrainian) and artificial (C++, Java) languages display quantitative patterns characterized by the Benford-like and Zipf laws. The frequency of a word following the Zipf law is inversely proportional to its rank, whereas the total numbers of occurrences of individual words in the text generate an uneven, Benford-like distribution of leading digits. Excluding the most popular words substantially improves the correlation of the actual textual data with the Zipfian distribution, whereas the Benford distribution of leading digits (arising from the total number of occurrences of each word) is insensitive to the same elimination procedure. The calculated moduli of the slopes of the double-logarithmic plots for artificial languages (C++, Java) are markedly larger than those for human ones.
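The two quantities described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the tokenizer, the function names (`word_counts`, `zipf_slope`, `leading_digit_distribution`, `benford_expected`), the least-squares fit, and the choice to keep the original ranks when the top words are excluded are all assumptions made for illustration. The reference Benford frequencies use the standard form P(d) = log10(1 + 1/d).

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens (a simplistic tokenizer, assumed here)."""
    # [^\W\d_]+ matches runs of Unicode letters, so Cyrillic text works too.
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


def zipf_slope(counts, skip_top=0):
    """Least-squares slope of log(frequency) versus log(rank).

    skip_top drops the most frequent words; the remaining words keep
    their original ranks (an assumption of this sketch). At least two
    words must remain for the fit to be defined.
    """
    freqs = sorted(counts.values(), reverse=True)
    points = [
        (math.log(rank), math.log(freq))
        for rank, freq in enumerate(freqs, start=1)
        if rank > skip_top
    ]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def leading_digit_distribution(counts):
    """Share of words whose total occurrence count starts with each digit 1-9."""
    digits = Counter(int(str(c)[0]) for c in counts.values())
    total = sum(digits.values())
    return {d: digits.get(d, 0) / total for d in range(1, 10)}


def benford_expected():
    """Benford reference frequencies P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}
```

For a text whose word frequencies are exactly inversely proportional to rank, `zipf_slope` returns -1, so its modulus is the quantity compared between human and artificial languages in the abstract. Comparing `leading_digit_distribution(counts)` with `benford_expected()` before and after removing the top-ranked words is one way to probe the insensitivity the abstract reports.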

 
