Strahler Number of Natural Language Sentences in Comparison with Random Trees

  • 2023-12-06 12:39:33
  • Kumiko Tanaka-Ishii, Akira Tanaka

Abstract

The Strahler number was originally proposed to characterize the complexity of river bifurcation and has found various applications. This article proposes computation of the Strahler number's upper and lower limits for natural language sentence tree structures. Through empirical measurements across grammatically annotated data, the Strahler number of natural language sentences is shown to be almost 3 or 4, similarly to the case of river bifurcation as reported by Strahler (1957). From the theory behind the number, we show that it is one kind of lower limit on the amount of memory required to process sentences. We consider the Strahler number to provide reasoning that explains reports showing that the number of required memory areas to process sentences is 3 to 4 for parsing (Schuler et al., 2010), and reports indicating a psychological "magical number" of 3 to 5 (Cowan, 2001). An analytical and empirical analysis shows that the Strahler number is not constant but grows logarithmically; therefore, the Strahler number of sentences derives from the range of sentence lengths. Furthermore, the Strahler number is not different for random trees, which could suggest that its origin is not specific to natural language.
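
For readers unfamiliar with the measure, the following is a minimal sketch of the standard Strahler number on a binary tree: a leaf has value 1, and an internal node takes the larger of its two children's values, plus one when they are equal. The nested-tuple tree representation and the names `strahler` and `perfect` are illustrative assumptions, not the paper's data format or code, and the sketch does not reproduce the paper's upper and lower limits for annotated sentence structures.

```python
def strahler(tree):
    """Strahler number of a binary tree.

    Assumed (hypothetical) representation: a leaf is any non-tuple value,
    an internal node is a 2-tuple (left, right).
    """
    if not isinstance(tree, tuple):
        return 1  # a leaf has Strahler number 1
    if len(tree) != 2:
        raise ValueError("this sketch handles binary nodes only")
    left, right = strahler(tree[0]), strahler(tree[1])
    # Equal children raise the value by one; otherwise the larger one wins.
    return left + 1 if left == right else max(left, right)


def perfect(depth):
    """Build a perfect binary tree with 2**depth leaves (illustration only)."""
    if depth == 0:
        return "w"
    return (perfect(depth - 1), perfect(depth - 1))


# A purely right-branching chain stays at 2, however long it gets.
chain = ("w", ("w", ("w", "w")))
print(strahler(chain))       # 2

# A perfect tree with 8 leaves reaches 4.
print(strahler(perfect(3)))  # 4
```

Under this definition a tree with Strahler number s needs at least 2^(s-1) leaves, so the largest value reachable with n leaves is floor(log2 n) + 1. This is one way to see why the number can grow only logarithmically with sentence length.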

 
