Mean Dependency Depth

With dependency paths calculated for the Greek New Testament, we can use mean dependency depth as a proxy for syntactic complexity.

In Mean Log Frequency of Lexemes I mentioned that reading comprehension measures such as the Lexile® framework use average sentence length as well as mean log word frequency. Now that we have Dependency Paths calculated, we can explore potentially more useful proxies for syntactic complexity.

As an initial experiment, we’ll take the mean dependency depth of each target, where our targets are chapters. By “dependency depth” I simply mean the number of labels in the dependency path, so np-O-CL-CL counts as 4. We then average across all the words in each chapter.
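As a rough illustration of the calculation just described, here is a minimal sketch. It assumes each word is available as a (book, chapter, path) tuple and that labels in a path are separated by hyphens and contain no hyphens themselves. The function names and the sample rows are made up for illustration and are not taken from the actual script or data.

```python
from collections import defaultdict


def dependency_depth(path):
    """Return the number of labels in a hyphen-separated dependency path."""
    return len(path.split("-"))


def mean_depth_by_chapter(rows):
    """Average dependency depth over all words in each (book, chapter).

    rows: iterable of (book, chapter, path) tuples, one per word.
    This input shape is an assumption made for the sketch.
    """
    totals = defaultdict(lambda: [0, 0])  # target -> [sum of depths, word count]
    for book, chapter, path in rows:
        entry = totals[(book, chapter)]
        entry[0] += dependency_depth(path)
        entry[1] += 1
    return {target: total / count for target, (total, count) in totals.items()}


# Toy rows, invented for illustration only (not real analysis output).
rows = [
    ("John", 1, "np-O-CL-CL"),
    ("John", 1, "vp-V-CL"),
    ("John", 2, "np-S-CL"),
]

scores = mean_depth_by_chapter(rows)
for target, score in sorted(scores.items()):
    print(target, score)
```

Every word counts equally here, so a chapter's score is simply its total path length divided by its word count.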

An initial run reveals one interesting problem. Luke 3 is given a considerably higher score than any other chapter because of the analysis of the genealogy (A the son of B the son of C… and so on, which leads to very long paths). Reading that genealogy is arguably not that taxing syntactically, which highlights one flaw in the dependency depth approach (or perhaps in the analysis chosen for the genealogy).

This aside, let’s look at what this measure identifies as the easiest chapters. Interestingly, the ten chapters with the lowest mean dependency depth are all in Romans, 1 Corinthians and Galatians.

If we average, instead, across entire books, the ten with the lowest mean dependency depth are:

3 John
1 Corinthians
1 John
James
Galatians
John
Romans
Matthew
Mark
2 John

which is perhaps a little less surprising.
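Switching from chapters to books only requires changing how words are grouped. The sketch below shows one way to do that, under the assumption that "averaging across entire books" means pooling every word in the book rather than averaging the chapter means. The grouping functions, the tuple layout and the toy rows are all hypothetical.

```python
from collections import defaultdict


def mean_depth(rows, key):
    """Mean dependency depth per target, where key(book, chapter) names the target."""
    sums = defaultdict(int)
    counts = defaultdict(int)
    for book, chapter, path in rows:
        target = key(book, chapter)
        sums[target] += len(path.split("-"))  # depth = number of labels
        counts[target] += 1
    return {target: sums[target] / counts[target] for target in sums}


def by_chapter(book, chapter):
    return (book, chapter)


def by_book(book, chapter):
    return book


# Toy rows with placeholder book names (invented, not real results).
rows = [
    ("Book A", 1, "np-S-CL"),
    ("Book A", 1, "vp-V-CL"),
    ("Book B", 1, "np-O-CL-CL-CL"),
    ("Book B", 1, "np-O-CL-CL"),
    ("Book B", 2, "vp-V-CL"),
]

book_scores = mean_depth(rows, by_book)

# Lowest mean depth first, i.e. "easiest" by this measure.
ranking = sorted(book_scores, key=book_scores.get)
print(ranking)
```

Pooling words weights long chapters more heavily than short ones. In the toy data, Book B scores 4.0 when its words are pooled, whereas the mean of its two chapter means (4.5 and 3.0) would be 3.75.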

The hardest chapters, Luke 3 aside, are the first chapters of Ephesians, 2 Timothy and Colossians, which probably isn’t much of a surprise either. The hardest books overall are Ephesians and Colossians.

The code is available here (tweak line 13 to get book-level stats).

Note that all of this may be quite sensitive to the choice of analysis. It would be an interesting exercise to see, for example, what the PROIEL dependency analysis yields.

In future posts, we’ll try a few more measures and then try to bring them together to see how chapters (or books, or authors) compare across multiple criteria.