New MorphGNT Releases and Accentuation Analysis

Over the last few weeks, I've made a number of new releases of the MorphGNT SBLGNT analysis fixing some accentuation issues mostly in the normalization column. This came out of ongoing work on modelling accentuation (and, in particular, rules around clitics).

Back in 2015, I talked about Annotating the Normalization Column in MorphGNT. This post could almost be considered Part 2.

I recently went back to that work and made a fresh start on a new repo gnt-accentuation intended to explain the accentuation of each word in the GNT (and eventually other Greek texts). There's two parts to that: explaining why the normalized form is accented the way it but then explaining why the word-in-context might be accented differently (clitics, etc). The repo is eventually going to do both but I started with the latter.

My goal with that repo is to be part of the larger vision of an "executable grammar" I've talked about for years where rules about, say, enclitics, are formally written up in a way that can be tested against the data. This means:

students reading a rule can immediately jump to real examples (or exceptions)
students confused by something in a text can immediately jump to rules explaining it
the correctness of the rules can be tested
errors in the text can be found

It is the fourth point that meant that my recent work uncovered some accentuation issues in the SBLGNT, normalization and lemmatization. Some of that has been corrected in a series of new releases of the MorphGNT: 6.08, 6.09, and 6.10. See https://github.com/morphgnt/sblgnt/releases for details of specifics. The reason for so many releases was I wanted to get corrections out as soon as I made them but then I found more issues!

There are some issues in the text itself which need to be resolved. See the Github issue https://github.com/morphgnt/sblgnt/issues/52 for details. I'd very much appreciate people's input.

In the meantime, stay tuned for more progress on gnt-accentuation.