greek-inflexion and an Update on the Morphological Lexicon

Exactly seven months ago, I released a generic library, inflexion, and said I’d soon follow it up with the Greek-specific stuff. While I did open-source the latter on GitHub as greek-inflexion shortly thereafter, I didn’t want to announce it here until it was further along. I’m happy to say it now is.

If you recall, I said back in May that “it can currently generate every single verb form in Louise Pratt’s intermediate grammar, on Helma Dik’s Greek verb handouts and in Andrew Keller & Stephanie Russell’s beginner-intermediate text book”. It now also has much better tooling for parsing new verb forms and guessing the stem of a given form. It also has the start of noun and adjective support.

On a separate morphgnt branch, it now has tooling for testing verb form generation against the MorphGNT/SBLGNT text. So far, the stem database covers the Gospel and Epistles of John, Galatians and Mark. I expect to have complete MorphGNT/SBLGNT verb coverage by the end of the year.

The repo is at https://github.com/jtauber/greek-inflexion. Note that it’s not pip-installable at the moment; that hasn’t been a priority because it’s not a library.

As mentioned in my May post, most of the value (and effort) is not so much in the code as in the data. The stemming rules and, in particular, the stem database form the core of the Morphological Lexicon I’ve been working on for a few years.

The best discussion of the Morphological Lexicon can be found in my SBL 2015 slides, although the vision goes way back to this blog post from 2004, where I say:

the idea is that surface forms, lexical forms, spelling variations, roots, stems, suppletion, morphophonological rules, etc. will all be catalogued with relationships between them expressed as a directed labelled graph.
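As a rough illustration only (this is not how the Morphological Lexicon actually stores its data), a directed labelled graph can be sketched in Python as a collection of (source, label, target) edges. The `LabelledGraph` class, the node identifiers and the edge labels such as `has-stem` and `suppletes` are all assumptions made up for this example, and the stems are deliberately simplified.

```python
from collections import defaultdict

# Hypothetical node identifiers; the "kind:value" naming is an assumption
# for this sketch, not the lexicon's real schema.
FORM = "form:ἔλυσα"
STEM = "stem:λυσ"
LEMMA = "lemma:λύω"
PRESENT_STEM = "stem:φερ"
AORIST_STEM = "stem:ἐνεγκ"
PHERO = "lemma:φέρω"


class LabelledGraph:
    """A tiny directed labelled graph stored as adjacency lists."""

    def __init__(self):
        # source node -> list of (label, target node)
        self._edges = defaultdict(list)

    def add(self, source, label, target):
        """Add one directed edge carrying a label."""
        self._edges[source].append((label, target))

    def targets(self, source, label):
        """Return the nodes reachable from source over edges with this label."""
        return [t for (l, t) in self._edges.get(source, []) if l == label]

    def follow(self, source, *labels):
        """Follow a chain of labels from source, returning the end nodes."""
        nodes = [source]
        for label in labels:
            nodes = [t for n in nodes for t in self.targets(n, label)]
        return nodes


graph = LabelledGraph()
graph.add(FORM, "has-stem", STEM)
graph.add(STEM, "stem-of", LEMMA)
# Suppletion: two unrelated stems belonging to the same lexical form.
graph.add(PRESENT_STEM, "stem-of", PHERO)
graph.add(AORIST_STEM, "stem-of", PHERO)
graph.add(AORIST_STEM, "suppletes", PRESENT_STEM)

# Walk from a surface form to its lexical form via its stem.
print(graph.follow(FORM, "has-stem", "stem-of"))
```

Keeping the labels as plain strings on the edges means that new kinds of relationship (spelling variation, root-of, and so on) can be added without changing the structure itself.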

So good progress is being made (and it’s all available openly as work progresses) and the initial stem and morphophonological rule databases should be completed in the next month.

Alongside that I’m also looking at better representing relationships between stems and also relationships between the stemming rules.

Ultimately, as discussed in my SBL 2015 talk and elsewhere, my goals are to:

freely provide, in a machine-actionable way, all of the morphological information normally found in a Greek lexicon
facilitate tagging of new Greek texts
provide the underlying information to drive a new generation of adaptive Greek readers (the topic of my 2016 SBL talk)
contribute a comprehensive analysis of Ancient Greek of interest to general morphologists
experiment with the notion of an “executable grammar” where all paradigms, rules and assertions are tested automatically against a corpus and, with it, replace the existing plethora of books on paradigms and principal parts.
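
To give a flavour of what testing paradigms against a corpus might look like, here is a minimal sketch in Python. It does not use greek-inflexion's actual code or data files: the `STEMS` and `ENDINGS` tables, the parse codes, the `generate` and `check` functions and the four-entry corpus are all hypothetical, and accentuation is sidestepped by storing the accent on the stem.

```python
import unicodedata


def nfc(text):
    """Normalise to NFC so visually identical Greek strings compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical toy data: one stem per lemma and a handful of
# present active indicative endings. Not taken from any real rule file.
STEMS = {nfc("λύω"): "λύ"}
ENDINGS = {
    "PAI.1S": "ω",
    "PAI.2S": "εις",
    "PAI.3S": "ει",
    "PAI.1P": "ομεν",
    "PAI.2P": "ετε",
}

# Hypothetical tagged corpus: (surface form, lemma, parse code).
CORPUS = [
    ("λύω", "λύω", "PAI.1S"),
    ("λύομεν", "λύω", "PAI.1P"),
    ("λύετε", "λύω", "PAI.2P"),
    ("ἔλυσα", "λύω", "AAI.1S"),  # aorist: no rule for it in this toy grammar
]


def generate(lemma, parse):
    """Generate a form by joining the lemma's stem and the parse's ending."""
    return nfc(STEMS[nfc(lemma)] + ENDINGS[parse])


def check(corpus):
    """Sort corpus entries into passed, failed and uncovered."""
    passed, failed, uncovered = [], [], []
    for form, lemma, parse in corpus:
        if nfc(lemma) not in STEMS or parse not in ENDINGS:
            uncovered.append((form, lemma, parse))
        elif generate(lemma, parse) == nfc(form):
            passed.append((form, lemma, parse))
        else:
            failed.append((form, lemma, parse))
    return passed, failed, uncovered


passed, failed, uncovered = check(CORPUS)
print(f"{len(passed)} passed, {len(failed)} failed, {len(uncovered)} not covered")
```

Separating "failed" from "not covered" is a design choice in this sketch: a failure suggests a rule is wrong, whereas an uncovered entry only shows that a stem or rule has yet to be added.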

Particular thanks to Jonathan Robie, who continues to provide the inspiration and encouragement for a lot of this work.